In this post I want to cover how you can create Data Pipeline tests with GitHub Copilot in Visual Studio Code, in order to test Microsoft Fabric Data Pipelines.

I want to do this post to help others realize what is possible when working with GitHub Copilot in Visual Studio Code, and to highlight the fact that it can save a lot of development time. The way we work is changing, and offerings such as GitHub Copilot enable us to be more productive.
Along the way I share plenty of links.
Data Pipeline example
For the benefit of this post I worked with the Git repository that I created in a previous post, where I shared how to automate testing Microsoft Fabric Data Pipelines with YAML Pipelines.
I focus on my customized “Run Hello World” data pipeline, which contains the default “Run notebook” activity plus a copy data activity that I created. The copy data activity downloads the sample NYC Taxi data to a folder in the Lakehouse, specifying the created parameter as the folder location.
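Fabric Git integration stores each Data Pipeline as a folder in the repository containing a “pipeline-content.json” metadata file, and that file is what the tests in this post inspect. For a quick look at how the pipeline is represented, a small sketch like the one below lists its activities; the folder names and path are illustrative and will depend on your own workspace layout.
import json
from pathlib import Path

# Illustrative path - adjust to match your repository and workspace folder layout.
pipeline_json = Path("workspace") / "Run Hello World.DataPipeline" / "pipeline-content.json"

with pipeline_json.open(encoding="utf-8") as f:
    pipeline = json.load(f)

# Print every activity name and type, e.g. the "Run notebook" and copy data activities.
for activity in pipeline.get("properties", {}).get("activities", []):
    print(activity.get("name"), "-", activity.get("type"))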

If you wish to follow along with this post, you need the following:
- At least one Microsoft Fabric workspace with Microsoft Fabric Git integration configured.
- A Git repository stored in the Azure Repos service in Azure DevOps.
- Git installed locally.
- Visual Studio Code installed locally with GitHub Copilot set up.
- A clone/copy of my AzureDevOps-fabric-cicd-with-automated-tests GitHub repository.
Knowledge about the Data Factory Testing Framework will be beneficial if you intend to create your own Data Pipeline tests.
Create a single Data Pipeline test with GitHub Copilot in Visual Studio Code
To create Data Pipeline tests with GitHub Copilot I first opened up the Git repository that contains the metadata for the workspace in Visual Studio Code. I then navigated to the existing test file that I had created for my previous post.
Once I had opened the file I clicked on the GitHub Copilot icon at the top of Visual Studio Code and selected Open Chat.

After doing this the “Ask Copilot” window appeared.

First of all, I needed to provide more context for what I wanted. So, I clicked on “Add context” in the chat and selected the “pipeline-content.json” metadata file stored in my data pipeline folder.
This is where my prompt engineering skills were put to the test, because I then asked Copilot the below.
Can you create a new test file written in Python that is similar to the current file which uses the data_factory_testing_framework Python library to check that the destination type in the “Copy sample data” is a Lakehouse?
Below is the response I got, which I opened in a new window for visibility.

As you can see, as well as suggesting the code it also provided me with the option to create the “test_lakehouse_destination.py” file. So, I clicked on the option to create the new file and saved it.
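The suggestion itself only appears in the screenshot above, so below is a rough sketch of the same check rather than the exact code Copilot generated. It is a minimal version that reads the metadata file directly, assuming the “workspace” folder layout used later in this post and that the sink section of the “Copy sample data” activity identifies the destination with a type value containing “Lakehouse”; the exact property names depend on your pipeline’s JSON.
import json
import os

# Assumed location of the pipeline metadata - adjust to your repository layout.
PIPELINE_JSON = os.path.join(
    os.path.dirname(os.path.dirname(__file__)),
    "workspace",
    "Run Hello World.DataPipeline",
    "pipeline-content.json",
)

def test_copy_sample_data_destination_is_lakehouse():
    with open(PIPELINE_JSON, "r", encoding="utf-8") as f:
        pipeline = json.load(f)

    activities = pipeline.get("properties", {}).get("activities", [])
    copy_activities = [a for a in activities if a.get("name") == "Copy sample data"]
    assert copy_activities, "Expected a 'Copy sample data' activity in the pipeline."

    # Serialize the sink definition and check it points at a Lakehouse destination.
    sink = copy_activities[0].get("typeProperties", {}).get("sink", {})
    assert "Lakehouse" in json.dumps(sink), (
        "Expected the 'Copy sample data' activity to write to a Lakehouse destination."
    )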
To keep the test simple, I went into my YAML pipeline file and updated the relevant PowerShell task to work with the new test file, as below.
- task: PowerShell@2
  displayName: 'Run sample pipeline test'
  inputs:
    targetType: 'inline'
    script: |
      pytest Tests\test_lakehouse_destination.py --junitxml=simple-pipeline-test-results.xml
    pwsh: true
I then saved the changes to the file for my YAML pipeline and committed all the updates to my local Git repository. When prompted, I synchronized the change to my Azure DevOps repository.
Data Pipeline test results in Azure DevOps for simple test
When I ran the YAML pipeline in Azure DevOps, the three stages completed as in the below example.

I clicked on the “Tests Data Pipeline” stage and changed the filter to “Passed”, which allowed me to see more information.

Of course, my testing would not be complete without testing for failure. So, I changed the destination in the Data Pipeline to be a Data Warehouse instead.
Doing this caused my test to fail and my YAML pipeline to stop at my testing stage. However, because I had configured the condition for the publish task to “always()”, I was still able to view the test results.

Multiple Data Pipeline tests with GitHub Copilot in Visual Studio Code
GitHub Copilot was able to create a test file for one pipeline with ease. But what about a real-life scenario with multiple pipelines?
I decided to create nine copies of my Data Pipeline in my workspace to test another real-world scenario: checking that the Copy data task in all my Data Pipelines was activated.
In the real world you want to know as soon as possible if an activity has been inadvertently deactivated, especially when working in a workspace that contains multiple Data Pipelines.
So, I created all the Data Pipelines in Microsoft Fabric, committed the change back to my Git repository and synchronized the changes locally in Visual Studio Code.
I then went back to Ask Copilot in Visual Studio Code whilst I had my Python test file opened. First, I clicked on “Add Context” again and added one of the “pipeline-content.json” files stored in one of the Data Pipeline folders.
I was then ready to ask Copilot another question. Admittedly it took a few attempts to get a test that worked properly. Finally, I got a working script by asking the below question:
Can you create a new test file written in Python that first loops through to identify all the Data Pipelines in my workspace folder and then uses the data_factory_testing_framework Python library to loop through all the Data Pipelines in my workspace folder by name and check that all the activities that have a type of “Copy” do not have a state value of “Inactive”?
Copilot returned the below for me.
It also provided the below explanation.
Certainly! Here’s a Python test file that:
- Discovers all Data Pipeline folders in your workspace directory (looks for folders ending with .DataPipeline).
- Loads each pipeline’s JSON (pipeline-content.json).
- Checks all activities of type Copy to ensure their state is not "Inactive".
- Uses pytest for test structure and data_factory_testing_framework for any helpers (though the main logic is standard Python).
Place this file in your Tests folder (e.g., Tests/test_pipeline_copy_activity_state_not_inactive.py):

import os
import json
import pytest
from data_factory_testing_framework import DataPipelineTestCase  # Optional, for future extension

WORKSPACE_DIR = os.path.join(os.path.dirname(os.path.dirname(__file__)), "workspace")

def find_pipeline_jsons(workspace_dir):
    """Yield (pipeline_name, pipeline_json_path) for all DataPipeline folders."""
    for entry in os.listdir(workspace_dir):
        if entry.endswith(".DataPipeline"):
            pipeline_dir = os.path.join(workspace_dir, entry)
            pipeline_json = os.path.join(pipeline_dir, "pipeline-content.json")
            if os.path.isfile(pipeline_json):
                yield entry, pipeline_json

@pytest.mark.parametrize("pipeline_name,pipeline_json_path", list(find_pipeline_jsons(WORKSPACE_DIR)))
def test_copy_activities_not_inactive(pipeline_name, pipeline_json_path):
    with open(pipeline_json_path, "r", encoding="utf-8") as f:
        pipeline = json.load(f)
    activities = pipeline.get("properties", {}).get("activities", [])
    for activity in activities:
        if activity.get("type") == "Copy":
            state = activity.get("state", None)
            assert state != "Inactive", (
                f"Pipeline '{pipeline_name}' has a Copy activity '{activity.get('name')}' with state 'Inactive'."
            )

How it works:
- The test will automatically run for each pipeline in your workspace.
- It will fail if any Copy activity has "state": "Inactive".
Requirements:
- pytest must be installed.
- data_factory_testing_framework is imported for future extensibility, but not strictly required for this check.
Let me know if you want to extend this to use more features from data_factory_testing_framework!
As you can see, Copilot clearly stated that the Data Factory Testing Framework was not required for this particular test. So I removed the reference to it in the suggested code.
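In practice that just meant deleting the unused import line, leaving the top of the file looking something like this:
# Standard library plus pytest is all this check needs once the framework reference is removed.
import os
import json
import pytest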
I added the script to my Git repository and updated the file for my YAML pipeline to work with the new test instead.
- task: PowerShell@2
  displayName: 'Run sample pipeline test'
  inputs:
    targetType: 'inline'
    script: |
      pytest Tests\test_pipeline_copy_activity_state.py --junitxml=test-results-all-pipelines.xml
    pwsh: true
- task: PublishTestResults@2
  displayName: 'Publish Pipeline Test Results'
  inputs:
    testResultsFormat: 'JUnit'
    testResultsFiles: '**/test-results-*.xml'
    testRunTitle: 'Data Pipeline Tests'
  condition: always()
Data Pipeline test results in Azure DevOps for multiple Data Pipelines
When I ran the YAML pipeline in Azure DevOps the three stages completed. I clicked on the “Tests Data Pipeline” stage and changed the filter to “Passed” to allow me to see more information about the passed tests.

Of course, my testing would not be complete without testing for failure. So I deactivated the Copy activity in two of the Data Pipelines and then ran the YAML pipeline in Azure DevOps again.
This time the pipeline failed. I clicked on the “Tests Data Pipeline” stage and changed the filter to “Failed” to allow me to see more information about the failed tests.

Final words
I hope that this post on how to create Data Pipeline tests with GitHub Copilot in Visual Studio Code inspires some of you to think about the testing possibilities with Copilot.
One thing I must stress is that you need to be good at prompt engineering for more complex scenarios. Plus, it can take a few attempts to get the desired results.
Luckily, there is a Copilot Chat Cookbook available to help. A special thanks to Thomas Thornton for sharing the details about that just after this post was published.
Of course, if you have any comments or queries about this post feel free to reach out to me.