
Copy an Azure Data Factory pipeline to Synapse Studio


In this post I want to share an alternative way to copy an Azure Data Factory pipeline to Synapse Studio, because I think it can be useful.

For those who are not aware, Synapse Studio is the frontend that comes with Azure Synapse Analytics. You can find out more about it in another post I did, which was a five-minute crash course about Synapse Studio.

By the end of this post, you will know one way to copy the objects used for an Azure Data Factory pipeline to Synapse Studio. It works as long as both are configured to use Git.

Azure Data Factory example

For this example, I decided to use the pipeline objects that I created for another post, which was an Azure Test Plans example for Azure Data Factory. It uses a mapping data flow, as you can see below.

Data flow in Azure Data Factory

In order to use this method, both Azure Data Factory and Azure Synapse Analytics need to be set up to use source control. For this demo I have them stored in Azure Repos within Azure DevOps.

However, they can just as easily be in a GitHub Enterprise repository instead.

Copy an Azure Data Factory pipeline to Synapse Studio

How I did the copy was very simple. I just copied all the individual objects from the Azure Data Factory repository to the Azure Synapse repository using the same structure.

Below are the objects required for the pipeline in the Azure Data Factory repository: the linked services, the datasets, the data flow and of course the pipeline itself. Each one is stored as a separate JSON file.

Azure Data Factory repository objects
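
If you are not sure which files a pipeline depends on, one option is to follow the references inside the JSON files. Below is a minimal sketch of that idea, not something I used for this post. It assumes the repository keeps objects in folders named pipeline, dataflow, dataset and linkedService, with one file per object named after the object, and that objects point at each other with a referenceName and a type such as DatasetReference. The function names are hypothetical.

```python
import json
from pathlib import Path

# Reference types found in the JSON, mapped to the repository folder where
# the referenced object is assumed to live (folder names are an assumption).
FOLDER_FOR_REFERENCE = {
    "LinkedServiceReference": "linkedService",
    "DatasetReference": "dataset",
    "DataFlowReference": "dataflow",
    "PipelineReference": "pipeline",
}


def find_references(node):
    """Recursively yield (folder, name) for every reference in parsed JSON."""
    if isinstance(node, dict):
        ref_type = node.get("type")
        name = node.get("referenceName")
        # Only string types can be reference markers; skip nested objects.
        if isinstance(ref_type, str) and isinstance(name, str):
            folder = FOLDER_FOR_REFERENCE.get(ref_type)
            if folder:
                yield folder, name
        for value in node.values():
            yield from find_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_references(item)


def required_objects(repo_root, pipeline_name):
    """Return the set of (folder, name) pairs a pipeline depends on, itself included."""
    repo_root = Path(repo_root)
    pending = [("pipeline", pipeline_name)]
    seen = set()
    while pending:
        folder, name = pending.pop()
        if (folder, name) in seen:
            continue
        seen.add((folder, name))
        path = repo_root / folder / f"{name}.json"
        # A missing file is still reported, it just cannot be expanded further.
        if path.is_file():
            document = json.loads(path.read_text(encoding="utf-8"))
            pending.extend(find_references(document))
    return seen
```

Calling required_objects with the path to a local clone and the pipeline name returns the folder and object name of everything the pipeline touches, which gives you a checklist of files to copy.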

I copied the JSON files from the Azure Data Factory repository to the same locations in the Azure Synapse workspace repository, as you can see below, making sure they went into the same branch that I was working on in Synapse Studio.

Same objects in Azure Synapse repository

You can see that there are some extra objects in the Azure Synapse repository. These get added by default when you connect an Azure Synapse workspace to a Git repository within Synapse Studio.

One key point is that it does this even if you do not select the option to import existing resources into Git.
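
I did the copy by hand, but the same step could be scripted if you have both repositories cloned locally. The below is only a minimal sketch under that assumption. The folder names (linkedService, dataset, dataflow, pipeline) and the function name are assumptions for illustration, so check them against your own repositories first.

```python
import shutil
from pathlib import Path

# Folders holding the objects the pipeline needs (assumed folder names).
OBJECT_FOLDERS = ("linkedService", "dataset", "dataflow", "pipeline")


def copy_objects(adf_repo, synapse_repo, folders=OBJECT_FOLDERS):
    """Copy *.json files folder by folder, keeping the same relative structure.

    Returns the relative paths that were copied, in the order they were copied.
    """
    adf_repo, synapse_repo = Path(adf_repo), Path(synapse_repo)
    copied = []
    for folder in folders:
        source_dir = adf_repo / folder
        if not source_dir.is_dir():
            continue  # nothing of this type in the Data Factory clone
        target_dir = synapse_repo / folder
        target_dir.mkdir(parents=True, exist_ok=True)
        for source_file in sorted(source_dir.glob("*.json")):
            # An existing file with the same name is overwritten.
            shutil.copy2(source_file, target_dir / source_file.name)
            copied.append(f"{folder}/{source_file.name}")
    return copied
```

The script leaves everything else in the Synapse clone alone, including the extra objects mentioned above. You would still need to commit and push the copied files to the branch that Synapse Studio is working on.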

Testing in Synapse Studio

Now, copying the Azure Data Factory objects this way is all well and good, but does it work?

Well, to test this thoroughly I recreated the two Azure SQL databases that were used in the initial data flow. The source database is based on the AdventureWorksLT sample database and the other database is blank.

Afterwards, I opened up Synapse Studio and went to the Manage hub, where I changed the linked services for the two databases to connect to the new ones.

Linked services in the Manage hub
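
Before changing anything in the Manage hub, it can help to know which linked services the copied files define. As a minimal sketch, the below lists them from a local clone of the repository. It assumes the files sit in a linkedService folder and that each one has a top-level name and a properties.type value. The function name is hypothetical.

```python
import json
from pathlib import Path


def list_linked_services(repo_root):
    """Return (name, type) pairs for each JSON file in the linkedService folder,
    in file-name order."""
    results = []
    for path in sorted((Path(repo_root) / "linkedService").glob("*.json")):
        document = json.loads(path.read_text(encoding="utf-8"))
        # Fall back to the file name if the JSON has no name property.
        name = document.get("name", path.stem)
        service_type = document.get("properties", {}).get("type", "unknown")
        results.append((name, service_type))
    return results
```

Each pair in the result is one linked service to review in the Manage hub after the copy.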

Once I had done that, I went into the Develop hub in Synapse Studio. I then opened the new Data flow and enabled Data flow debug.

I then tested the connection to the dataset, as you can see below. In addition, I was able to preview the data.

Data flow in Synapse Studio

Afterwards, I went to the Integrate hub. From there I ran the pipeline in Synapse Studio by clicking Debug, which succeeded, as you can see below.

Pipeline in Synapse Studio

To be absolutely sure, I went into the Azure SQL database that was used as the destination in the Azure Portal. To help with some terminology here, in pipelines and data flows the destination is called the sink.

I then logged into the Query editor and ran the below query to make sure that new rows were in the database.

Checking rows existed using Query editor

This confirms that the method worked, because that database was blank before I ran the pipeline.
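
If you would rather run this kind of check from a script than from the Query editor, a row count is usually enough. The below is only a sketch and is not the query I ran. It works with any Python DB-API connection and uses SQLite as a stand-in so it runs anywhere. The table name Customer is hypothetical. For an Azure SQL database you would pass in a connection from a driver such as pyodbc instead, which is an assumption I have not tested here.

```python
import re
import sqlite3  # stand-in driver so the sketch runs without Azure


def count_rows(connection, table_name):
    """Return the number of rows in table_name using any DB-API 2.0 connection."""
    # Table names cannot be passed as query parameters, so only allow plain
    # identifiers with an optional schema prefix (for example dbo.MyTable).
    pattern = r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?"
    if not re.fullmatch(pattern, table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    cursor = connection.cursor()
    cursor.execute(f"SELECT COUNT(*) FROM {table_name}")
    return cursor.fetchone()[0]


if __name__ == "__main__":
    # Demonstration against an in-memory SQLite database.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY)")
    print("before:", count_rows(demo, "Customer"))
    demo.executemany("INSERT INTO Customer VALUES (?)", [(1,), (2,), (3,)])
    print("after:", count_rows(demo, "Customer"))
```

Running the count before and after the pipeline gives the same kind of confirmation as above: a sink table that was empty should now have rows.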

DataWeekender lightning talks

In reality, something simple and effective like this can be explained within ten minutes as a lightning talk. With this in mind, if you have something like this that you want to share with the community, feel free to submit a lightning talk session to DataWeekender v4.2.

I thought I had better mention this since the call for speakers is still open. You can get to the Sessionize page by clicking on this DataWeekender v4.2 call for speakers link or on the image below.

DataWeekender v4.2

Final words about copying an Azure Data Factory pipeline to Synapse Studio

I hope this post about an alternative way to copy an Azure Data Factory pipeline to Synapse Studio helps some of you.

I like this method because it shows a simple and effective way to copy objects from Azure Data Factory to Azure Synapse.

I discovered this whilst looking to create more Azure DevOps templates after a previous post, which introduced Azure DevOps templates for Data Platform deployments. So, expect more templates to appear on my GitHub site.

Of course, if you have any comments or queries about this post, feel free to reach out to me.

Published in Azure Data Engineering, Azure Data Factory, Azure Synapse Analytics
