Last week I did some initial tests for Azure Synapse Link for SQL Server 2022. Which is currently in preview.
In addition, I showed a working version of it at Data Saturday Croatia. With this in mind, I thought I would share some of my experiences with it so far.
By the end of this post, you will have an understanding of this feature and know one way to test that it is working.
About Azure Synapse Link for SQL Server 2022
Azure Synapse Link for SQL Server 2022 allows you to replicate your data from a SQL Server 2022 database to an Azure Synapse Analytics dedicated SQL Pool.
It is one of the options for the new Azure Synapse Link for SQL feature that was announced during Microsoft Build. You can read more about this in the Microsoft post which also announced the Public Preview of Azure Synapse Link for SQL.
It is the main reason why I installed SQL Server 2022 on a virtual machine. Which I covered in a previous post about creating a dacpac for a SQL Server 2022 database. However, since than SQL Server 2022 container images are now available. Which opens up some more options.
Azure Synapse Link for SQL Server 2022 recently came out of preview and is now generally available.
Setting up Azure Synapse Link for SQL Server 2022
I setup Azure Synapse Link for SQL Server 2022 by following the instructions by Microsoft. Which shows how to get started with Azure Synapse Link for SQL Server 2022.
Because I had SQL Server 2022 installed in a virtual machine I installed the integration runtime on there before creating the linked service.
I decided to test using the WideWorldImporters sample database. Because I was initially thinking about doing some tests using the WideWorldImporters data generation.
One key point to mention here is that I did not change the database compatibility level after the database restore.
If you are looking to do the same with this database here are some key points to remember
- You need to create a master key in the database. In addition, if you restore the database, you must add this key again.
- You must create all the schemas required in the destination dedicated SQL Pool.
- If you recreate the dedicated SQL Pool, remember to add the master key back again.
- You must make sure you use managed identity when you create a linked service to connect to your landing zone on Azure Data Lake Storage Gen2. If you authenticate using another method, you will not be able to select it whilst setting up Azure Synapse Link. I know this from experience.
Azure Synapse Link table recommendations
One thing I liked when I went to create the Azure Synapse Link for SQL Server 2022 connection was the fact that it highlighted potential issues with tables straight away after selecting the database as a linked service. As you can see below.
After selecting the database to be the Link connection I could manage it in the new ‘Link connection’ option.
Straight away it gave me recommendations to change some tables to be a heap.
Once I clicked on the Start button, I had to wait a while for the connection to start running. So, you might need to be patient here. One good thing about this is that you can drill down into the link to view the progress of each table by clicking on the link name.
It is very clear when Azure Synapse Link for SQL Server 2022 is up and running.
After it was up and running, I checked that it was using the right storage account and container.
Azure Synapse Link Storage Account
After it was running, I checked to make sure the right storage account was being used by Azure Synapse Link. Luckily there is an easy way to check. Because if you scroll to the end of your link connection you can find a continuous run id.
You can than navigate into your storage account and check that the folder name is the same as the run ID.
Testing automatic replication
One question I got asked at Data Saturday Croatia was if it Azure Synapse Link for SQL automatically updated. With this in mind I thought I would show what happens when an update takes place. So that you can see for yourself.
As you can see below the replication first started at 3:50.
In the storage account, there is an initial Manifest file. Which was created at 3:55.
If I go navigate further into the ‘Tables’ folder and go into one of its subfolders I can see a JSON file which contains the table details and a FullCopyData folder.
I downloaded the JSON file so that I could verify which table it was. I than updated the comments column in the PurchaseOrders table within the SQL Server 2022 database.
Here is where things get interesting, because a new folder called ‘ChangeData’ appeared. Plus, it looks like the new file is a CSV file. Whereas the table used for the initial load was a parquet file.
In addition, the Manifest JSON file got updated with a new ‘Last modified’ value.
To check that the link had done its job, I checked that my update in the SQL Server 2022 database had made its way to the dedicated SQL Pool by running the below query.
One thing I will add is that it is probably a good idea come up with your own method to load test Azure Synapse Link for SQL Server 2022. Especially if you intend to test by using the WideWorldImporters database.
Because there are issues if you try and use the ‘DataLoadSimulation.PopulateDataToCurrentDate’ stored procedure within the database after setting it up with Azure Synapse Link for SQL Server 2022.
Final words about initial tests
I hope this post about my initial tests for Azure Synapse Link for SQL Server 2022 has been a good introduction to this feature for some of you.
It is clear that a lot of effort has been made to make it easy to configure and monitor this new feature. However, make sure you understand its known limitations.
If you want to learn more about the Azure Synapse Link for SQL feature there are a couple of sessions relating to it at the next Data Toboggan conference. Including one by myself.
Of course, if you have any comments or queries about this post feel free to reach out to me.
[…] Kevin Chant gives Synapse Link for SQL Server a try: […]
[…] To set some realistic expectations about this post, it does not cover Azure Synapse Link for SQL Server 2022. You find out more about that in another post I wrote called ‘Initial tests for Azure Synapse Link for SQL Server 2022‘. […]
[…] I am not going to go into too much detail here Azure Synapse Link for SQL Server 2022. Just a general overview about my setup for this post. For further details please read my other post. Where I covered my initial tests for Azure Synapse Link for SQL Server 2022. […]
[…] Because a while back I spotted something interesting whilst I was doing some initial tests for Azure Synapse Link for SQL Server 2022. […]
[…] Because a while back I spotted something interesting whilst I was doing someĀ initial tests for Azure Synapse Link for SQL Server 2022. […]
Hi Kevin, thank you for that great insight!
I got a question here, for Synapse link from SQL DB, will it capture any updates or deletion of data from the source table?
Hi Keane
Short answer is yes, at least for SQL Server 2022. I have published a blog post about some of the things that happen internally as well:
https://www.kevinrchant.com/2022/08/30/updates-and-deletes-with-azure-synapse-link-for-sql-server-2022/
[…] Because recently somebody asked if Azure Synapse Link for SQL Server 2022 captures updates and deletes after they had read a previous post. Where I covered my initial tests for Azure Synapse Link for SQL Server 2022. […]
[…] After doing some tests I published a post which covered my initial tests for Azure Synapse Link for SQL Server 2022. […]
[…] Which are represented in the red squares in the below diagram. I cover Azure Synapse Link for SQL Server 2022 in more detail in another post. Where I cover my initial tests for Azure Synapse Link for SQL Server 2022. […]
[…] You can find a recap about Azure Synapse Link for SQL Server 2022 in my post about my initial tests for Azure Synapse Link for SQL Server 2022. […]
[…] You can find a recap about Azure Synapse Link for SQL Server 2022 in my post about my initial tests for Azure Synapse Link for SQL Server 2022. […]