In this post I want to cover spreading your SQL Server wings with dedicated SQL Pools, as part of a series of posts about spreading your SQL Server wings with the Microsoft Intelligent Data Platform.
Since I covered Azure Synapse Analytics in my previous post, I thought I would go a bit deeper into dedicated SQL Pools so that I can cover things in more detail, including CI/CD options.
By the end of this post, you will have a better understanding of dedicated SQL Pools and where your SQL Server background can prove to be useful. Along the way I also provide plenty of links.
You can find out more about this series of posts in an introductory post I published about the series.
About dedicated SQL Pools
Dedicated SQL Pools provide long-term storage of data in Azure Synapse Analytics, ideally for large volumes of data.
For example, data that is suitable for a data warehousing solution, because the service is optimized to store and query large volumes of data.
In reality, dedicated SQL Pools originate from a long line of other Microsoft products, including Parallel Data Warehouse and Azure SQL Data Warehouse. Rather interestingly, there still appears to be an icon for Azure SQL Data Warehouse within the latest Azure architecture icon set.
Multiple DMVs within dedicated SQL Pools appear to reference Microsoft's Parallel Data Warehouse (PDW), as you can see in the below example.
```sql
-- Find the top 10 longest running queries
-- Note the DMV name is slightly different from SQL Server
SELECT TOP 10 *
FROM sys.dm_pdw_exec_requests
ORDER BY total_elapsed_time DESC;
```
Dedicated SQL Pools use what is known as Massively Parallel Processing (MPP) to work with data. Basically, behind the scenes your queries are spread across multiple compute nodes, so that the data can be processed faster.
I recommend that you read the Microsoft guide about the Synapse SQL architecture components. Since it contains a diagram that helps you visualize how this works.
Working with dedicated SQL Pools
Dedicated SQL Pools are now integrated within Azure Synapse Analytics. You can work with them in various places within Synapse Studio.
For instance, in the Manage hub you can create and change them. Once created, you can navigate to them and drill down into their objects within the Data hub.
Plus, you can work with SQL scripts in the Develop hub.
In addition, you can manage Azure Synapse Link for SQL in the Integrate hub.
For those wondering, you can use Azure Synapse Link for SQL to load data into dedicated SQL Pools. Something I have covered in multiple posts. Plus, it happens to be another solution within the Microsoft Intelligent Data Platform.
Outside of Synapse Studio
You can also work with dedicated SQL Pools outside of Synapse Studio using various applications. Including SQL Server Management Studio and Azure Data Studio.
As long as you have the right network connectivity and permissions, you can connect these applications to the dedicated SQL Pool endpoint, which you can find in the Azure Portal within the overview section of Azure Synapse Analytics.
SQL Server background
Your SQL Server background can prove to be very useful when working with dedicated SQL Pools, due to the large amount of T-SQL that they support.
I know this probably surprises some, since it is a different product in the cloud. To help give you a better idea, below are some examples of the T-SQL functionality you can use with dedicated SQL Pools.
- Dynamic Data Masking
- Column-level encryption
- Transactions (BEGIN TRAN, THROW and RAISERROR)
- Typical DML commands (SELECT, INSERT, UPDATE, DELETE & MERGE)
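To show what this looks like in practice, below is a short sketch that combines a couple of the items above. The table and column names (dbo.Customers, Email, CustomerId) are made up for illustration, so adjust them to suit your own schema.

```sql
-- Hypothetical example: add a dynamic data mask to an email column
ALTER TABLE dbo.Customers
ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

-- Familiar transaction syntax also works
BEGIN TRAN;

    UPDATE dbo.Customers
    SET Email = 'updated@example.com'
    WHERE CustomerId = 1;

    IF @@ROWCOUNT = 0
        THROW 50001, 'No customer found to update.', 1;

COMMIT TRAN;
```

As you can see, this reads just like the T-SQL you already write for SQL Server.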
In addition to the above, there is T-SQL syntax you can use which is solely for dedicated SQL Pools.
For example, in dedicated SQL Pools you can specify how you want the data for your tables distributed behind the scenes. These are known as distributed tables.
There are three types of distribution to choose from:
- Hash – Spreads the data across distributions via a deterministic hash function on a chosen column. Ideally suited for large fact tables.
- Round-robin – Spreads the data evenly across all distributions behind the scenes, regardless of the values. This is the default.
- Replicated – Stores a full copy of the table on each compute node. Ideally suited for small dimension tables.
You can specify the distribution type as part of your Create Table statement.
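To give you an idea of the syntax, below is a minimal sketch of a CREATE TABLE statement with a distribution type specified. The table and columns are hypothetical; ideally your distribution column spreads the data evenly and is used in joins.

```sql
-- Hypothetical fact table, hash distributed on CustomerId
CREATE TABLE dbo.FactSales
(
    SaleId      INT            NOT NULL,
    CustomerId  INT            NOT NULL,
    SaleDate    DATE           NOT NULL,
    Amount      DECIMAL(18,2)  NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerId),
    CLUSTERED COLUMNSTORE INDEX
);
```

Swapping HASH(CustomerId) for ROUND_ROBIN or REPLICATE gives you the other distribution types.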
It is worth noting as well that most data types are supported within dedicated SQL Pools. However, there are some exceptions. You can find out more within the Transact-SQL features supported in Azure Synapse SQL documentation online.
CI/CD for dedicated SQL Pools
In reality, you can perform CI/CD for dedicated SQL Pools the same way as you do for SQL Server databases. Using either a state-based or a migration-based approach.
One key difference is that you must connect to the dedicated SQL Pool endpoint for your deployments, instead of a server name.
To help with some jargon here, a state-based database deployment is where you deploy updates based on an entire state of something. Whereas a migration-based deployment is where you tend to apply incremental updates.
For example, for state-based deployments you can create a database project, which essentially contains the schema of your database in code form. You can then create what is known as a dacpac file, which is a compressed file based on the contents of the database project. You can use the dacpac file to deploy to one or more databases within dedicated SQL Pools.
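For instance, once you have a dacpac file you can deploy it with the SqlPackage command line tool. Below is a rough sketch of a publish command; the workspace name, database name, credentials and file path are placeholders you would replace with your own values.

```shell
# Deploy a dacpac to a dedicated SQL Pool (placeholder values)
SqlPackage /Action:Publish \
    /SourceFile:"MyDatabaseProject.dacpac" \
    /TargetServerName:"myworkspace.sql.azuresynapse.net" \
    /TargetDatabaseName:"MyDedicatedSQLPool" \
    /TargetUser:"sqladminuser" \
    /TargetPassword:"$SQL_PASSWORD"
```

Note how the target server is the dedicated SQL Pool endpoint mentioned earlier, rather than a traditional server name.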
In reality, there are various ways you can create a dacpac file and deploy it within both Azure DevOps and GitHub. Below is an example of a completed workflow within GitHub.
I showed a couple of ways to perform state-based deployments in the January 2023 edition of the Azure Synapse Analytics and Microsoft MVP series. In addition, I have shared various posts and have made templates available on my GitHub site.
I want to add one final thing in relation to dacpacs. If your pipeline involves the use of SqlPackage it is worth upgrading it on your Azure Pipelines Agent or your GitHub Actions Runner.
Doing this will enable support for the DW_COMPATIBILITY_LEVEL database scoped configuration setting. So that you can perform CI/CD for tables that have Multi-Column Distribution (MCD) enabled.
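To illustrate, below is a sketch of what that looks like once SqlPackage has been upgraded. The database scoped configuration is the documented way to enable MCD, while the table and column names are made up for illustration.

```sql
-- Enable multi-column distribution support (run once per database)
ALTER DATABASE SCOPED CONFIGURATION SET DW_COMPATIBILITY_LEVEL = 50;

-- Hypothetical table hash distributed on multiple columns
CREATE TABLE dbo.FactOrders
(
    OrderId    INT  NOT NULL,
    ProductId  INT  NOT NULL,
    OrderDate  DATE NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(OrderId, ProductId),
    CLUSTERED COLUMNSTORE INDEX
);
```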
I show a couple of examples of how to do this in a post I wrote, which showed how to install SqlPackage to work with dacpacs for serverless SQL Pools.
If you prefer migration-based deployments, you can look to use the DbUp .NET library along with the dbops PowerShell module, which will work for both Azure DevOps and GitHub deployments.
In addition, it is worth noting that you can also use premium tools to do migration-based deployments. Just be aware that they come with a cost.
I hope this post about spreading your SQL Server wings with dedicated SQL Pools has proved to be useful. I wanted to show just how useful your SQL Server background can be when looking to work with them, even when you look to perform CI/CD.
Of course, if you have any comments or queries about this post feel free to reach out to me.