This post is the second in a series about how Azure Test Plans can be used for deployments to various Data Platform services. My first one in this series was an Azure Test Plans example for Azure Data Factory.
I decided to do this series to give others ideas of how to use Azure Test Plans and to introduce some of you to features in Azure Data services. Plus, I promised a Data Platform MVP that I would do a Power BI example, which will conclude this series.
Overview of Azure Synapse Analytics
Just a quick recap: Azure Synapse Analytics is an integrated Data service available in Azure. Its aim is to make Data ingestion, Big Data analytics and Data Warehousing available in one place instead of across multiple services.
For example, in the past you would ingest data using Azure Data Factory, perform analytics using Azure Databricks and use Azure SQL Data Warehouse for your Data Warehousing needs. Now you can do all three and more within Azure Synapse Analytics.
For instance, you can create pipelines in Azure Synapse Analytics to ingest data. In addition, you can perform Analytics by working with Spark Pools and use SQL Pools for your Data Warehousing needs.
In fact, those of you with a SQL Server background will find working with SQL Pools very familiar, because they support a lot of the T-SQL language, as you will see in this post. You can read the documentation to see the full list of Transact-SQL features supported in Azure Synapse SQL.
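To show how familiar the T-SQL surface is, a query like the sketch below runs unchanged in a dedicated SQL Pool. The table and column names are purely illustrative, not from any project in this post.

```sql
-- Familiar T-SQL constructs such as CTEs and window functions
-- work as-is against a dedicated SQL Pool.
WITH SalesByCustomer AS
(
    SELECT CustomerKey,
           SUM(SalesAmount) AS TotalSales
    FROM dbo.FactInternetSales
    GROUP BY CustomerKey
)
SELECT CustomerKey,
       TotalSales,
       RANK() OVER (ORDER BY TotalSales DESC) AS SalesRank
FROM SalesByCustomer;
```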
Another good thing about this service is that it’s easier to transfer data between your Spark and SQL Pools if they are within the same Azure Synapse workspace.
You can also manage Power BI reports and dashboards within Azure Synapse Analytics.
Azure Synapse Analytics provides Synapse Studio which you can use to manage all the above. You can find out more about Synapse Studio in a previous post I published which was a five-minute crash course about Synapse Studio.
Azure Test Plans example for Azure Synapse Analytics
In this example I use a requirements-based test suite with Azure Synapse Analytics. I am going to use the GitHub repository I created in another post, which was about creating a dacpac for an Azure Synapse Analytics dedicated SQL Pool using GitHub Actions.
In reality, this can be any database project that you have made a copy of. I just wanted to remind everybody that you can use Azure Test Plans for items in GitHub.
For this example, I want to convert this database project so that it can be used as a Data Warehouse in a dedicated SQL Pool. In addition, the Data Warehouse has to be deployed to development, staging and production environments.
One way a Product Owner can do this is within Azure Boards, by creating a Feature for the Data Warehouse. From there, the Feature can contain three separate Product Backlog Items for the three environments. Of course, if the Data Warehouse was more complex it could be an Epic instead.
Note that the three Product Backlog Items shown above all have a 'Data warehouse' tag added to them, because this tag is needed later in this post.
Creating a requirements-based suite in Azure Test Plans
I first create a new test plan in Azure Test Plans. From there I start creating a requirements-based suite by going into the new test plan and clicking on the ellipsis. From there I drill down until I get the option to create a requirements-based suite.
Once I select the requirements-based suite option a query designer appears, which may look familiar to those of you who have worked with query designers before.
In the designer I add an additional filter to the query, so that it looks for all work items that contain the tag 'Data warehouse'. I then run the query.
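Behind the scenes the query designer builds a WIQL (Work Item Query Language) query. A rough equivalent of the query described above might look like the sketch below; the exact fields and work item type depend on your process template.

```sql
-- WIQL sketch: find all Product Backlog Items carrying the 'Data warehouse' tag.
SELECT [System.Id], [System.Title], [System.Tags]
FROM WorkItems
WHERE [System.WorkItemType] = 'Product Backlog Item'
  AND [System.Tags] CONTAINS 'Data warehouse'
```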
Once I am happy with the results, I select all the results and then click the ‘Create suites’ button. It then returns me to the test plan with new requirements-based suites.
From here it gives me a blank canvas to create one or more test cases in each test suite. I like to think of it as a blank canvas because of the image used when inviting you to add a test case.
On a side note, if you do not like the order of the test suites returned you can rearrange them by using drag and drop.
Copying a test case
One useful thing I can do is create a test case in one test suite and then copy it to the other ones to save some time.
For example, say I have created the below test case.
In reality, you should use unit tests to check the existence of objects like the schema and tables instead. However, I have kept them in this test case to use as an example.
You can read more about this in another of my posts, which covers doing unit tests for an Azure Synapse Analytics dedicated SQL Pool.
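For illustration, an existence check like the one in this test case can be scripted in T-SQL along these lines. The schema and table names below are placeholders, assuming the project's objects live in the dbo schema.

```sql
-- Fail loudly if the expected schema or table is missing from the pool.
-- 'dbo' and 'DimCustomer' are placeholders for your project's objects.
IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = 'dbo')
    THROW 50001, 'Expected schema is missing.', 1;

IF NOT EXISTS (
    SELECT 1
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE s.name = 'dbo' AND t.name = 'DimCustomer'
)
    THROW 50002, 'Expected table dbo.DimCustomer is missing.', 1;
```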
As you can see above, some of the tables use different distribution types. This is because, even though you can use T-SQL to query dedicated SQL Pools, their architecture is different from SQL Server. Behind the scenes, compute and storage are distributed across multiple nodes to improve performance.
You control how the data for a table is spread amongst those nodes by declaring a distribution type for the table. Microsoft has documentation on how to design tables using dedicated SQL pool in Azure Synapse Analytics.
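As a sketch of what declaring a distribution type looks like, the two tables below use different strategies. Table and column names are illustrative only.

```sql
-- Hash-distributed table: rows are spread across distributions by ProductKey,
-- a common choice for large fact tables.
CREATE TABLE dbo.FactSales
(
    SaleKey     BIGINT        NOT NULL,
    ProductKey  INT           NOT NULL,
    SalesAmount DECIMAL(18,2) NOT NULL
)
WITH (DISTRIBUTION = HASH(ProductKey), CLUSTERED COLUMNSTORE INDEX);

-- Replicated table: a full copy is cached on each compute node,
-- which suits smaller dimension tables.
CREATE TABLE dbo.DimProduct
(
    ProductKey  INT           NOT NULL,
    ProductName NVARCHAR(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);
```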
Another step in the above test case which is not visible is to check that the data in the DimCustomer table is anonymised, because ideally Personally Identifiable Information (PII) should not be visible in your development environment.
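A spot check for that step could be a query along these lines, assuming a hypothetical EmailAddress column and an anonymisation convention that rewrites addresses to a placeholder domain.

```sql
-- Spot check: count rows where the email column does not use the
-- anonymised placeholder domain. Column name and convention are assumptions.
SELECT COUNT(*) AS SuspectRows
FROM dbo.DimCustomer
WHERE EmailAddress NOT LIKE '%@example.com';
```

A result of zero suspect rows would let this step of the test case pass.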
Copy a test case in Azure Test Plans
From here I can copy a test case for use in another test suite. Doing this can save you time if you have test cases in different suites which will be similar.
In this example, I can copy my test case for development by clicking the ellipsis next to the test case and selecting 'Copy test cases'.
From there I can fill out the details to copy the test case as below.
Afterwards, the test case is copied to the root of the test suites, where I can move it to another test suite.
For this test plan, I copied the test case to the other two environments. From there I changed the test cases for their different needs.
For example, in the production test suite I renamed the test case. I then replaced the step to check for anonymised data with a step that checks for Dynamic Data Masking instead.
I did this because dedicated SQL Pools support a lot of the same security features as Azure SQL Database, such as Dynamic Data Masking and row-level security. For those with a Snowflake background, Dynamic Data Masking in dedicated SQL Pools is similar to the Dynamic Data Masking feature in Snowflake.
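As a sketch of what the production test step might verify, the T-SQL below adds a mask to a hypothetical email column and then lists the masked columns via a catalog view, which the test case can check.

```sql
-- Add a built-in email mask to a column (column name is an assumption).
ALTER TABLE dbo.DimCustomer
ALTER COLUMN EmailAddress ADD MASKED WITH (FUNCTION = 'email()');

-- The test step can then confirm the mask is registered.
SELECT OBJECT_NAME(object_id) AS TableName,
       name                   AS ColumnName,
       masking_function
FROM sys.masked_columns;
```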
You can see which security features are available in the Security section of the T-SQL feature guide I mentioned earlier in this post.
Check the test case results for Azure Synapse Analytics
I can check the test case results for these dedicated SQL Pools in Azure Synapse Analytics in a number of places depending on what tests I do.
For instance, to check that the deployment ran OK in GitHub Actions I can use the GitHub CLI or go to the Actions tab in GitHub as below.
To check the results for the dedicated SQL Pool I have a few different options, including running the queries in an application like Azure Data Studio or within Synapse Studio. For the purposes of this example, I decided to use Synapse Studio as you can see below.
When I go to the Data Hub in Synapse Studio, I can see that all three pools exist.
However, to check the results of the scripts in the test cases I have to run them in the Develop hub.
I can rename and save the SQL scripts that I create in the Develop hub, which means that I can reuse them in different pools by using the drop-down menu if the pools are in the same workspace. Doing this can save a lot of time.
In addition, I can add Power BI reports in the Develop hub as well. Just in case I need to check that Power BI reports work in any of the environments.
In fact, I recently found out something interesting about linking Power BI workspaces to Synapse Studio as well, which I will cover in another post.
One key point to remember if you do this is that it is common to have a different Synapse workspace for each environment. So, the different SQL Pools could be in different workspaces.
For example, the Development, Staging and Production environments in this post might be in totally separate workspaces in different Azure subscriptions.
To keep all your workspace objects like SQL scripts synchronized between them you can implement Continuous integration and delivery for Azure Synapse workspace. Doing this allows you to only have to update one workspace and have the others updated automatically using deployment pipelines.
However, bear in mind that this only works for artifacts inside your workspace; it does not cover the schemas of your SQL Pools. I cover one way to synchronize the schema across different SQL Pools in a previous post, which was about creating a dacpac for an Azure Synapse Analytics dedicated SQL Pool using GitHub Actions.
More about Azure Synapse Analytics
If you want to learn more about Azure Synapse Analytics, Microsoft Learn has various learning paths for you, including one about how to work with Data Warehouses using Azure Synapse Analytics.
It’s a good idea to go through the other learning paths as well though to discover what else you can do.
For example, those of you with a SQL Server background can find out more about Apache Spark by going through another learning path, which is called 'Perform data engineering with Azure Synapse Apache Spark Pools'.
I hope this Azure Test Plans example for Azure Synapse Analytics has been an interesting insight.
I had two aims for this post: to give others ideas about how Azure Test Plans can be used with Azure Synapse Analytics and to introduce some of you to parts of Azure Synapse Analytics.
Of course, if you have any comments or queries about this post feel free to reach out to me.