Migrate dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

Reading Time: 8 minutes

In this post I want to cover one way you can migrate dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse using Azure DevOps.

I thought I better do this post since I showed how to do something similar for serverless SQL Pool objects in a previous post.

To clarify, when I say dedicated SQL Pools I mean the ones that are part of Azure Synapse Analytics. Which I covered in a previous about spreading your SQL Server wings with dedicated SQL Pools.

For those looking to find out more about Microsoft Fabric and how to get started with it I recommend reading one of my previous posts. Where I covered how you can create your own Microsoft Fabric environment.

By the end of this post, you will know how to migrate dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse using Azure DevOps. Along the way I share plenty of links and advice.

One key point to note is that Microsoft Fabric is now generally available. You can read more about this in detail in the official post by Ryan Majidimehr.

Managing expectations

To manage expectations, this post only covers database schema objects. Plus, I need to highlight the fact that this solution has some interesting quirks. Some of which I highlight in this post.

Even though there are some quirks, I still want to show this solution. So that others can see it working and I can highlight a few important facts. Plus, share a template you can use to test this yourself.

I also want to point out that I use a YAML pipeline within Azure DevOps for this post. I did this in order to make the code publicly available for you all to work with.

You can download an example of code used in this post by going to the AzureDevOps-dedicatedSQLPoolToFabricDW GitHub repository. Which I made put in GitHub a couple of days ahead of this post.

My advice is to import this repository into Azure Repos and then connect to it in Azure Pipelines. I cover how to do this in a post I wrote about how to connect a Database Project in Azure Repos to Azure Pipelines.

Prerequisites for this example

For this post I created a dedicated SQL Pool. Which creates a database with the same name in it by default.

In addition, I used the same Data Warehouse that I used in my previous post about migrating serverless SQL Pools. As you can see below.

I also have created a YAML pipeline in Azure Pipelines within Azure DevOps. So that I can perform CI/CD. Which allows me to the migration either once or constantly.

In order to help you visualize how this works I created the below diagram.

In order for the pipeline to work, it uses a YAML pipeline that is stored as a file in a Git repository within Azure Repos. Personally, I think this makes YAML pipelines more portable.

In addition, I am using a self-hosted Azure Pipeline agent. Which has the latest version of SqlPackage installed. Plus, it has the latest version of the .NET SDK installed. For reasons you will read about later on.

You can read more about some of the options to work with SqlPackage within pipelines in a blog post I wrote. In that post I show how to install SqlPackage to work with dacpacs for serverless SQL Pools.

Migrating serverless SQL Pool objects to Microsoft Fabric Data Warehouse

As you will see below, the code I wrote is different to what I used in my post about serverless SQL Pools.

I first extracted the database objects into separate SQL files. Organized into different folders for each schema and object type in the database. I was able to achieve this thanks to the ExtractTarget property for the extract option.

      steps:
      # First extract the files from the database
      - task: PowerShell@2
        inputs:
          targetType: 'inline'
          script: |
            # PowerShell to extract contents of a database in a serverless SQLPool into files
            
            # SqlPackage /Action:Extract /TargetFile:$(Build.ArtifactStagingDirectory)\$(TargetFile) /p:VerifyExtraction=true /SourceServerName:$(SourceSQLPoolEndPoint) /SourceDatabaseName:$(SourceDB) /SourceUser:$(SQLPooluser) /SourcePassword:$(SQLPoolpw)
            SqlPackage /Action:Extract /TargetFile:$(Build.ArtifactStagingDirectory)\$(TargetFile) /p:IgnoreExtendedProperties=true /p:ExtractTarget=SchemaObjectType /p:VerifyExtraction=true /SourceServerName:$(SourceSQLPoolEndPoint) /SourceDatabaseName:$(SourceDB) /SourceUser:$(SQLPooluser) /SourcePassword:$(SQLPoolpw)

You might notice that the SqlPackage command extracts the files to a filename and not a folder name. Rather interestingly, this is because it appears that SqlPackage still requires a dacpac to be referenced when using this command.

However, I can confirm that it extracts the files to a folder that ends with ‘.dacpac’ instead of an actual dacpac file. Of course, this is done behind the scenes on the Azure Pipelines agent.

Afterwards, the pipeline runs another PowerShell task which renames the folder to something more sensible that does not end with ‘.dacpac’.

In addition, the below code removes any lines of code that starts with the word WITH. In order to remove all references to indexes and distribution types that can be created at the end of a SQL file for a table. As you can see below.

      # Next, remove references to indexes and distribution types from all SQL files
      - task: PowerShell@2
        inputs:
          targetType: 'inline'
          script: |
            Set-Location $(Build.ArtifactStagingDirectory)

            # To avoid any confusion rename the dacpac folder to the Target Folder name
            Rename-Item $(TargetFile) $(TargetFolder)

            cd $(TargetFolder)

            $SQLFiles = Get-ChildItem -Recurse -File *.sql | Select-Object -ExpandProperty FullName
            foreach($file in $SQLFiles){
              # Remove any line that starts with 'WITH'
                (Get-Content "$file") | Where-Object {$_ -notmatch '^\s*WITH'} | Set-Content "$file
              }

I had to this in order for the project to build properly with a target platform that could not only build properly, but deploy as well. As you will see next.

Creating a database project based on the dedicated SQL Pool objects

Once the above has finished, the pipeline runs another PowerShell task. Which creates a new database project in the root folder where the SQL files were extracted.


      - task: PowerShell@2
        inputs:
          targetType: 'inline'
          script: |
            cd $(Build.ArtifactStagingDirectory)\$(TargetFolder)
            dotnet new sqlproj -tp Sql150

What makes this database project interesting is that it is a SDK-style database project using the ‘Microsoft.Build.Sql‘ .NET SDK for database projects. In other words, a newer way to create efficient database projects.

Which I think is great. Because it creates a more efficient sqlproj file that contains less lines, as you can see below.

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build">
  <Sdk Name="Microsoft.Build.Sql" Version="0.1.10-preview" />
  <PropertyGroup>
    <Name>Fabric_HomemadeProject</Name>    <DSP>Microsoft.Data.Tools.Schema.Sql.Sql150DatabaseSchemaProvider</DSP>
    <ModelCollation>1033, CI</ModelCollation>
  </PropertyGroup>
</Project>

I much prefer this to the large sqlproj files that are created with the older method. For example, the sqlproj file that is part of my GitHub-SynapseDedicatedSQLPool repository.

Another benefit when creating a database project this way is that all the SQL files extracted in the subfolders are automatically recognized.

For example, if I have a SQL file in the ‘\dbo\Views‘ subfolder I do not have to add it to the sqlproj file in the root directory. Like with the other method. This is due to the fact that this type of project has a default globbing pattern that identifies SQL files.

You can find out more about the this default globbing pattern in the ‘Microsoft.Build.Sql‘ functionality page.

In addition, you can find out more about how to create database projects this way within the ‘Microsoft.Build.Sql.Templates‘ page in GitHub. Alternatively, you can view the details in nuget documentation for ‘MSBuild.Sdk.SqlProj‘.

Database Project target platform

Another key point to highlight about the database project is that the target platform for it is SQL Server 2019. Which was set with the ‘-tp Sql150’ option. You can view all the current target platform options for the sqlproj command by typing in the below code.

dotnet new sqlproj -h

After testing I discovered that using SQL Server 2019 target platform better for deployments than both the serverless and dedicated SQL Pool target platforms.

When I tried to build a database project with a serverless SQL Pool target platform it fails when tables are specified. Due to the fact that CREATE TABLE is not supported by build applications for the serverless SQL Pool target platform.

When I tried a project based on dedicated SQL Pools that includes tables it fails during deployments. I suspect due to compatibility issues.

In reality, I tested a various options to try and overcome the above obstacles before I identified the best target platform.

With this in mind, I am hoping that one of the below happens in the future relating to this.

Either the serverless target platform is updated in build applications so that they support tables or a new target platform becomes available specifically for Microsoft Fabric Data Warehouses.

In addition, I would also like a more graceful way to migrate the database objects ongoing.

Building the database project

I had to build my new project using a .NET core task instead of Visual Studio build task in Azure Pipelines. Due to the fact that I had created an SDK database project.

      - task: DotNetCoreCLI@2
        displayName: 'Build the dacpac using dotnet'
        inputs:
          command: 'build'
          projects: '$(Build.ArtifactStagingDirectory)\$(TargetFolder)\Fabric-HomemadeProject.sqlproj'
          arguments: '--configuration $(BuildConfiguration)'

Once done, I published the dacpac as an artifact in Azure DevOps.

      - task: PublishBuildArtifacts@1
        displayName: 'Publishes dacpac as an artifact'
        # Publishes the dacpac as part of an artifact within Azure DevOps
        inputs:
          PathtoPublish: '$(System.ArtifactsDirectory)\$(TargetFolder)\bin\$(BuildConfiguration)'
          ArtifactName: $(SQLPoolartifactname)
          publishLocation: 'Container'

YAML code to deploy the dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

After spending a reasonable amount of time testing various options the below code worked and I was able to deploy the contents of the dacpac to a Microsoft Fabric Data Warehouse.

- stage: FabricDW
  displayName: 'Deploy Datawarehouse'
  jobs:
    - deployment: 'FabricDW'
      displayName: 'Fabric Datawarehouse'
      environment: Production

      pool: 
        name: $(agentpool)

      strategy:
        runOnce:
          deploy:
            steps:
                  - task: DownloadBuildArtifacts@0
                    displayName: 'Download Artifacts'
                    inputs:
                      buildType: 'current'
                      downloadType: 'specific'
                      artifactName:  '$(SQLPoolartifactname)'
                      downloadPath: '$(System.ArtifactsDirectory)'

                  - task: SqlAzureDacpacDeployment@1
                    displayName: 'Install DACPAC on Fabric DataWarehouse'
                    inputs:
                      azureSubscription: $(AzureSubscription)
                      AuthenticationType: 'aadAuthenticationPassword'
                      aadSqlUsername: '$(aadSqlUsername)'
                      aadSqlPassword: '$(aadSqlpw)'
                      ServerName: $(DestSQLConnString)
                      DatabaseName: '$(DestinationDW)'
                      deployType: 'DacpacTask'
                      DeploymentAction: 'Publish'
                      DacpacFile: '$(System.ArtifactsDirectory)\$(SQLPoolartifactname)\$(Targetfile)'
                      AdditionalArguments: '/p:VerifyCollationCompatibility=False /p:AllowIncompatiblePlatform=True'

Even though it is similar to the code that worked for the deployment from a serverless SQL Pool there is one significant difference. I added the AllowIncompatiblePlatform SqlPackage Publish property to the AdditionalArguments option.

I had to do this to allow the dacpac to complete successfully. To verify that the pipeline worked, I checked that it had completed in Azure DevOps.

Completed pipeline in Azure DevOps after I migrated serverless SQL Pool objects to a Microsoft Fabric Data Warehouse — Completed pipeline in Azure DevOps

After checking that the pipeline worked in Azure DevOps, I looked at the below Data Warehouse in Microsoft Fabric and checked that the new view was there.

One thing I want to highlight about this method is that it is fine for deploying some objects multiple times. For example, views.

However, it will only allow you to deploy a table once. You will get an error like the one below if you try to deploy an update that refers to a table that already exists in the Microsoft Fabric Data Warehouse again.

There is already an object named 'Tablea' in the database

In reality, there are a few different ways to work around this. For example, you can simply migrate other object types instead of tables.

Another option is to configure the pipeline to delete all the tables in the Data Warehouse before deploying them again. I am not a big fan of this option. Which is why I did not put that in this example.

Final words about migrating dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

I hope this post about one way that you can migrate dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse using Azure DevOps has been an interesting insight.

It took a while to figure out the best way to currently do this. Hopefully, future updates or a more graceful mechanism will allow for smoother options.

In the meantime, the AzureDevOps-dedicatedSQLPoolToFabricDW GitHub repository is available for you to download. So that you can test for yourselves and tweak as you see fit.

Of course, if you have any comments or queries about this post feel free to reach out to me.

One Comment

Prepare Azure DevOps for Microsoft Fabric Git integration - Kevin Chant

[…] Right now, you can run a pipeline in Azure DevOps to deploy schema updates to Microsoft Fabric Data Warehouse. Just like the method I showed in my other post about migrating dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse. […]

Migrate dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

Managing expectations

Prerequisites for this example

Migrating serverless SQL Pool objects to Microsoft Fabric Data Warehouse

Creating a database project based on the dedicated SQL Pool objects

Database Project target platform

Building the database project

YAML code to deploy the dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

Final words about migrating dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

Related

One Comment

Leave a Reply Cancel reply

Migrate dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

Managing expectations

Prerequisites for this example

Migrating serverless SQL Pool objects to Microsoft Fabric Data Warehouse

Creating a database project based on the dedicated SQL Pool objects

Database Project target platform

Building the database project

YAML code to deploy the dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

Final words about migrating dedicated SQL Pool objects to a Microsoft Fabric Data Warehouse

Share this:

Related

One Comment

Leave a Reply Cancel reply