Skip to content

Testing of the new ESG data estate capability for sustainability reporting in Microsoft Fabric

Reading Time: 7 minutes

In this post I want to share the results of my initial testing of the new ESG data estate capability for sustainability reporting in Microsoft Fabric. Which is one of the capabilities offered as part of the Sustainability data solutions in Microsoft Fabric.

Please note: After this post was published a new ESG metrics capability was introduced, making a significant part of this post obsolete. There is now an updated post that covers a review of the ESG metrics capability in Microsoft Fabric after the Microsoft Ignite announcement.

To help with some jargon here, ESG stands for Environmental, Social and Governance. Which are three of the main topics that companies need to report about when reporting on sustainability.

Sustainability is an important topic at the moment. Especially in Europe where a lot of companies are due to report on sustainability metrics. Due to the Corporate Sustainability Reporting Directive (CSRD).

I am really excited to deliver this post. Because sustainability is something that was taught to me in my earlier years. Now I get to cover how to you can utilize Microsoft Fabric to report on it in this post.

By the end of this post, you will know the results of my initial testing of the new ESG data estate capability in Microsoft Fabric. Along the way I share plenty of links.

One key point I want to highlight is that the Sustainability data solutions in Microsoft Fabric are currently in preview and are subject to change. Another point I want to highlight is that I run the below with a trial capacity in my own Microsoft Fabric environment.

How I found out the ESG data estate capability was finally available in North and West Europe

In reality, the ability to deploy the Sustainability data solutions within Microsoft Fabric was available in some Azure regions before March this year.

However, it only became available in the North and West Europe Azure regions during the first week of March. Which surprised me because sustainability is such an important topic in Europe.

Prior to March I was able to change the tenant settings for Sustainability solutions in the Admin portal beforehand. Plus, I deployed the Sustainability data solution to Microsoft Fabric through the Microsoft Cloud Solution Centre.

Deploying the Sustainability data solution in the Microsoft Cloud Solution Center
Deploying the Sustainability data solution in the Microsoft Cloud Solution Center

However, when I went into Microsoft Fabric, I could not see the Industry Solutions option at that time. Which is when I checked the Azure region availability for Sustainability data solutions and discovered that it was not yet available.

In fact, I found out it was available just as I was about to present a session for the Microsoft Fabric User Group Denmark last month.

Deploying the new ESG data estate capability in Microsoft Fabric

When I discovered that the Industry Solutions were finally available in Microsoft Fabric, I started the process to deploy the new ESG data estate capability.

I created a new workspace and then selected the “Industry Solutions” experience in the bottom-left hand corner. I then selected “Sustainability solutions” as below.

Selecting Sustainability solutions to perform initial tests of the new ESG data estate capability in Microsoft Fabric
Selecting Sustainability solutions

I then provided the name ESG_data_estate for my solution.

One key point about the solution name is that it will become the prefix for all your items that are deployed as part of the solution. I show this later in this post.

After providing a name I then selected the ESG data estate capability from the provided options. As you can see, there are other options available. Including the Microsoft Azure emissions insights capability, which allows you to visualize your Azure emissions data.

Selecting ESG data estate to perform initial testing of ESG data estate capability in Microsoft Fabric
Selecting ESG data estate

After selecting ESG data estate a screen appears explaining the purpose of the solution.

Screen that explains the purpose of ESG data estate
Screen that explains the purpose of ESG data estate

I like this screen because as well as describing the purpose of the solution it shows a nice diagram of how the data should ideally flow between the Lakehouses on the right-hand side.

Where the source data is first ingested into a raw Lakehouse. From there, the raw data is transformed to Delta tables in a processed Lakehouse. Which are then used to create metrics in a computed Lakehouse.

Interesting observation about ESG data estate Lakehouses

One interesting observation is the naming convention for the three main Lakehouses involved. Instead of bronze, silver and gold the capability names them ingested, processed and computed.

As you can see, the screen above also shows that all the items contain the solution prefix. Like I mentioned earlier. Just be aware that if you give your capability a long name it will be harder to navigate for objects within the workspace.

With this in mind, when I reference items in in the rest of this post I will refer to their prefix as ‘{Prefix}’. For easier reading.

Anyway, I then clicked on the green ‘Deploy to workspace’ button. Once the deployment had completed, I went into the workspace.

Creating demo data in the ESG data workspace

I wanted to install the ESG data estate demo data within the workspace first. So, I opened up the ‘{Prefix}_LoadDemoDataInProcessedESGDataTables_INTB’ notebook (the name will vary depending on the name of your solution).

I checked that all the cells in the notebook had been pre-populated with the correct information. Plus, that the relevant Lakehouses had already been added to the notebook. I then clicked on the ‘Run all’ button to run all the cells in the notebook.

After all the cells are completed, I went to the ‘{Prefix}_ProcessedESGData_LH’ Lakehouse and confirmed that the tables containing demo data had been created.

Checking tables in the processed Lakehouse
Checking tables in the processed Lakehouse

This is a good way to get some demo data. However, I must admit that it would be nice to have an alternative demo where data first appears in the raw Lakehouse. To help everybody realize the full flow of data.

Creating the metrics in the ESG data workspace

After checking the tables were there, I then ran the notebooks to create the tables which contain the ESG metrics. To generate the required tables in the computed Lakehouse.

Which you can read about these notebooks in detail in the ‘Compute analytical datasets and metrics‘ section of the ESG data estate documentation online.

All the default notebooks follow a similar pattern to create metrics for each area. First one notebook creates the Fact tables in the Files section within of the computed Lakehouse. By extracting data stored in tables in the Processed Lakehouse.

Afterwards, another notebook creates various metric tables as Delta tables in the compute Lakehouse. By extracting data from the Fact tables that were created by the previous notebook. Most of the tables are named as per the metrics that are required for CSRD reports.

Like in the below example based that creates metrics for Greenhouse Gas (GHG) emissions. Where I first created the Fact tables in the Files location with the ‘{Prefix}_CreateEmissionsFactTables_INTB’ notebook, and then the metrics tables in the Tables location with the ‘{Prefix}_GenerateEmissionsMetricTables_INTB’ notebook.

Created Facts and Metrics
Created Facts and Metrics

Sample CSRD report

After creating all the documented Facts and Metrics I then decided to test the sample CSRD Metrics report that comes with the capability. Which you can read about in more detail in the ‘Visualize data and analytics‘ documentation online.

As you can see the report worked as expected.

Sample CSRD report
Sample CSRD report

Of course, this is just a sample report which you can customize to suit your needs.

Creating empty tables

Creating the populated demo tables is all well and good, but what if you want to create empty tables? So that you can map source data that is in the raw Lakehouse to delta tables that exist in the processed Lakehouse.

Well, this capability comes with a GenerateESGTables notebook. Which allows you to create a group of empty ESG model tables based on one of the below business areas.

  • GHG Emissions
  • Waste Sustainability
  • Water Sustainability
  • Environmental Social and Govenance
  • Circularity
  • Business Management
  • Party

To identify the schema of a certain table or to find which business area it belongs to you can view the ‘ESGschema.json’ file that exists in the ‘{Prefix}_ConfigandDemo’ Lakehouse. Some tables can belong to multiple business areas.

For example, within the demo tables the information about learning events are not created. However, they are required for certain reporting purposes. In order to find the table and which business areas it belongs in you must find it in the ESGSchema.json file.

Finding business area in ESGSchema json file
Finding business area in ESGSchema json file

Because the LearningEvent table is in multiple business areas I chose one ‘Party’ as a parameter in the GenerateESGTables. As you can see in the below example.

TARGET_LAKEHOUSE = "SDS_ESGDE_ESG_TableCreation_Test_ProcessedESGData_LH"
CONFIG_DATALAKE_PATH = "abfss://xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx@onelake.dfs.fabric.microsoft.com/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx"
SELECTED_BUSINESS_AREA = "Party"

Afterwards, I ran all the cells in the notebook which created all the tables that are classed as being in the Party business area for me.

In reality, there are other methods to help you identify tables in the ESG model. However, this is a method that you can work directly in Microsoft Fabric.

Final words about my initial testing of the new ESG data estate capability

I hope that me sharing the results of my initial testing of the new ESG data estate capability for sustainability reporting in Microsoft Fabric inspires some of you to look into this further.

Because measuring sustainability is important. Especially for companies that are affected by the Corporate Sustainability Reporting Directive (CSRD).

To learn more about it, I recommend going through the “Get started with Sustainability data solutions in Microsoft Fabric” learning path. Which is available in Microsoft Learn.

Of course, if you have any comments or queries about this post feel free to reach out to me.

Published inMicrosoft Fabric

5 Comments

  1. […] Including leveraging my experience with the sustainability data solutions in Fabric. Which I wrote some posts about. Including one where I share the results of my initial testing of the new ESG data estate capability for sustainability reporting in Microsoft Fabr…. […]

Leave a Reply

Your email address will not be published. Required fields are marked *