In this post I want to cover a Data Platform area I recently recommended to a colleague, since the answer surprised me.
A while back a colleague who is studying for Azure Data Platform certifications mentioned that there were a lot of areas they could focus on after their exams. They then asked me if there was a specific area they should focus on first.
My advice was to look at the services that can query data directly from files, for example data stored in CSV or Parquet files. I suggested this because I have noticed an increase in demand for it, for various reasons.
For example, it can help reduce costs. It can also help remove concerns about vendor lock-in, because the files can be moved elsewhere and read by another service if needed.
Plus, if you use the Data Lakehouse paradigm that I introduced in a previous post, it potentially removes the need for an additional service to act as a Data Warehouse solution.
Instead, you can query the data directly with a service that supports querying data stored in files, for example Azure Databricks or Azure Synapse Analytics, as shown in the diagram below.
Both of the above services can query data stored in files directly using Spark. In addition, you can use serverless SQL pools in Azure Synapse Analytics to query data stored in files using T-SQL syntax.
In fact, I showed how to do all of this with the above services during my session at Data Saturday Croatia this year, where I showed both services reading the same files.
In addition, there are other offerings you can do this with that are gaining popularity, such as DuckDB.
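To give a feel for what querying a file directly looks like, here is a minimal sketch in Python. It is only an illustration and not the setup from my session: the sample data, the file name and the function names are all made up for this example. It reads the same CSV file in two ways, once with Python's built-in csv module and once with SQL through DuckDB (assuming the optional duckdb package is installed), to show that the file itself is not tied to one engine.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical sample data, used purely for illustration.
SAMPLE_ROWS = [
    ("region", "amount"),
    ("north", 120),
    ("south", 80),
    ("north", 30),
]


def write_sample_csv(path):
    """Write the sample rows to a CSV file at the given path."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(SAMPLE_ROWS)


def totals_with_stdlib(path):
    """Sum the amounts per region by reading the file with the csv module."""
    totals = defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["region"]] += int(row["amount"])
    return dict(totals)


def totals_with_duckdb(path):
    """Sum the amounts per region with SQL run directly against the file.

    Returns None when the optional duckdb package is not installed.
    """
    try:
        import duckdb
    except ImportError:
        return None
    con = duckdb.connect()  # in-memory database, the CSV is read in place
    # The path is generated by this script, so it is safe to inline here.
    query = (
        "SELECT region, SUM(amount) AS total "
        f"FROM read_csv_auto('{Path(path).as_posix()}') "
        "GROUP BY region ORDER BY region"
    )
    return {region: int(total) for region, total in con.execute(query).fetchall()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        csv_path = Path(folder) / "sales.csv"
        write_sample_csv(csv_path)
        print("csv module:", totals_with_stdlib(csv_path))
        print("DuckDB:", totals_with_duckdb(csv_path))
```

Both functions should return the same totals from the same file. No data is loaded into a separate database first, which is the main appeal of this approach.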
Other Data Platform areas
One thing I really want to stress in this post is that there is still a need for the other Data Platform offerings. For example, SQL Server and its cloud-based variants, which have a very wide user base.
My main reason for covering my recommendation in this post is that I have noticed a rise in demand for it, especially in the area of BI and analytics.
Now, there are some sessions relating to this recommendation at this week's Microsoft Ignite, as you can see below.
- Microsoft and Databricks Partnership
- Unlocking the power of your data estate with Azure Databricks Lakehouse
- Self-Service Model to increase Enterprise Analytics Adoption using Azure Synapse
- Build an open standard data lakehouse using Azure Synapse Analytics and Azure Databricks
Of course, there will also be other great sessions presented at Microsoft Ignite this year, such as the session on how to innovate faster and achieve greater agility with the Microsoft Intelligent Data Platform.
With this in mind, I do recommend looking at the schedule if you have not already.
Final words about this Data Platform area
I am very interested in others' thoughts about my recommendation, because I appreciate that there will be mixed reactions to it, which is understandable.
With this in mind, if you have any questions or comments about this post, feel free to leave a comment.