Press "Enter" to skip to content

Category: Microsoft Fabric

Dynamic Warehouse and Lakehouse Connections in Data Pipelines

Koen Verbeeck doesn’t want to hard-code the connection string:

When you develop data pipelines in Microsoft Fabric (the Azure Data Factory equivalent in Fabric, not to be confused with deployment pipelines), you will most likely have some activities with a connection to a warehouse, a lakehouse, or a KQL database (for the remainder of the blog post I’ll talk about a warehouse, but it can be any of those three data stores) – for example, in a Script, Lookup, or Copy activity. When you deploy your data pipeline to another workspace – using, you might’ve guessed it, deployment pipelines – the pipeline itself is copied to the other workspace. For example, we deploy a pipeline from the development workspace to the test workspace.

Read on to see what this means for warehouse connections and how you can work around the existing messiness.
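As a very rough sketch of the general idea (not Koen’s exact workaround), the fix usually boils down to resolving the connection from metadata instead of hard-coding it in each activity. Everything below, from the environment names to the GUID placeholders, is hypothetical:

```python
# Hypothetical sketch: resolve warehouse connection details per environment
# instead of hard-coding them in each pipeline activity. The names and GUID
# placeholders below are illustrative, not real workspace or warehouse IDs.
WAREHOUSE_CONNECTIONS = {
    "dev":  {"workspace_id": "<dev-workspace-guid>",  "warehouse_id": "<dev-warehouse-guid>"},
    "test": {"workspace_id": "<test-workspace-guid>", "warehouse_id": "<test-warehouse-guid>"},
    "prod": {"workspace_id": "<prod-workspace-guid>", "warehouse_id": "<prod-warehouse-guid>"},
}

def resolve_warehouse(environment: str) -> dict:
    """Return the workspace/warehouse pair for the given environment."""
    try:
        return WAREHOUSE_CONNECTIONS[environment]
    except KeyError:
        raise ValueError(f"No warehouse connection configured for '{environment}'")

# An environment name passed in as a parameter decides which IDs get fed
# into the activity's connection settings at runtime.
print(resolve_warehouse("test"))
```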


Build a Custom Semantic Model for Microsoft Fabric

Reza Rad offers up some advice:

The Lakehouse or Warehouse comes with a default Power BI semantic model, which can be used for reporting and analytics. However, you can also build and use a customized semantic model. There are significant differences when using the semantic model in real-world analytics projects. In this article, I’ll explain the difference between these two, which one is recommended, and why.

Click through for the video, as well as the article.


Microsoft Fabric: Lakehouse or Warehouse?

Koen Verbeeck helps us choose:

This doesn’t mean no code has to be written. On the contrary, in this article we’re going to focus on two services of Fabric: the lakehouse and the warehouse. The first one is part of the Data Engineering experience in Fabric, while the latter is part of the Data Warehousing experience. Both require code to be written to create any sort of artefact. In the warehouse we can use T-SQL to create tables, load data into them and do any kind of transformation. In the lakehouse, we use notebooks to work with data, typically in languages such as PySpark or Spark SQL.

Read on for the comparison. I tend to go for the lakehouse experience rather than the warehouse, but Koen provides a lot of the info you’d need in order to make the right decision for yourself.
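To make the code contrast concrete, here’s a minimal sketch of the same trivial load done both ways, with made-up table and column names: PySpark for the lakehouse, and the roughly equivalent warehouse T-SQL shown in the comments.

```python
# Lakehouse: a notebook cell using PySpark to create and load a Delta table.
# Table and column names are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [(1, "Contoso"), (2, "Fabrikam")],
    ["CustomerKey", "CustomerName"],
)
df.write.format("delta").mode("overwrite").saveAsTable("dim_customer")

# Warehouse: roughly the same load expressed in T-SQL instead of a notebook:
#   CREATE TABLE dbo.dim_customer (CustomerKey INT, CustomerName VARCHAR(100));
#   INSERT INTO dbo.dim_customer VALUES (1, 'Contoso'), (2, 'Fabrikam');
```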


Dataverse and Microsoft Fabric Gotchas

Marc Lelijveld shares some advice:

Recently, I architected a solution for a client’s Microsoft Fabric data platform. The client works with Dynamics Finance & Operations as one of their main ERP systems. Fabric offers easy ways to bring data from various standard Microsoft services into the platform; however, it is not always as easy as it looks. In this blog I will elaborate on the gotchas I encountered in architecting this solution.

Read on for the challenges that Marc ran into along the way.


Stop and Start Fabric via Power Automate

Gilbert Quevauvilliers saves some money:

Stop and start your Fabric Capacity using Power Automate

With the Fabric Capacity trials coming to an end, you need to make sure to stop and start your Fabric Capacities.

In my blog post below, I am going to show you how I can start or stop my Fabric Capacity by simply sending an email to myself with the details in the Subject Line to start or stop the capacity.

That’s a pretty neat method, especially if you want to run the capacity at odd hours.
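If you’d rather script the same thing, the suspend and resume actions can also be called against Azure Resource Manager. Here’s a rough Python sketch; the subscription, resource group, and capacity names are placeholders, and the api-version is an assumption you should verify against the current ARM documentation.

```python
# Rough sketch: suspend or resume a Fabric capacity via Azure Resource Manager.
# Assumes the Microsoft.Fabric/capacities provider; the api-version and all
# names below are placeholders to replace and verify.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
CAPACITY_NAME = "<capacity-name>"
API_VERSION = "2023-11-01"  # assumption: confirm the current api-version

def set_capacity_state(action: str) -> None:
    """action is 'suspend' (stop) or 'resume' (start)."""
    token = DefaultAzureCredential().get_token(
        "https://management.azure.com/.default"
    ).token
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION_ID}"
        f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.Fabric"
        f"/capacities/{CAPACITY_NAME}/{action}?api-version={API_VERSION}"
    )
    response = requests.post(url, headers={"Authorization": f"Bearer {token}"})
    response.raise_for_status()

set_capacity_state("suspend")  # or set_capacity_state("resume")
```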


System Views and Distributed Processing in Microsoft Fabric

Koen Verbeeck runs into an annoying error:

I have a metadata-driven ELT framework that heavily relies on dynamic SQL to generate SQL statements that load data from views into the respective fact or dimension table. Such a task is well suited for generation, since the pattern to load a type 1 SCD, type 2 SCD or a fact table is always the same.

To read the metadata of the views, I use a couple of system views, such as sys.views and sys.sql_modules. At some point, I join this metadata (containing info about the various columns and their data types) against metadata of my own (for example, what is the business key of this dimension). This all works fine in Azure SQL DB or SQL Server, but in my Fabric warehouse I was greeted with the following error:

The query references an object that is not supported in distributed processing mode.

Read on to learn more about why you get this error and one workaround for it.
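For context, the failing pattern looks roughly like the simplified query below; the etl.table_metadata table and business_key_column column are made up for illustration, and the workaround itself is in Koen’s post.

```python
# Simplified illustration of the pattern that triggers the error in a Fabric
# warehouse: joining system views against a regular user table in one query
# (which, per the post, is the point where things break).
# The etl.table_metadata table and business_key_column column are made up.
FAILING_QUERY = """
SELECT
    v.name AS view_name,
    m.definition,
    meta.business_key_column
FROM sys.views AS v
JOIN sys.sql_modules AS m
    ON m.object_id = v.object_id
JOIN etl.table_metadata AS meta
    ON meta.view_name = v.name;
"""

# Running this against the warehouse's SQL endpoint raises:
#   "The query references an object that is not supported in
#    distributed processing mode."
```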


Getting the Top N Results in a PySpark Notebook

Gilbert Quevauvilliers only needs the top 1:

How to get the TopN rows using Python in Fabric Notebooks

When working with data, there are sometimes weird and wonderful requirements which must be met in order to get to the desired solution.

In today’s blog post I had a situation where I wanted to get a single row with the highest duration.

Gilbert uses the Spark SQL functions from Python, i.e. the DataFrame API. You could also write a Spark SQL query using the LIMIT clause.
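For reference, here’s a minimal sketch of both routes, with made-up table and column names: the DataFrame API with orderBy and limit, and the Spark SQL equivalent with LIMIT.

```python
# Minimal sketch of getting the single row with the highest duration.
# The table ("runs") and column ("duration") names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# DataFrame API: sort descending and take the first row.
top_row = (
    spark.table("runs")
    .orderBy(F.col("duration").desc())
    .limit(1)
)
top_row.show()

# Spark SQL equivalent using LIMIT.
spark.sql("""
    SELECT *
    FROM runs
    ORDER BY duration DESC
    LIMIT 1
""").show()
```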


Comparing Microsoft Fabric Warehouse and Lakehouse Performance

Reitse Eskens busts out the stopwatch:

I just can’t seem to stop doing this, checking the limits of Microsoft Fabric. In this instalment I’ll try and find some limits on the data warehouse experience and compare them with the Lakehouse experience. The data warehouse is a bit different compared to the Lakehouse, so I’ll be digging into that one first. Then I’m going to load data into the warehouse with a copy data pipeline followed by some big queries to test performance. The Fabric Capacity App will be used to check out the capacity necessary (or used for that matter).

As usual, I’m using the F2 capacity as it’s the one that should break most easily. It’s also the cheapest one to run tests against and, as the capacity calculation isn’t dependent on the SKU (Stock Keeping Unit), you can easily translate the results to find out which capacity SKU will fit the workload. Remember that your workload will differ from the one shown in this blog. These tests are a comparison between the different offerings, something you could do for yourself. These blogs are a bit of a happy place, as every option gets a fair chance. In your work, your skills (and those of your co-workers) will be a major driver towards an option. Even if this offers the chance to learn something new!

Reitse focuses on ingesting and transforming data and the results were quite interesting.
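If you want to try a similar comparison yourself, the lakehouse side can be timed from a notebook with something as simple as the sketch below (the query and table name are placeholders); the warehouse side would be measured the same way through its SQL endpoint, and the capacity consumption comes from the capacity app Reitse mentions.

```python
# Rough timing harness for the lakehouse side of such a comparison.
# The table name and query are placeholders; capacity usage itself is best
# read from the Fabric capacity app, as the post does.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

query = """
    SELECT some_key, COUNT(*) AS row_count, SUM(amount) AS total_amount
    FROM big_table
    GROUP BY some_key
"""

start = time.perf_counter()
spark.sql(query).collect()  # force full execution rather than lazy evaluation
elapsed = time.perf_counter() - start
print(f"Lakehouse query took {elapsed:.1f} seconds")
```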


Reviewing the Microsoft Fabric Roadmap

Paul Turley has a report:

I created this report to summarize the release status for all Fabric features that are documented in the Microsoft Fabric Roadmap. The information in this report is collected and updated frequently from the Fabric Roadmap hosted by Microsoft and is displayed in a convenient, single-page Power BI report. Each feature area on the report links to the detailed documentation on the official roadmap site. For convenience, I’ve shortened the report path to TinyURL.com/FabricRoadmapReport.

Check out that link to see the report.
