Category: Microsoft Fabric

Building Metadata-Driven Pipelines in Microsoft Fabric

Koen Verbeeck lays out a process:

The goal of metadata driven code is that you build something only once. You need to extract from relational databases? You build one pipeline that can connect to a relational source, and you parameterize everything (server name, database name, source schema, source table, destination server name, destination table et cetera). Once this parameterized piece of code is ready, all you must do is enter metadata about the sources you want to extract. If at a later point an additional relational source needs to be extracted, you don’t need to create a brand-new pipeline. All you need to do is enter a new line of data in your metadata repository.

Aside from speeding up development (after you’ve made the initial effort of creating your metadata driven pipeline), another benefit is that everything is consistent. You always tackle a certain pattern in the same way. If there’s a bug, you need to fix it in one single location.

Read on to see how this works. The idea is certainly not new, as Koen mentions, but there are some specific factors that come into play for Microsoft Fabric pipelines.
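
To make the pattern concrete, here is a minimal sketch in plain Python rather than a Fabric pipeline. It is not Koen's implementation: the `SourceEntry` fields, the function names, and the sample server and table names are all hypothetical, chosen only to mirror the parameters listed in the quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceEntry:
    """One row of a hypothetical metadata repository."""
    server: str
    database: str
    source_schema: str
    source_table: str
    destination_table: str


def build_copy_task(entry: SourceEntry) -> dict:
    """Turn one metadata row into the parameters for a single generic copy step."""
    return {
        "source": f"{entry.server}/{entry.database}",
        "query": f"SELECT * FROM [{entry.source_schema}].[{entry.source_table}]",
        "destination": entry.destination_table,
    }


def plan_extracts(metadata: list[SourceEntry]) -> list[dict]:
    """The generic logic is written once; the metadata decides what it runs against."""
    return [build_copy_task(entry) for entry in metadata]


# Hypothetical metadata rows. Adding a source means adding a row, not new code.
metadata = [
    SourceEntry("sql01", "Sales", "dbo", "Orders", "bronze_orders"),
    SourceEntry("sql01", "Sales", "dbo", "Customers", "bronze_customers"),
]

for task in plan_extracts(metadata):
    print(task)
```

In a real pipeline the metadata would live in a table or file and each task would feed a parameterized copy activity, but the shape is the same: one loop, many rows.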

Number of Fabric Workspaces and the Medallion Architecture

Kevin Chant opens a can of worms:

I got asked about it this week during the Learn Together session I did alongside Shabnam Watson. Plus, it is a highly debated topic in our community, and I wanted to share my thoughts about it.

My personal opinion is that it depends. However, the number you choose depends on a variety of reasons, which I intend to cover in this post.

By the end of this post, you will know my personal opinions as to why. Plus, plenty of things to consider when deciding on the number of workspaces to implement.

Read on for Kevin’s thoughts. My quick opinion is, one workspace per layer. Just from a logistical standpoint, keeping the several layers separated in one workspace is an immense challenge and typically requires exposing data engineering details (like what “gold”/“silver” or “curated”/“refined” actually means) to end users.

Getting Row Counts of All Tables in a Microsoft Fabric Warehouse

Koen Verbeeck busts out the tally counter:

It says the data is 352MB in size, but after loading the data I was curious about how many rows were actually in that sample data set. Unfortunately, it’s not as straightforward as with a “normal” SQL Server database to get the row counts. First of all, when you connect with SSMS to the database there’s sadly no option to get the row counts report.

The post is a little depressing, really. But still worth the read.

Fabric Workload Items in the Scanner API

Gilbert Quevauvilliers checks out the latest changes to the Scanner API:

All Fabric Workload Items are now available from the Scanner API

I was working with the customer and was looking for some information in the Scanner API.

For a change I went into Power Query and expanded the workspaces item.

Read on for more information. And if you’d like to learn more about the Scanner API, here’s a sample application showing how to use it.

Visualizing a Spark Execution Plan

Gerhard Brueckl builds a very helpful tool:

I recently found myself in a situation where I had to optimize a Spark query. Coming from a SQL world originally, I knew how valuable a visual representation of an execution plan can be when it comes to performance tuning. Soon I realized that there is no easy-to-use tool or snippet which would allow me to do that. There are tools like DataFlint, the ubiquitous Spark monitoring UI, or the Spark explain() function, but they are either hard to use or hard to get up and running, especially as I was looking for something that works in both of my favorite Spark engines, Databricks and Microsoft Fabric.

Read on for Gerhard’s answer, including an example of it in action.

Understanding the Delta Lake Format

Reza Rad has a new post and video combo:

Please don’t get lost in the terminology pit regarding analytics. You have probably heard of Lake Structure, Data Lake, Lakehouse, Delta Tables, and Delta Lake. They all sound the same! Of course, I am not here to talk about all of them; I am here to explain what Delta Lake is.

Delta Lake is an open-source standard for Apache Spark workloads (and a few others). It is not specific to Microsoft; other vendors are using it, too. This open-source standard format stores table data in a way that can be beneficial for many purposes.

In other words, when you create a table in a Lakehouse in Fabric, the underlying structure of files and folders for that table is stored in a structure (or we can call it format) called Delta Lake.

Read on to learn more about this open standard and how it all fits together with Microsoft Fabric.
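
To make "a structure of files and folders" a little more tangible: a Delta table directory holds Parquet data files alongside a `_delta_log` folder of JSON commit files named with a 20-digit, zero-padded version number. The sketch below works purely on a list of paths and is not taken from Reza's post; the function names and the sample paths are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Commit files in _delta_log are named like 00000000000000000001.json
COMMIT_PATTERN = re.compile(r"^(\d{20})\.json$")


def latest_commit_version(paths: list[str]) -> Optional[int]:
    """Return the highest commit version found under _delta_log, or None."""
    versions = []
    for raw in paths:
        path = PurePosixPath(raw)
        if path.parent.name != "_delta_log":
            continue
        match = COMMIT_PATTERN.match(path.name)
        if match:
            versions.append(int(match.group(1)))
    return max(versions) if versions else None


def parquet_files_on_disk(paths: list[str]) -> list[str]:
    """List Parquet files outside _delta_log.

    Note: not every file on disk is necessarily part of the latest table
    version; only the transaction log says which ones are active.
    """
    return [
        raw for raw in paths
        if raw.endswith(".parquet") and "_delta_log" not in PurePosixPath(raw).parts
    ]


# Hypothetical listing of a Lakehouse table folder.
listing = [
    "Tables/sales/_delta_log/00000000000000000000.json",
    "Tables/sales/_delta_log/00000000000000000001.json",
    "Tables/sales/part-00000-aaaa.snappy.parquet",
    "Tables/sales/part-00001-bbbb.snappy.parquet",
]

print(latest_commit_version(listing))
print(parquet_files_on_disk(listing))
```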

Exporting and Sharing Power BI Reports in Fabric

Sandeep Pawar distributes PDFs like candy:

With the proposed solution below, you will be able to:

  • Export a Power BI report, or a page of a report or a specific visual from any page as a PDF, PNG, PPTX or other supported file formats
  • Apply report level filters before exporting
  • Automate the extracts on a schedule
  • Save the exported reports to specific folders
  • Grant access to individual folders in the Lakehouse

Click through for the solution.

Searching for Tenant Settings in Microsoft Fabric

Nicky van Vroenhoven performs a search:

You probably also use the same method as I did to search through the Admin portal and tenant settings: CTRL + F from your browser. It does the trick, but not very well. 

For example, it only searches the titles of the settings, not the descriptions.

Next to that, you can also get a lot of matches that you have to scroll or loop through, which makes it not very clear because, more often than not, you don’t know in what section of the tenant settings you ended up.

Read on for an alternative method of searching. Or, I guess, two of them because without Nicky’s post, it can be easy to confuse the two search boxes.
