Category: Microsoft Fabric

Plotting the ROC Curve in Microsoft Fabric

Tomaz Kastrun gets plotting:

The ROC (Receiver Operating Characteristic) curve is a graph that shows how a classifier performs by plotting the true positive rate against the false positive rate. It is used to evaluate binary classification models by illustrating the trade-off between the true positive rate (TPR) and the false positive rate (FPR) at various threshold settings.

Read on to see how you can generate one in a Microsoft Fabric notebook. Tomaz also plots a density function for additional fun.
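
As a rough illustration of the idea in the excerpt, here is a minimal, library-free Python sketch that computes ROC points by sweeping the threshold over classifier scores. It is not Tomaz's code; the function names and the sample labels and scores are made up for this example.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    # Visit examples from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example of a tied score group.
        if i == len(ranked) - 1 or ranked[i + 1][0] != score:
            fpr.append(fp / negatives)
            tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))


# Hypothetical labels and classifier scores, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, scores)
roc_auc = auc(fpr, tpr)
```

Plotting `tpr` against `fpr` gives the curve; in a notebook you would more likely lean on an existing library for both the computation and the chart.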

Mounting Azure Data Factory in Fabric Data Factory

Andy Leonard takes up a factory job:

Thanks to the hard work of the Microsoft Fabric Data Factory Team, it’s now possible to mount an Azure Data Factory in Fabric Data Factory. This post describes one way to mount an existing Azure Data Factory in Fabric Data Factory. In this post, we will:

  • Mount an existing Azure Data Factory in Fabric Data Factory
  • Open the Azure Data Factory in Fabric Data Factory
  • Test-execute two ADF pipelines
  • Modify and publish an ADF pipeline

Read on to see how it all works. One of the odd things about Microsoft Fabric—and its predecessor, Azure Synapse Analytics—is the penchant for similar-but-not-quite-the-same services. Yes, we have Data Factory…but it’s not quite the same. Yes, we have Azure Data Explorer (and KQL)…but it’s not quite the same. I get that there are reasons for this (such as not having a resource group with a dozen separate services hanging around), but I’m sure it’s a bit frustrating working on several separate code bases and trying to keep them all approximately in sync.

Programmatic Power BI Report Modification via semantic-link-labs

Kurt Buhler makes a change:

Whether building reports in Power BI Desktop or in the web browser via the Power BI service, you have limited options to batch or streamline changes. Put another way, it’s tedious and slow to make many small changes to one or more Power BI reports. It’s also easy to make mistakes.

When initially designing or building a report, this is not so much of a problem. Unless you’re using a template, you want to control report layout and formatting, yourself. However, certain changes can be little more than a waste of time. Some examples include:

  • Replacing fields when there’s a broken reference due to, e.g., renaming a model measure or column.
  • Swapping one measure or column for another in the report
  • Changing visual container styles, like background, border, and shadow/glow.
  • Changing text or text styles across multiple visuals, pages, or reports.
  • Changing chart formatting (like color) or properties (like edit interactions) across multiple visuals, pages, or reports.

Read on to see how you can make some of these changes in Python code using the semantic-link-labs library.

Managing Power BI Assets with semantic-link-labs

Kurt Buhler takes us through a Python library:

Thus far, the part of Microsoft Fabric that I’ve personally found the most interesting is not Copilot, Direct Lake, or its data warehousing capabilities, but a combination of notebooks and simple file/table storage via Lakehouses. Specifically, the library semantic link and its “expansion pack” semantic-link-labs, spearheaded by Michael Kovalsky. These tools help you build, manage, use, and audit the various items in Fabric from a Python notebook, including Power BI semantic models and reports.

Semantic-link-labs provides a lot of convenient functions that you can use to automate and streamline certain tasks during Power BI development, for both models and reports. For me, I’m particularly interested in the reporting functionalities, because this is where I typically find that I lose the most time, and because there is a drought of tools to address this area.

Read the whole thing.

Implementing a Star Schema in a Microsoft Fabric Lakehouse

Nikola Ilic builds a lakehouse:

But, what is a star schema in the first place? I have good and bad news for you:)…The bad news is that I’m not covering it in this article because this one focuses on explaining how to implement a star schema in Fabric Lakehouse (assuming that you already know what star schema is). The good news is: I’ve already written about it, so go and read this article first, if you’re not sure what star schema represents in the world of data modeling…

In one of the previous articles, I also showed how to implement a star schema in Power BI, by leveraging Power Query Editor.

Now, let’s get our hands dirty and build a star schema by using PySpark in the Fabric notebook!

Click through to see how.
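
Nikola's article does this in PySpark. As a library-free sketch of the underlying idea only, the following plain Python splits a flat set of rows into a dimension table with surrogate keys and a fact table that references them. The table and column names (`flat_sales`, `dim_product`, `fact_sales`, `product_key`) are hypothetical and not taken from the article.

```python
# Hypothetical flat sales rows; column names are illustrative only.
flat_sales = [
    {"order_id": 1, "product": "Bike", "category": "Outdoor", "amount": 500.0},
    {"order_id": 2, "product": "Tent", "category": "Outdoor", "amount": 120.0},
    {"order_id": 3, "product": "Bike", "category": "Outdoor", "amount": 450.0},
]


def build_star(rows):
    """Split flat rows into a product dimension and a sales fact table."""
    product_keys = {}  # natural key (product name) -> surrogate key
    dim_product = []
    fact_sales = []
    for row in rows:
        name = row["product"]
        if name not in product_keys:
            # First time we see this product: assign the next surrogate key.
            product_keys[name] = len(product_keys) + 1
            dim_product.append({
                "product_key": product_keys[name],
                "product": name,
                "category": row["category"],
            })
        # The fact row keeps only measures and keys, not descriptive attributes.
        fact_sales.append({
            "order_id": row["order_id"],
            "product_key": product_keys[name],
            "amount": row["amount"],
        })
    return dim_product, fact_sales


dim_product, fact_sales = build_star(flat_sales)
```

In a Lakehouse the same shape would typically be produced with DataFrame operations and written out as Delta tables, which is what the linked article walks through.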

Role-Playing Dimensions in Direct Lake

Chris Webb puts on a mustache and changes his shirt really quickly:

Note that the Sales fact table has two date columns, OrderDate and ShipDate.

If you create a DirectLake semantic model using the Web Editor and add these two tables you could rename the Date table to Order Date and build a relationship between it and the OrderDate column on the Sales table:

What about analysing by Ship Date though? You could create a physical copy of the Date table in your Lakehouse and add that to the model, but there’s another option.

Read on for that answer. Interesting that, as of right now, the primary way to do this is with third-party software.

Tracking Microsoft Fabric Notebook Progress

Gilbert Quevauvilliers asks are we there yet? are we there yet?

How to view or track the progress of Notebook while it is running in Microsoft Fabric

I was recently working with a Notebook in Microsoft Fabric that was started via a Data Pipeline.

The challenge I had was that I had no idea how far the notebook had gone (as there were quite a lot of cells in this particular notebook).

In this blog post I am going to show you how I can use Microsoft Fabric to identify exactly which cell my notebook is currently on.

Click through for the answer. And so help me, if you ask that question one more time, I’m turning this thing around and we’re going back home.

Data Ingestion with Microsoft Fabric Copy Jobs

Reitse Eskens spends a bunch of time at the copier:

The copy job is essentially an abstraction of a pipeline reading data from the source system and writing the data into either a Lakehouse or a Warehouse. It really is ingesting data and nothing else. In my opinion, that’s what copy data flows are meant to do, and they are very good at it too.

The big challenge we all keep facing is how to create incremental loads. We have to build some sort of metadata database where we keep the latest ID, date, or other column we use to discern the increment. In our flow, we need to get that value, compare it against the source system, and get the differences. The biggest task is to find out if records are deleted.

With the Copy Job, a large part of this task is taken out of your hands. The Copy Job has a configuration GUI (or wizard) that helps you out quite quickly. So let’s not waste any more characters and dig in!

Read on to see how it works and its capabilities and limitations. The key question, as always, is whether your workload fits into the wheelhouse. If so, this sounds really useful. If not, it’s a proper struggle.
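
To make the hand-built pattern Reitse describes concrete, here is a minimal Python sketch of a watermark-based incremental load with a separate check for deleted records. It is not how the Copy Job works internally; the function names, the `id` column, and the in-memory "tables" are all assumptions made for illustration.

```python
def incremental_load(source_rows, target, watermark):
    """Copy rows whose id is above the watermark; return the new watermark."""
    new_rows = [row for row in source_rows if row["id"] > watermark]
    for row in new_rows:
        target[row["id"]] = row
    # If nothing new arrived, the watermark stays where it was.
    return max((row["id"] for row in new_rows), default=watermark)


def find_deleted(source_rows, target):
    """Return ids present in the target but no longer in the source."""
    source_ids = {row["id"] for row in source_rows}
    return sorted(set(target) - source_ids)


# Hypothetical state: ids 1 and 2 were loaded earlier, so the stored watermark is 2.
target = {1: {"id": 1, "name": "a"}, 2: {"id": 2, "name": "b"}}
watermark = 2

# Hypothetical source: id 2 has since been deleted, ids 3 and 4 are new.
source = [{"id": 1, "name": "a"}, {"id": 3, "name": "c"}, {"id": 4, "name": "d"}]

watermark = incremental_load(source, target, watermark)
deleted = find_deleted(source, target)
```

Note that the watermark comparison alone never notices the deleted row; finding it takes a second pass over the keys on both sides, which is why the excerpt calls deletions the biggest task.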

The Importance of Semantic Link

Nikola Ilic excerpts from a forthcoming book:

Since Microsoft Fabric was publicly unveiled in May 2023, there has been an ocean of announcements around this new platform. In full honesty, plenty of those were just marketing or rebranding of the features and services that already existed before Fabric. Hence, in this ocean of announcements, some features went under the radar, with their true power still somehow hidden behind the glamour of those “noisy neighbors”.

Semantic Link is probably one of the best examples of these hidden Fabric gems. 

Click through to learn more about Semantic Link and check out Nikola and Ben Weissman’s book as well.

Documenting Microsoft Fabric Workspaces via Semantic Link Labs

Prathy Kamasani does a bit of documentation:

Documentation is a critical and tedious part of every project. However, it is essential to review existing developments or document new ones. When the Power BI API was initially released, I attempted to do similar things. I wanted to know how to use the API to obtain an inventory of a tenant – Power BI Template – Prathy’s Blog…. Now, I believe I am achieving the same goal but using my current favourite functionality, Fabric Notebooks.

In this blog post, I will discuss using Semantic Link and Semantic Link Labs to get an overview of specified workspaces and their contents via a Fabric Notebook. This is just one way of doing it; plenty of blogs discuss various things you could do with Semantic Link. Also, I want to use this to document what I have learned. I like how I can generate a Lakehouse and automatically create Delta Tables as needed.

Click through to learn more about how this works.
