Category: Microsoft Fabric

Concurrent Evaluation with Microsoft Fabric Dataflows Gen2

Chris Webb runs multiple jobs at once:

Did you know that if your Fabric Dataflow Gen2 contains several queries then you can control how many of them are evaluated in parallel when your dataflow refreshes? In this series I’ll look at how you can do this and how it may result in better performance – at least in some cases.

Let’s start with the basics. I created a Dataflow Gen2 with ten queries which each returned a table of one row and one column after one minute. I used the #table function to generate the table without connecting to a data source, code from this post to add the delay and the trick in this post to make sure the delay was only applied when the dataflow refreshed. The output of each query was loaded to a Fabric Warehouse.

Click through for a demonstration.
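
As a rough back-of-the-envelope model of the setup described above (ten independent queries of one minute each), the sketch below estimates refresh time for different concurrency limits. The function name and the "waves" model are assumptions for illustration only: it ignores scheduling overhead, staging, and the write to the Warehouse, and it is not taken from Chris's post or from any Fabric API.

```python
import math


def estimated_refresh_seconds(num_queries: int,
                              seconds_per_query: float,
                              max_parallel: int) -> float:
    """Idealized wall-clock time for equally long, independent queries.

    Hypothetical model: queries run in "waves" of at most max_parallel,
    and each wave takes as long as a single query.
    """
    if num_queries < 0:
        raise ValueError("num_queries must not be negative")
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    waves = math.ceil(num_queries / max_parallel)
    return waves * seconds_per_query


if __name__ == "__main__":
    # Ten one-minute queries, as in the scenario described above.
    for limit in (1, 4, 10):
        seconds = estimated_refresh_seconds(10, 60, limit)
        print(f"max parallel = {limit:2d}: about {seconds:.0f} seconds")
```

Under these assumptions, going from one query at a time to all ten at once shrinks the estimate from ten minutes to one, which is the best case; real refreshes will likely land somewhere in between.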

Explaining the Fabric Ontology

James Serra takes us through a big word:

For years, most data conversations have started with tables. We ask where the data lives, what columns are available, how the joins work, and whether the data is in a warehouse, lakehouse, semantic model, or some other system. That makes sense, because tables are how most of us have worked with data for decades. But tables are not how the business thinks.

A business thinks in terms of customers, products, orders, shipments, assets, flights, runways, employees, policies, and actions. The problem is not usually a lack of data. The problem is a lack of shared meaning. Organizations often have the same business concept represented multiple ways across teams and systems, creating what I would call semantic drift. Sales may define a customer one way. Finance may define it another way. Operations may have yet another version in a different system with different keys, names, and assumptions. That is exactly where Fabric Ontology becomes important. It is designed to close the gap between physical data structures and business meaning.

Microsoft is a bit late to the ontology game, and its current concept of an ontology shows it. I can understand where they’re going, but they still have a ways to go.

CLUSTER BY in Microsoft Fabric Data Warehouse

Nikola Ilic shows off a relatively new feature:

The first thing every Fabric architect reaches for in this situation is the usual playlist: check the query plan, look at the joins, validate the statistics, maybe scale up the capacity. All worth doing, but none of those things addressed what was actually happening: the warehouse was scanning the entire table for every filtered query, because there was no way to tell it which Parquet files actually contained the rows we cared about.

However, Microsoft shipped data clustering in preview at the end of November 2025, and the entire conversation changed.

In this article, I want to walk you through what data clustering is, how it works under the hood, and most importantly, I’ll show you a real demo on a 100-million-row clickstream table that you can run in your own warehouse. No abstractions, no marketing numbers, but actual T-SQL you can paste.

Some of the notes Nikola mentions remind me of the rules around making columnstore indexes work, and for much the same reason. But as Nikola’s demo shows, this is definitely a “You must be this tall to ride the ride” feature: unless you’re talking about quite large fact tables with (at a minimum) billions of rows of data, the benefit mostly comes from reducing CU consumption rather than from improving wall-clock time.
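
To make the file-skipping idea concrete, here is a toy model of why clustering helps a filtered query. It assumes the engine can skip a file when the file's minimum and maximum values for the filter column fall outside the predicate range; the helper names and the four-file layout are hypothetical, and this is not a description of how Fabric Data Warehouse is actually implemented.

```python
from typing import List, Tuple


def file_ranges(files: List[List[int]]) -> List[Tuple[int, int]]:
    """Return the (min, max) of the filter column for each file."""
    return [(min(rows), max(rows)) for rows in files]


def files_to_scan(files: List[List[int]], low: int, high: int) -> int:
    """Count files whose [min, max] range overlaps the filter [low, high]."""
    return sum(1 for lo, hi in file_ranges(files) if lo <= high and hi >= low)


values = list(range(100))  # stand-in for the filter column
num_files = 4

# Unclustered: rows land in files in arrival order, so every file ends up
# holding values from across the whole range.
unclustered = [values[i::num_files] for i in range(num_files)]

# Clustered: rows are grouped by value, so each file covers a narrow range.
chunk = len(values) // num_files
clustered = [values[i * chunk:(i + 1) * chunk] for i in range(num_files)]

if __name__ == "__main__":
    print("unclustered files scanned:", files_to_scan(unclustered, 10, 19))
    print("clustered files scanned:  ", files_to_scan(clustered, 10, 19))
```

In this toy layout the unclustered table has to read all four files for the filter, while the clustered one reads a single file. That is less data touched, which fits the observation that the savings show up mainly in compute consumed.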

Exceeding the Capacity Limit for Power BI Dataset Refreshes

Chris Webb explains an error:

If you have a lot of Power BI semantic models that are scheduled to refresh at the same time in the Service then you may find that some of them fail with the following error:

You’ve exceeded the capacity limit for dataset refreshes. Try again when fewer datasets are being processed.

[Note: “dataset” is the old name for a Power BI semantic model. Someone should update the error message.]

Read on to see what can cause this error and what you can do about it.

Metadata-Driven Frameworks for Change Detection in Microsoft Fabric

Kevin Chant builds a table:

I had various options for this month’s contribution due to my experience with various change detection solutions. Including Azure Synapse Link for SQL Server 2022. Which I covered in previous posts. Including one that covered some excessive file tests for Azure Synapse Link for SQL Server 2022.

In the end I decided to cover developing metadata-driven frameworks for Microsoft Fabric. Due to the fact that it is such a hot topic for multiple reasons. One of which is the growing availability of open-source, metadata-driven frameworks for Microsoft Fabric.

Read on for three such frameworks and some advice on how to use them.

Using Fabric Data Wrangler for Testing

Kristina Mishra checks out some data:

Data Wrangler has been available for a while now, but I’ll be honest, it’s not something we’ve been actively using. We’ve been heads down on time-sensitive projects for over a year and needless to say, our cup runneth over. Recently we’ve had a bit of respite and I decided to see how we could use Data Wrangler within the context of our current Microsoft Fabric data warehouse (i.e. medallion layer lakehouses).

Data Wrangler has a lot of cool features that will give you code snippets for what you want to do, but I wanted to use it a different way. I wanted to have an easy way to do a quick check for dimension tables. I also wanted an easy-peasy way for others, some of whom are not developers, to be able to do a quick sanity check of the data.

Click through to see how it works.

Microsoft Fabric Eventhouse Caching and Retention

Nikola Ilic notes the ephemeral nature of life:

You spin up your first Eventhouse, ingest some IoT data, fire up a KQL query, and it runs fast. When I say fast, I mean embarrassingly fast. A few weeks later, you query data from a couple of months ago, and… it’s still fast, but maybe a tiny bit slower. A year later, the same query starts to feel sluggish. Two years later, you can’t find some of the data at all.

Welcome to the world of tiered storage in Real-Time Intelligence!

And when Nikola mentions how fast data in hot storage is, that’s no exaggeration. It is, to my knowledge, the fastest way of retrieving data in Microsoft Fabric.
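
The progression in that excerpt (fast, then slower, then gone) can be pictured with a small sketch of how a caching window and a retention window might classify data by age. The 30-day and 365-day values below are purely hypothetical, as is the function itself; they are not Eventhouse defaults, and the real policies are configured on the database or table rather than in Python.

```python
from datetime import timedelta

# Hypothetical policy values, chosen only for illustration.
HOT_CACHE = timedelta(days=30)
RETENTION = timedelta(days=365)


def storage_tier(age: timedelta,
                 hot_cache: timedelta = HOT_CACHE,
                 retention: timedelta = RETENTION) -> str:
    """Classify data of a given age under a simple two-policy model."""
    if age > retention:
        return "deleted"  # past the retention window, no longer queryable
    if age <= hot_cache:
        return "hot"      # inside the caching window, fastest to query
    return "cold"         # retained but outside the cache, slower to query


if __name__ == "__main__":
    for days in (7, 60, 730):
        print(f"{days:4d} days old -> {storage_tier(timedelta(days=days))}")
```

With these assumed windows, week-old data is served from the hot cache, two-month-old data comes from cold storage, and two-year-old data has aged out entirely.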

Microsoft Fabric April 2026 Feature Summary

kamurray has a big list of updates:

This month’s update brings a broad set of new capabilities across Microsoft Fabric, spanning the platform experience, Data Engineering, Data Science, Data Warehouse, and Real-Time Intelligence. Read on to learn about improvements to the Fabric experience, deeper VS Code integration, enhanced notebook resiliency, expanded machine learning and governance features, and new real-time data processing capabilities.

Click through to see what’s new.

Choosing between Power Apps and Translytical Task Flows

Nicky van Vroenhoven gives the standard consulting answer:

I think I have gotten this question at least five or six times in the last few months, and with Translytical Task Flows reaching GA in the March 2026 Power BI update, I expect it to come up even more. So let me write it down once and for all.

The question usually sounds something like: “We want users to be able to add comments or update values in their Power BI report. Should we use Power Apps or this new Translytical Task Flows thing?”

My honest answer is: it depends 😆, but the decision is simpler than you might think.

Click through for the decision criteria.

Retrieving Materialized Lake View Lineage and Refresh Times

Meagan Longoria wants information:

Materialized lake views (MLVs) in Microsoft Fabric are an effective way to implement medallion architecture declaratively, but once you have a pipeline of MLVs in production, you need visibility into whether they’re current. Fabric’s MLV management area gives you a visual lineage and refresh history, but if you want to build automated alerting, logging, or custom tooling, you need to get that information programmatically. This post walks through one way to do that, using a small demo lakehouse built entirely in a Fabric notebook.

Click through for that demonstration.
