Press "Enter" to skip to content

Category: Microsoft Fabric

Stop and Start Fabric via Power Automate

Gilbert Quevauvilliers saves some money:

Stop and start your Fabric Capacity using Power Automate

With the Fabric Capacities trial coming to an end, you need to make sure to stop and start your Fabric Capacities.

In my blog post below, I am going to show you how I can start or stop my Fabric Capacity by simply sending an email to myself with the details in the Subject Line to start or stop the capacity.

That’s a pretty neat method, especially if you want to run the capacity during odd hours.
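Under the covers, a flow like this typically ends up calling the Azure Resource Manager endpoints that suspend and resume the capacity. Here’s a minimal sketch of those calls in Python; the subscription, resource group, capacity name, and token are placeholders, and the api-version is worth verifying against the current documentation:

import requests

# Placeholders: substitute your own subscription, resource group,
# capacity name, and an Azure AD bearer token with rights on the capacity.
SUBSCRIPTION = "00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP = "my-resource-group"
CAPACITY = "myfabriccapacity"
TOKEN = "<Azure AD bearer token>"

def set_capacity_state(action):
    # action is "suspend" (stop) or "resume" (start)
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
        f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.Fabric"
        f"/capacities/{CAPACITY}/{action}?api-version=2023-11-01"
    )
    requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"}).raise_for_status()

set_capacity_state("suspend")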

Comments closed

System Views and Distributed Processing in Microsoft Fabric

Koen Verbeeck runs into an annoying error:

I have a metadata-driven ELT framework that heavily relies on dynamic SQL to generate SQL statements that load data from views into a respective fact or dimension. Such a task is well suited for generation, since the pattern to load a type 1 SCD, type 2 SCD or a fact table is always the same.

To read the metadata of the views, I use a couple of system views, such as sys.views and sys.sql_modules. At some point, I join this metadata (containing info about the various columns and their data types) against metadata of my own (for example, what is the business key of this dimension). This all works fine in Azure SQL DB or SQL Server, but in my Fabric warehouse I was greeted with the following error:

The query references an object that is not supported in distributed processing mode.

Read on to learn more about why you get this error and one workaround for it.

Comments closed

Getting the Top N Results in a PySpark Notebook

Gilbert Quevauvilliers only needs the top 1:

How to get the TopN rows using Python in Fabric Notebooks

When working with data there are sometimes weird and wonderful requirements which must be met in order to get to the desired solution.

In today’s blog post I had a situation where I wanted to get a single row with the highest duration.

Gilbert uses the Spark SQL approach via its Python function variant, the DataFrame API. You could also write the query in Spark SQL itself using the LIMIT operator.
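As a point of reference, here is a minimal sketch of both routes, assuming a dataframe df with a Duration column:

from pyspark.sql import functions as F

# DataFrame API: sort descending and keep a single row
top_row = df.orderBy(F.col("Duration").desc()).limit(1)
display(top_row)

# Spark SQL equivalent using the LIMIT operator
df.createOrReplaceTempView("durations")
display(spark.sql("SELECT * FROM durations ORDER BY Duration DESC LIMIT 1"))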

Comments closed

Comparing Microsoft Fabric Warehouse and Lakehouse Performance

Reitse Eskens busts out the stopwatch:

I just can’t seem to stop doing this, checking the limits of Microsoft Fabric. In this instalment I’ll try and find some limits on the data warehouse experience and compare them with the Lakehouse experience. The data warehouse is a bit different compared to the Lakehouse, so I’ll be digging into that one first. Then I’m going to load data into the warehouse with a copy data pipeline followed by some big queries to test performance. The Fabric Capacity App will be used to check out the capacity necessary (or used for that matter).

As usual, I’m using the F2 capacity as it’s the one that should break the easiest. It’s also the cheapest one to run tests against and, as the capacity calculation isn’t dependent on the SKU (Stock Keeping Unit), you can easily translate to find out which capacity SKU will fit the workload. Remember that your workload will differ from the one shown in this blog. These tests are a comparison between the different offerings, something you could do for yourself. These blogs are a bit of a happy place as every option will get a good chance. In your work, your skills (and those of your co-workers) will be a major driver towards an option. Even if this offers the chance to learn something new!

Reitse focuses on ingesting and transforming data, and the results were quite interesting.

Comments closed

Reviewing the Microsoft Fabric Roadmap

Paul Turley has a report:

I created this report to summarize the release status for all Fabric features that are documented in the Microsoft Fabric Roadmap. The information on this report is collected and updated frequently from the Fabric Roadmap hosted by Microsoft, and is displayed in a convenient, single-page Power BI report. Each feature area on the report has links to the detail documentation on the official roadmap site. For convenience, I’ve shortened the report path to TinyURL.com/FabricRoadmapReport.

Check out that link to see the report.

Comments closed

Dynamic Historical Partition Refresh in Power BI

Marc Lelijveld digs into partition refreshing:

I’ve heard the question pretty often from customers: “You told me to use incremental refresh, but how can I regularly run a full load or refresh older partitions?” Well, there are perfect ways to do this using Tabular Editor or SQL Server Management Studio. But this often includes manual work to trigger the processing.

Today, this question was asked of me again. I thought there should be a smarter way to do this. Since I recently explored more of the wonderful world of Fabric Notebooks and Python, I decided to dive a bit deeper into this world and see if it is possible to script something like this using Semantic Link. And obviously, the answer is “Yes!”

Read on to learn how to do it with a bit of Python and Microsoft Fabric’s Semantic Link library (sempy).
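As a taste of what that looks like, here is a minimal sketch using sempy’s refresh_dataset function; the dataset, workspace, table, and partition names are placeholders, and Marc’s script does quite a bit more than this:

import sempy.fabric as fabric

# Kick off a full refresh of one specific partition (names are made up)
fabric.refresh_dataset(
    dataset="SalesModel",
    workspace="MyWorkspace",
    refresh_type="full",
    objects=[{"table": "FactSales", "partition": "FactSales-2021"}],
)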

Comments closed

Looping through Data in Microsoft Fabric PySpark Notebooks

Gilbert Quevauvilliers builds a loop:

Continuing with my existing blog series on what I’m learning with notebooks and PySpark.

Today, I’m going to explain to you how I found a way to loop through data in a notebook.

In this example, I’m going to show you how I loop through a range of dates, which can then be used in a subsequent query to extract data by passing through each date into a DAX query.

Click through for Gilbert’s example. Here’s an alternative using something called a list comprehension. First, build a function that does what you want to do—that’d be the innards of Gilbert’s Python code, lines 31-54.

def perform_dax_query(row):
    var_Date = row["Date"]
    ...
    display(df_DAX_QueryResult)

Then, call that function for each row:

[perform_dax_query(row) for row in data_collect]

In this particular scenario, I’d personally stick with Gilbert’s composition, but in cases where you’re transforming a list of elements into a new list—for example, if you’re performing some data cleanup for each row in a list and you want the output to be a new list with cleaned-up data—then the list comprehension works really well.
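A quick, contrived example of that transformation case:

# Build a new, cleaned-up list from an existing one
raw_names = [" alice ", "BOB", "  carol"]
clean_names = [name.strip().title() for name in raw_names]
# clean_names == ['Alice', 'Bob', 'Carol']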

Comments closed

An Introduction to the Native Execution Engine in Microsoft Fabric

Sandeep Pawar gives us the gentle version:

At MS Build, Microsoft announced the availability of the Native Execution Engine in Fabric. The product team will have detailed documentation and technical details, but I will attempt to provide an ELI5 version of what it means and how it works at a 30,000 ft level.

Read on to learn what the Native Execution Engine is and why Microsoft would use it versus running everything in the Java Virtual Machine like other services in the broader Hadoop ecosystem have historically done (at least at the beginning, before they built their own native execution engines!).
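If you want to kick the tires, the documentation describes a Spark property that switches the engine on for a session. A sketch, run as the first cell of a notebook (treat the property name as something to verify against the current docs):

%%configure -f
{
    "conf": {
        "spark.native.enabled": "true"
    }
}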

Comments closed

Building a Power Query Template

Nikola Ilic looks at Power Query templates:

In this “ocean” of innovations, there are certain features that don’t get the deserved limelight – as they somehow go under the radar. Some of them, I really consider “hidden gems” – you might not use them in each and every solution, but in some scenarios, they can be of immense help.

A hidden gem that I’m introducing today is called Power Query Template. As of today, this feature is still in preview (the same as many others in Microsoft Fabric), but this doesn’t minimize its potential.

Read on to see why Nikola likes this capability so much.

Comments closed

Time Travel in the Microsoft Fabric Warehouse

Reza Rad hops in the Delorean:

Data changes throughout time, especially in the world of BI and data warehousing systems; the data gets updated through ETL processes frequently. This means that the data you see in the warehouse today might differ from yesterday and the day before, and so on. Some parts of this data can be retrieved on a timely basis. You can, for example, query the sales amount from the sales table where the date has been the 2nd of April. That would give you the sales amount for the 2nd of April, even if you are querying it on the 23rd of May.

However, what if some of the sales transactions on the 2nd of April got updated? The sales amount you see would likely be the updated amount, but not the original amount. It is sometimes useful to be able to see what that original amount was, or in other words, to travel in time and see what that value was.

Click through for a combination video and article. The syntax isn’t quite the same as with temporal tables in SQL Server, though it’s close enough to follow along if that’s your relevant experience.
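For a flavor of the difference, here is a sketch that runs a time travel query from a Python session over the warehouse’s SQL endpoint; the connection details and table name are placeholders:

import pyodbc

# Placeholders: use your workspace's SQL endpoint and warehouse name
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=MyWarehouse;"
    "Authentication=ActiveDirectoryInteractive;"
)

# Fabric warehouse time travel is a query-level hint, unlike SQL Server's
# per-table FOR SYSTEM_TIME AS OF clause on temporal tables
sql = """
SELECT SUM(SalesAmount) AS SalesAmount
FROM dbo.FactSales
OPTION (FOR TIMESTAMP AS OF '2024-04-02T00:00:00');
"""
print(conn.cursor().execute(sql).fetchval())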

Comments closed