Press "Enter" to skip to content

Category: Microsoft Fabric

Fabric Benchmarking: Moving CSV Files

Eugene Meidinger breaks out the abacus:

First, a disclaimer: I am not a data engineer, and I have never worked with Fabric in a professional capacity. With the announcement of Fabric SQL DBs, there’s been some discussion on whether they are better for Power BI import than Lakehouses. I was hoping to do some tests, but along the way I ended up on an extensive Yak Shaving expedition.

I have likely done some of these tests inefficiently. I have posted as much detail and source code as I can and if there is a better way for any of these, I’m happy to redo the tests and update the results.

Part one focuses on loading CSV files to the files portion of a lakehouse. Future benchmarks look at CSV to delta and PBI imports.

I think Eugene did a fine job documenting everything in the process, and it was interesting to see relative price differences between different techniques for uploading a very large CSV file.

Comments closed

Sending Alerts from Fabric Workspace Monitoring

Chris Webb has a new Bat-signal:

I’ve always been a big fan of using Log Analytics to analyse Power BI engine activity (I’ve blogged about it many times) and so, naturally, I was very happy when the public preview of Fabric Workspace Monitoring was announced – it gives you everything you get from Log Analytics and more, all from the comfort of your own Fabric workspace. Apart from my blog there are lots of example KQL queries out there that you can use with Log Analytics and Workspace Monitoring, for example in this repo or Sandeep Pawar’s recent post. However what is new with Workspace Monitoring is that if you store these queries in a KQL Queryset you can create alerts in Activator, so when something important happens you can be notified of it.

Read on to learn more.

Comments closed

The Showdown: Spark vs DuckDB vs Polars in Microsoft Fabric

Miles Cole puts together a benchmark:

There’s been a lot of excitement lately about single-machine compute engines like DuckDB and Polars. With the recent release of pure Python Notebooks in Microsoft Fabric, the excitement about these lightweight native engines has risen to a new high. Out with Spark and in with the new and cool animal-themed engines— is it time to finally migrate your small and medium workloads off of Spark?

Before writing this blog post, honestly, I couldn’t have answered with anything besides a gut feeling largely based on having a confirmation bias towards Spark. With recent folks in the community posting their own benchmarks highlighting the power of these lightweight engines, I felt it was finally time to pull up my sleeves and explore whether or not I should abandon everything I know and become a DuckDB and/or Polars convert.

Read on for the method and results from several thoughtful tests.

Comments closed

Ways to Land Data into Microsoft Fabric OneLake

James Serra puts on a cape and takes on an iconic laugh:

Microsoft Fabric is rapidly gaining popularity as a unified data platform, leveraging OneLake as its central data storage hub for all Fabric-integrated products. A variety of tools and methods are available for copying data into OneLake, catering to diverse data ingestion needs. Below is an overview of what I believe are the key options:

Read on for a baker’s dozen methods.

Comments closed

Fabric Studio 1.0

Gerhard Brueckl makes an announcement:

I am very proud to announce the first public release of Fabric Studio v1.0 – a VSCode extension that allows you to manage and develop your Fabric workspace(s). Similar to Power BI Studio, it seamlessly integrates into VSCode for increased productivity for professional developers and admins alike.

Click through for some of the functionality available in Fabric Studio. You can download the extension from the VS Code marketplace and Gerhard includes a link to the GitHub repo in the blog post.

Comments closed

Data Transformation with Dataflows Gen2

Boniface Muchendu provides an overview of Dataflows Gen2 in Microsoft Fabric:

Welcome to a journey into the world of data automation! Imagine working in an organization bustling with data scientists and analysts. In such an environment, you often need to gather and combine data from various sources for further analysis. You could do this manually, but why not leverage automation? In this blog, we’ll explore how to apply automation on data transformations using Dataflows Gen2 in Microsoft Fabric.

Admitting that I am not the primary audience for Dataflows Gen2, I’d still much rather write a Spark notebook and call it a day.

Comments closed

Geospatial Data Exploration in Microsoft Fabric

Sandeep Pawar goes on a journey:

Simon Willison is one of my favorite bloggers. In fact, what I blog, how I blog & test, is inspired by him. He wrote a blog a couple of weeks ago about FourSquare Places data that has been open-sourced. I was exploring this dataset and ended up creating a few maps. I love OrgApps in Fabric and I truly believe as it matures, it will be THE way for analysts & data scientists to provide rich insights + traditional reports to business users. Notebooks can augment the Power BI reports to provide insights that are otherwise not possible. I have submitted a session on this topic to FabCon ‘25, let’s see. If it is selected, I hope to show how transformational it is and how businesses can use it.

Click through for a video and the notebook that Sandeep demonstrated.

Comments closed

Refreshing a Power BI Semantic Model via Eventstreams

Chris Webb builds a Rube Goldberg device:

Following on from my last post where I showed how to send data from Power Automate to a Fabric Eventstream, in this post I’m going to show how to use it to solve one of my favourite problems: refreshing a Power BI semantic model that uses an Excel workbook stored in OneDrive as a source when that Excel workbook is modified.

Now before I go on I want to be clear that I know this is a ridiculously over-engineered and expensive (in terms of CUs) solution, and that you can do almost the same thing just using Power Automate or in several other different ways – see my colleague Mark Pryce-Maher’s recent videos on using Fabric Open Mirroring with Excel for example. I’m doing this to teach myself Fabric Eventstreams and Activator and see what’s possible with them. Please excuse any mistakes or bad practices.

Click through for the process.

Comments closed

Mounding ADF Instances in Microsoft Fabric

Koen Verbeeck has an existing Azure Data Factory:

We recently started using Microsoft Fabric for our cloud data platform. However, we already have quite an estate of Azure data services running in our company, including a huge number of Azure Data Factory (ADF) pipelines. It seems cumbersome to migrate all those pipelines to Microsoft Fabric, especially because some features are not supported yet and ADF is the mature choice at the moment. We like the concept of Microsoft Fabric’s centralization, where everything is managed in one platform. Is there an option to manage ADF in Fabric?

Read on for the answer, but make sure to check out its limitations as well.

Comments closed