Press "Enter" to skip to content

Category: Microsoft Fabric

Trying out Data Wrangler

Ginger Grant tries out a feature in Microsoft Fabric:

The second element in my series on new Fabric Features is Data Wrangler. Data Wrangler is an entirely new feature found inside of the Data Engineering and Machine Learning Experience of Fabric. It was created to help analyze data in a lakehouse using Spark and generated code. You may find that there’s a lot of data in the data lake that you need to evaluate to determine how you might incorporate the data into a data model. It’s important to examine the data to evaluate what the data contains. Is there anything missing? Incorrectly data typed? Bad Data? There is an easy method to discover what is missing with your data which uses some techniques commonly used by data scientists. Data Wrangler is used inside of notebooks in the Data Engineering or Machine Learning Environments, as the functionality does not exist within the Power BI experience.

Click through to see how it works. I liken it to Power Query for people who don’t like Python.

Comments closed

Granting Users Access to Create Fabric Items

Gilbert Quevauvilliers is in a giving mood:

I was recently working with a customer where I was showing them the awesome new features of Microsoft fabric. I then created a workspace and attempted to grant the individual users access to the workspace to create fabric items or workloads.

When the users went into the app workspace with the fabric settings, enabled, the users could not create any workloads.

Read on to see what the problem was and how you can resolve it.

Comments closed

Create and Connect to a Fabric Data Warehouse

Olivier Van Steenlandt builds a warehouse:

In this data recipe series, Microsoft Fabric – Data Warehouse will be explored. As a starting point, a blank Fabric workspace is used. You can sign up for a free Fabric trial by using the following URL: Data Analytics | Microsoft Fabric

In this data recipe, we will create a brand-new Data Warehouse in Fabric. Once created, we will connect to our Data Warehouse using Azure Data Studio.

Click through for the step-by-step process.

Comments closed

Shortcuts in Microsoft Fabric

Koen Verbeeck takes a shortcut:

A while ago I had a little blog post series about cool stuff in Snowflake. I’m doing a similar series now, but this time for Microsoft Fabric. I’m not going to cover the basics of Fabric, hundreds of bloggers have already done that. I’m going to cover little bits & pieces that I find interesting, that are similar to Snowflake features or something that is an improvement over the “regular” SQL Server or related products.

In this blog post I’m going to talk about shortcuts

Read on to learn more about this feature.

Comments closed

Comparing Fabric F2 to F64

Reitse Eskens enters austerity mode:

If you’ve been having fun with Microsoft Fabric, chances are you’ve been playing around with the F64 capacity trial. This one is given to you by Microsoft for free but, since the GA data, the timer attached to it is counting down the days until you need to buy your own.

Read on to see what happens when you lose out on that sweet F64 goodness. I actually do appreciate the way that Fabric works: it’s not a linear scale of “F2 means you get 1/32 the processing power of F64.” Rather, it’s closer to time slices on a mainframe: F64 gets you a bigger slice. So if you’re a small shop without an enormous amount of data, F2 really does work pretty well.

Comments closed

A Primer on Direct Lake

Ginger Grant talks about a Fabric feature not in Power BI or Synapse:

With the general availability release of Fabric in November 2023, I am dedicating several posts to the features that are only in Fabric and not anywhere else. The first feature is Direct Lake. Direct Lake was created to address problems with Power BI Direct Query. Anyone who has used Direct Query knows what I am talking about. If you have implemented Direct Query, I am guessing you have run into one or all of these problems, including managing the constant hits to the source database which increase with the more users you have, user complaints about slow visuals, or the need to put apply buttons on all of your visuals to help with speed. Direct Query is a great idea. Who wants to import a bunch of data into Power BI? Directly connecting to the database sounds like a better idea, until you learn that that the data goes from Power BI to the database then back for each user one at a time, which means that Power BI must send more queries the more people are accessing reports. Users want to be able to access data quickly, have it scale well, and have access to the latest data.

Click through to learn more about Direct Lake.

Comments closed

Reading and Writing JSON Files in Microsoft Fabric

Tom Martens performs two of the three R’s:

JSON is a straightforward, text-based data-interchange format fully independent of any programming language. This simplicity is why JSON documents have become so widespread.

Often the result of querying a REST API is returned as a JSON document, but this is not the only use case why I consider it important to get familiar with reading from and writing to a JSON file. I think a JSON document is ideal for ingesting values into a Python script or any code, values that are helpful to control the logic and behavior of the script. But of course, it’s also easy to store the value of a variable inside the JSON document, where this value will be used when the script is running again.

Click through for some Python code to perform these operations. There are about a dozen other methods you can use as well, but this is one of the best.

Comments closed

Fabric and Databricks

A rare two-part compare and contrast!

First, Chen Hirsh directly contrasts Microsoft Fabric and Databricks:

Microsoft recently announced the general availability of Microsoft Fabric, which contains all (or most) cloud Data analytics services from Microsoft. This is a good opportunity to compare it with another popular data platform, which is also available in Azure (and other cloud services) – Databricks.

Before we start, I should note that Fabric is quite new, and it’s still hard to evaluate its performance and stability. Also, both products have many features, and I only try to discuss the main differences.

Then, Eugene Meidinger keys us in on the similarities:

One of the things that helps to understand Fabric is that it’s heavily influenced by Databricks. It’s built on delta lake, which is created and open sourced by Databricks 2019. You are encouraged to use a medallion architecture, which as far as I can tell, comes from Databricks.

You will be a lot less frustrated if you realize that much of what’s going on with Fabric is a blend of open source formats and protocols, but also is a combination of the idiosyncrasies of Databricks and then those of Microsoft. David Gomes has good post about data lake file formats, and it’s interesting to imagine the parallel universe where Fabric is built on Iceberg (which is also based on Parquet files) instead of delta lake. (Note, I found this post from this week’s issue of Brent Ozar’s Newsletter)

Comments closed

Fabric Data Pipeline for Blob Storage CSV into Azure SQL DB

Andy Leonard loads some data:

In November 2023, I shared how to start learning Microsoft Fabric in a post titled Start a Fabric Free Trial. In December 2023, I shared how to Create a Workspace in Fabric. In this post, I document one way to create a pipeline to load data from a CSV file stored in Azure Blob Storage to Azure SQL Database in your new Fabric workspace.

Click through for some key assumptions, as well as the process.

Comments closed