Press "Enter" to skip to content

Category: Microsoft Fabric

Bring Fabric to the Data Lakehouse

Ust Oldfield ties together Databricks and Microsoft Fabric:

We’ve built countless Lakehouses for our customers and influenced the design of many more. With the advent of Fabric, many organisations with existing lakehouse implementations in Azure are wondering what changes Fabric will herald for them. Do they continue with their existing lakehouse implementation and design, or do they migrate entirely to Fabric?

For many, the answer will be to continue as-is. They’ve invested a lot of time and money in establishing a Lakehouse – to migrate now to a slightly different technology stack would be a very costly exercise! There also isn’t a need to migrate from a lakehouse implementation in Databricks to one in Fabric as there aren’t concrete benefits to be realised.

For those using Power BI as their semantic and reporting layers, as well as using Databricks SQL or Synapse Serverless as the serving layer, Fabric provides a perfect opportunity to rationalise the architecture and to bring about substantial performance gains through the Direct Lake connectivity and V-Order compression in Fabric.

Read on to see what Ust means, using a couple of architecture diagrams along the way.
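To make the serving-layer point a bit more concrete, here is a minimal, hedged sketch of what landing a V-Ordered Delta table from a Fabric Spark notebook might look like, so that a Power BI dataset in Direct Lake mode can read it without a separate import refresh. The table names are made up, and the session-level V-Order setting is the one the Fabric documentation described at the time of writing, not something confirmed by Ust's post.

```python
# Hedged sketch: write a V-Ordered Delta table into a Fabric lakehouse.
# `spark` is the SparkSession that Fabric notebooks pre-define for you.
# The config key below is the documented session-level V-Order switch (assumption).
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")  # apply V-Order compression on write

df = spark.read.table("staging_sales")        # hypothetical staging table in the lakehouse

(df.write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("sales_gold"))                # lands in the lakehouse's Tables area,
                                              # readable by a Direct Lake dataset
```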

Comments closed

Balancing Governance and Collaboration with Fabric

Marc Lelijveld makes it sound like I can’t just say “No!” to everything as a Microsoft Fabric administrator:

Frequently, I am approached by curious individuals who inquire about my job and how I contribute to the success of our customers, especially since I am not directly involved in building solutions for each and every one of them. These questions have made me realize that it might be interesting to share insights into my role as a Fabric Administrator, or as some may refer to it, a Power BI Administrator.

In this blog post, I aim to shed light on the essence of daily activities of a Fabric Administrator, the meaningful conversations people in this role engage in, and the additional value they bring to the table.

Read on to see what people like Marc do all day.

Comments closed

Preliminary Thoughts on Microsoft Fabric in Preview

Reitse Eskens shares some initial thoughts:

So, these preliminary opinions I’m offering now are based on the preview I’ve worked with and will keep on working with.

That’s the first observation, I’ll keep on working with this. Why? To be honest, I think it’s a step forward from the Data Factory, Synapse, Power BI experience. Everything together in one product makes life easier. Even though I’m having a really hard time adapting to the interface. I keep selecting the wrong buttons to get stuff done. Then again, only being able to do this after working hours and during the weekend may have something to do with that. But making the interface a little more intuitive would really help me.

Read on for what Reitse has to share so far.

Comments closed

A Review of Fabric Lakehouse

Teo Lachev talks lakehouses:

Microsoft’s Lakehouse definition is less ambitious and exclusive. “Microsoft Fabric Lakehouse is a data architecture platform for storing, managing, and analyzing structured and unstructured data in a single location. It is a flexible and scalable solution that allows organizations to handle large volumes of data using a variety of tools and frameworks to process and analyze that data. It integrates with other data management and analytics tools to provide a comprehensive solution for data engineering and analytics”. In other words, a lakehouse is whatever you want it to be if you want something better than a data lake.

Read on for Teo’s classic The Good, The Bad, and The Ugly format.

Comments closed

A Primer on Microsoft Fabric Notebooks

Leila Etaati provides an explanation:

In Fabric, there are tools for different user personas to work with. For example, for a citizen data analyst, Dataflows and Power BI Datasets are the tools with which the analyst can build the data model. For Data Engineers and Scientists, one of the tools is the Notebook.

The Notebook is a place to write and run code in languages such as PySpark (Python), Spark (Scala), Spark SQL, and SparkR (R). These languages are usually familiar to data engineers and data scientists. The Notebook provides an editor to write code in these languages, run it in the same place, and see the results. Consider this the coding tool for the data engineer and scientist.

Click through for a video, as well as a regular blog post.
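For a flavor of what that looks like in practice, here is a minimal sketch of the sort of PySpark you might run in a Fabric notebook cell. The lakehouse and table names are hypothetical; it assumes a lakehouse with a populated table is already attached to the notebook.

```python
# Hedged sketch of a Fabric notebook cell using PySpark.
# `spark` is the SparkSession pre-defined in Fabric notebooks;
# `lakehouse_demo.trip_data` is a made-up table for illustration.
df = spark.read.table("lakehouse_demo.trip_data")

# Basic exploration, the kind of thing a data engineer or scientist does first.
df.printSchema()
df.groupBy("pickup_borough").count().show()

# The same notebook can mix in Spark SQL against the lakehouse tables.
spark.sql("SELECT COUNT(*) AS trips FROM lakehouse_demo.trip_data").show()
```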

Comments closed

Verti-Parquet and DirectLake in Fabric

Jordan Witcombe provides an explanation:

The VertiPaq engine cleverly uses columnar storage for efficient querying and processing. It employs multiple compression techniques, including Run-Length Encoding (RLE) and Dictionary Encoding, to minimise storage space. Through finding optimal sort orders and value encoding, it achieves maximum space efficiency and performance. VertiPaq also utilises ‘In-Memory Column Store’ for fast query performance, ‘Predicate Pushdown’ to eliminate unnecessary data at query time, and ‘Block Decompression’ to only decompress relevant data blocks, making it a powerhouse for data management and retrieval.

Now, because of these ingenious tricks, we wave goodbye to traditional file formats like JSON or CSV. Instead, all data stored within the managed area of Fabric and OneLake uses either Parquet or Delta. It’s time to embrace these efficient, high-performing formats that bring the best out of VertiPaq’s compressive power. Let’s explore these further in the next section.

Read on for some comparisons in file size between Fabric and Databricks, as well as how they perform in Power BI.
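As a rough, hedged illustration of why columnar formats win here, the sketch below writes the same DataFrame as CSV and as Delta so you can compare the footprint yourself. The paths are placeholders; in a Fabric notebook (where Delta support is built in) you would point at the lakehouse’s Files area instead.

```python
# Hedged sketch: compare CSV vs Delta (Parquet) storage for the same data.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Lots of repeated values, which Run-Length Encoding and Dictionary Encoding
# compress very well in Parquet/Delta.
df = spark.range(0, 5_000_000).select(
    F.col("id"),
    (F.col("id") % 7).alias("day_of_week"),
    F.lit("store-042").alias("store_code"),
)

df.write.mode("overwrite").option("header", True).csv("/tmp/demo_csv")       # placeholder path
df.write.format("delta").mode("overwrite").save("/tmp/demo_delta")           # placeholder path

# Comparing the two output folders afterwards (e.g. with `du -sh`) typically
# shows the Delta/Parquet copy at a fraction of the CSV size.
```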

Comments closed

Connecting to a Fabric Warehouse via SSMS

Reitse Eskens does some digging:

Whilst working on a blogpost on Fabric Data Warehouse, I started wondering if I could work around the SQL web interface and connect to my OneLake with SSMS and/or ADS. As it turns out, you can!

Specifically, you can connect to see objects in a warehouse or the Tables view of a lakehouse, but not the Files view. There is a built-in web viewer, but Microsoft Fabric is definitely intended to work with normal SQL tools, not just its web interface and Power BI.
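The same SQL endpoint works from code as well as from SSMS or ADS. Here is a minimal, hedged sketch using pyodbc; the server name is a made-up placeholder (copy the real SQL connection string from the workspace), and it assumes ODBC Driver 18 and Azure AD interactive authentication are available on the client.

```python
# Hedged sketch: query a Fabric warehouse's SQL endpoint with pyodbc.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=your-workspace.datawarehouse.fabric.microsoft.com;"  # hypothetical endpoint
    "Database=DemoWarehouse;"                                     # hypothetical warehouse name
    "Authentication=ActiveDirectoryInteractive;"
    "Encrypt=yes;"
)

# Warehouse tables (and the Tables view of a lakehouse) answer to plain T-SQL.
for row in conn.cursor().execute("SELECT TOP 5 name FROM sys.tables"):
    print(row.name)
```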

Comments closed

Creating a Microsoft Fabric Environment

Kevin Chant gets at it:

In reality, there are a few different ways to join the Microsoft Fabric (Preview) trial.

For example, you might be lucky enough to have it enabled in the workplace already. However, there are ways that you can create your own Microsoft Fabric environment as well.

Click through for the process, and note that the trial lasts 60 days, though Microsoft will let you renew it until the product goes GA.

Comments closed

Contrasting Lakehouse, Warehouse, and Datamart in Fabric & Power BI

Reza Rad disambiguates three terms:

Three types of objects in Microsoft Fabric have similarities in what they can do for an analytics system. These three are the Lakehouse, Data Warehouse, and Power BI Datamart. All three objects provide storage for the data, which can be loaded into them using an ETL process and read using something like a Power BI report. In this article and video, I’ll explain the actual differences and how to choose the best option for your implementation and architecture.

Reza does a good job explaining where each of the three fits in and even has a nice chart to work out which one you might want to use.

Comments closed

Roles and Domains in Microsoft Fabric

Marc Lelijveld explains two key concepts:

Microsoft Fabric has been out there for a few weeks now. With the release of Fabric, a new concept in line with data-mesh architectures became available in Fabric, or Power BI if you will. With the introduction of Domains, we have a new level of control added next to existing roles. In this blog, I will further elaborate on the levels of control that are available today and provide a clear overview of these different levels.

There’s going to be a bit of nomenclature adjustment for people moving to Fabric who have spent most of their time in Synapse or other platforms. If you’ve already spent most of your time in Power BI, this shift is probably a little easier.

Comments closed