
Category: Microsoft Fabric

Verti-Parquet and DirectLake in Fabric

Jordan Witcombe provides an explanation:

The VertiPaq engine cleverly uses columnar storage for efficient querying and processing. It employs multiple compression techniques, including Run-Length Encoding (RLE) and Dictionary Encoding, to minimise storage space. Through finding optimal sort orders and value encoding, it achieves maximum space efficiency and performance. VertiPaq also utilises ‘In-Memory Column Store’ for fast query performance, ‘Predicate Pushdown’ to eliminate unnecessary data at query time, and ‘Block Decompression’ to only decompress relevant data blocks, making it a powerhouse for data management and retrieval.

Now, because of these ingenious tricks, we wave goodbye to traditional file formats like JSON or CSV. Instead, all data stored within the managed area of Fabric and OneLake uses either Parquet or Delta. It’s time to embrace these efficient, high-performing formats that bring the best out of VertiPaq’s compressive power. Let’s explore these further in the next section.

Read on for some comparisons in file size between Fabric and Databricks, as well as how they perform in Power BI.
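
To make the two compression techniques named in the excerpt a little more concrete, here is a minimal Python sketch of dictionary encoding followed by run-length encoding on a single column. It is an illustration only, not VertiPaq's actual implementation: the function names and the sample column are made up for this example, and sorting the column stands in for the "optimal sort order" idea.

```python
from itertools import groupby


def dictionary_encode(values):
    """Map each distinct value to a small integer ID; return (dictionary, ids)."""
    dictionary = {}
    ids = []
    for value in values:
        # The first time a value appears it gets the next unused ID.
        ids.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, ids


def run_length_encode(ids):
    """Collapse consecutive repeats into (id, run_length) pairs."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(ids)]


def run_length_decode(runs):
    """Expand (id, run_length) pairs back into the full list of IDs."""
    decoded = []
    for key, length in runs:
        decoded.extend([key] * length)
    return decoded


# Hypothetical sample column, used only for illustration.
column = ["UK", "UK", "UK", "FR", "FR", "UK", "DE", "DE"]

_, unsorted_ids = dictionary_encode(column)
_, sorted_ids = dictionary_encode(sorted(column))

# Sorting groups equal values together, so fewer runs need to be stored.
print(len(run_length_encode(unsorted_ids)), "runs unsorted")
print(len(run_length_encode(sorted_ids)), "runs sorted")
```

The sorted column needs fewer runs than the unsorted one, which is the intuition behind searching for a good sort order before compressing.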


Connecting to a Fabric Warehouse via SSMS

Reitse Eskens does some digging:

Whilst working on a blogpost on Fabric Data Warehouse, I started wondering if I could work around the SQL web interface and connect to my OneLake with SSMS and/or ADS. As it turns out, you can!

Specifically, you can connect to see things in a warehouse or the Tables view of a lakehouse, not the Files view. There is a built-in web viewer, but Microsoft Fabric definitely is intended to work with normal SQL tools, not just its web interface and Power BI.


Creating a Microsoft Fabric Environment

Kevin Chant gets at it:

In reality, there are a few different ways to join the Microsoft Fabric (Preview) trial.

For example, you might be lucky enough to have it enabled in the workplace already. However, there are ways that you can create your own Microsoft Fabric environment as well.

Click through for the process, and note that the trial is 60 days, though Microsoft will let you renew the trial until the product goes GA.


Contrasting Lakehouse, Warehouse, and Datamart in Fabric & Power BI

Reza Rad disambiguates three terms:

Three types of objects in Microsoft Fabric have similarities in what they can do for an analytics system. These three are Lakehouse, Data Warehouse, and Power BI Datamart. All three objects provide storage for the data, which can be loaded into them using an ETL process and read using something like a Power BI report. In this article and video, I’ll explain the actual differences and how to choose the best option for your implementation and architecture.

Reza does a good job explaining where each of the three fits in and even has a nice chart to work out which one you might want to use.


Roles and Domains in Microsoft Fabric

Marc Lelijveld explains two key concepts:

Microsoft Fabric has been out there for a few weeks now. With the release of Fabric, a new concept in line with data-mesh architectures became available in Fabric, or Power BI if you will. With the introduction of Domains, we have a new level of controls added next to existing roles. In this blog I will further elaborate on the levels of control that are available today and provide a clear overview of these different levels.

There’s going to be a bit of nomenclature adjustment for people who have spent most of their time in Synapse or other platforms moving to Fabric. If you’ve already spent most of your time in Power BI, this shift is probably a little easier.


Licensing for Microsoft Fabric

Reza Rad explains how licensing of Microsoft Fabric will work:

To understand the licensing for Microsoft Fabric, you first need to understand the Capacity structure. In Fabric, there are three important sections into which the content can be organized: Tenant, Capacity, and Workspace.

Tenant is the most fundamental part of the structure of Fabric. Each domain can have one or multiple tenants.

The capacity is the substructure under the tenant. You can have one or multiple capacities in each tenant. Each capacity is a pool of resources that can be used for Microsoft Fabric services. There are different SKUs for different levels of resources. I’ll explain the pricing and SKUs shortly after.

Inside capacities, you will have workspaces. Workspaces are sharing units that will be used for developers and users. For example, you will create Lakehouse, Data Pipeline, and Dataflow inside a workspace, and you can share them with the rest of the developer team. A workspace is assigned to a capacity. However, you can have more than one workspace associated with one capacity. The screenshot below shows how Tenant, Capacity, and Workspace work together.

Read on to understand at what level billing occurs, what the options are, and what it means. My gut is saying that F8 is probably the lowest acceptable tier for a real company’s production environment and F2 is more for dev environments or people trying things out. But we’ll know more, I think, in the next few months as people try things out.
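
The tenant, capacity, and workspace hierarchy from the excerpt can also be sketched as a small data model. This is a rough illustration under stated assumptions, not anything Fabric exposes: the class and method names are hypothetical, and it assumes that each workspace is assigned to exactly one capacity at a time while a capacity can host many workspaces.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Workspace:
    name: str
    # Items created inside the workspace, e.g. "Lakehouse" or "Data Pipeline".
    items: List[str] = field(default_factory=list)


@dataclass
class Capacity:
    name: str
    sku: str  # e.g. "F2" or "F8"
    workspaces: Dict[str, Workspace] = field(default_factory=dict)


@dataclass
class Tenant:
    name: str
    capacities: Dict[str, Capacity] = field(default_factory=dict)

    def add_capacity(self, name: str, sku: str) -> Capacity:
        capacity = Capacity(name, sku)
        self.capacities[name] = capacity
        return capacity

    def assign_workspace(self, workspace: Workspace, capacity_name: str) -> None:
        # Look up the target first so an unknown name fails before any change.
        target = self.capacities[capacity_name]
        # Assumption: a workspace belongs to one capacity, so detach it elsewhere.
        for capacity in self.capacities.values():
            capacity.workspaces.pop(workspace.name, None)
        target.workspaces[workspace.name] = workspace

    def capacity_of(self, workspace_name: str) -> Optional[Capacity]:
        for capacity in self.capacities.values():
            if workspace_name in capacity.workspaces:
                return capacity
        return None


# Hypothetical names, for illustration only.
demo_tenant = Tenant("example-tenant")
demo_tenant.add_capacity("dev", "F2")
demo_tenant.add_capacity("prod", "F8")
demo_tenant.assign_workspace(Workspace("sales", ["Lakehouse", "Dataflow"]), "dev")
print(demo_tenant.capacity_of("sales").sku)
```

Keeping the workspace map on the capacity, rather than a capacity list on the workspace, is one simple way to enforce the one-capacity-per-workspace assumption.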


Configuring Compliance in Microsoft Fabric

Kevin Chant checks a box:

Compliance is a very important aspect when working with data, especially when you must work to standards like PCI-DSS. With this in mind, I looked into the compliance story for Microsoft Fabric.

By the end of this post, you will have a better idea of how to test configuring compliance for Microsoft Fabric. Along the way I share plenty of links.

Read on for step-by-step instructions, as well as those links.


Thoughts on Fabric OneLake

Teo Lachev shares some thoughts:

In a previous post, I shared my overall impression of Fabric. In this post, I’ll continue exploring Fabric, this time sharing my thoughts on OneLake. If you need a quick intro to Fabric OneLake, Josh Caplan’s “Build 2023: Eliminate data silos with OneLake, the OneDrive for Data” presentation provides a great overview of OneLake, its capabilities, and the vision behind it from a Microsoft perspective. If you prefer a shorter narrative, you can find it in the “Microsoft OneLake in Fabric, the OneDrive for data” post. As always, we are all learning and constructive criticism would be appreciated if I missed or misinterpreted something.

I think some of Teo’s criticism comes from the idea that OneLake should also mean one lakehouse or one data lake, but the abstraction is one level higher than that. I would like to see some of Teo’s ideas make it into GA, though.


Microsoft Fabric and Process Unification

Paul Andrew gets to the heart of things:

Moving on and assuming you have seen the event sessions, I want to give you my point of view to help explain what Microsoft Fabric is. Firstly, let’s clear up the terminology to support this understanding. Is this software offering a resource, service, platform, or solution? To answer this question, perspective is key, perspective with a timeline (2018 to 2023). We could simply say that Microsoft Fabric is all these things. All things to all data professionals and beyond. But, to understand this, let’s consider the journey Microsoft has been on and how this technology has evolved. I believe this journey is the best way to help explain what Microsoft Fabric is, rather than focusing on all the new and shiny bits.

Click through for Paul’s take on the matter and how this whole area of “modern data warehousing” has evolved over the past several years in Azure.
