Press "Enter" to skip to content

Category: Microsoft Fabric

Notes on Data Engineering in Microsoft Fabric

John Miner shares some notes. Part 1 looks at getting started and tables, both managed and unmanaged:

The architectural diagram shows how information flows from a source system, into a delta lake house, transformed by programs, and used by end users. To get source data into the lake, we can use any of the three methods to retrieve the data as files: pipelines – traditional Azure Data Factory components, dataflows – wrangling data flows based on Power Query and shortcuts – the ability to link external storage to the lake. Once the data is in the lake, there are two types of programs that can transform the data files: spark notebooks and data flows.

Part 2 covers file and folder management:

In practice, I have seen an additional quality zone called raw be used to stage files in their native format before converting to a delta file format. Please note, the lake house uses either shortcuts or pipelines to get files into the lake. We will talk more about bronze, silver and gold zones when I cover full and incremental loading later in this article.

Read on for John’s thoughts.

Comments closed

Viewing DAX in Microsoft Fabric with SemPy

Kevin Chant talks about a recent issue:

Recently I have been helping others get up to speed with Microsoft Fabric. Which includes going through some Power BI topics.

One issue that came up was how to show them the DAX used for a measure within a Power BI report that had been published to Microsoft Fabric. To link working with measures in Power BI Desktop with working in Microsoft Fabric.

Kevin shows the normal way of doing this, as well as an alternative using the SemPy library.

Comments closed

Accessing the Purview Portal in Your Fabric Environment

Kevin Chant enables a feature:

In this post I want to cover accessing the new Microsoft Purview portal in your own Microsoft Fabric environment.

To clarify, I mean a Microsoft Fabric environment you have created for your own use. Like the one I covered in a previous post.

You can do this in a trial environment thanks to the new capability provided by Microsoft last year to infuse Microsoft Fabric items into Microsoft Purview. Which Microsoft covered in a blog post about Microsoft Fabric items in Microsoft Purview.

Read on to see how.

Comments closed

Automating Microsoft Fabric Capacity Scaling via Logic App

Soheil Bakhshi does some scaling:

In a previous post I explained how to manage the capacity costs of a Fabric F capacity (under Pay-As-You-Go pricing model) using Logic Apps to Suspend and Resume it.

A customer who read my previous blog asked me “Can we use a similar method to scale up and down before and after specific workloads?”. This blog post is to answer exactly that.

This is pretty neat, though I wonder how long it takes and how much downtime it produces.

Comments closed

Generating Synthetic Data for Streaming in Microsoft Fabric

Sandeep Pawar builds out some data:

If you want to learn or demo Real Time Analytics in Microsoft Fabric, you will need a streaming data source. You can use the built-in samples to get started. But there are several data generators which you can use to create custom streaming sample datasets, Azure Stream Analytics data generator being one of them. You can see them here. In this blog, I will show how to set one up to use with Fabric Eventstream.

Read on for a step-by-step guide.

Comments closed

VARCHAR() in Microsoft Fabric Lakehouses and SQL Endpoints

Gerhard Brueckl models some data:

Defining data types and knowing the schema of your data has always been a crucial factor for performant data platforms, especially when it comes to string datatypes which can potentially consume a lot of space and memory. For Lakehouses in general (not only Fabric Lakehouses), there is usually only one data type for text data which is a generic STRING of an arbitrary length. In terms of Apache Spark, this is StringType(). While this applies to Spark dataframes, this is not entirely true for Spark tables – here is what the docs say:

Read through for more information on that, as well as how to define a table in a Microsoft Fabric lakehouse using VARCHAR(). The display is a little weird, but Greg Low explains why in the comments.

Comments closed

Parallelizing Notebook Runs in Microsoft Fabric via Python

Sandeep Pawar kicks off multiple notebooks at once:

The notebook class in mssparkutils has two methods to run notebooks – run and runMultiple . run allows you to trigger a notebook run for one single notebook. Mim wrote a nice blog to show how to use it and its usefulness.

runMultiple , on the other hand, allows you to create a Direct Acyclic Graph (DAG) of notebooks to execute notebooks in parallel and in specified order, similar to a pipeline run except in a notebook.

Read on to learn more about the advantages of this latter approach as well as how you can do it.

Comments closed

Microsoft Fabric Free Trial Capacities

Soheil Bakhshi digs into the fine print:

If you are evaluating Microsoft Fabric and do not currently own a Premium Capacity, chances are you’re using Microsoft Fabric Trial Capacities. All Power BI users within an organisation or specific security groups given the rights can opt into Fabric Trial Capacities. Therefore, you may already have several Trial Fabric Capacities in your tenant. Your Fabric Administrators can specifically control who can opt into the Fabric Trial capacities within the Fabric Admin Portal, on the Help and support settings section, and enabling the Users can try Microsoft Fabric paid features setting as shown in the following image:

Read on for a lot more detail, including several common issues you might find along the way.

Comments closed