Data Lake – Page 6 – Curated SQL

An Overview of Lakehouses in Microsoft Fabric

Published 2023-12-11 by Kevin Feasel

Kevin Chant invites you to a swank lakehouse:

By the end of this post, you will have a good overview of Microsoft Fabric Data Lakehouses, including CI/CD options. You will also see where your SQL Server background can prove useful and where some Power BI knowledge can come in handy.

Plus, I share plenty of links in this post. For instance, there are a couple of useful resources to help you get started towards the bottom of this post.

Click through for the article.

All about Lakehouses in Microsoft Fabric

Published 2023-12-07 by Kevin Feasel

Tomaz Kastrun gives us the skinny with multiple posts in his Advent of Microsoft Fabric. Day 3 introduces the lakehouse:

Lakehouse is cost-effective, optimised storage that supports all types of data and file formats, structured and unstructured, and helps you govern that data. With optimised, concurrent reads and writes, it gives outstanding performance while reducing data movement and minimising redundant copy operations. Furthermore, its UI gives you a user-friendly multitasking experience: it retains your context, keeps your running operations alive, and lets you work on multiple things without accidentally stopping others.

Day 4 covers Delta format:

Yesterday we looked into the lakehouse and learned that Delta tables are the storage format. So, let’s explore how we can go about understanding and working with Delta tables. But first we must understand Delta Lake.

Day 5 covers data ingest:

We have learned about Delta Lake and Delta tables. But since we uploaded the file directly, let’s explore how else we can get data into the lakehouse.

Click through for all three posts.

Lakehouse Management in Fabric via mssparkutils

Published 2023-11-21 by Kevin Feasel

Sandeep Pawar scripts out some lakehouse work:

At MS Ignite, Microsoft unveiled a variety of new APIs designed for working with Fabric items, such as workspaces, Spark jobs, lakehouses, warehouses, ML items, and more. You can find detailed information about these APIs here. These APIs will be critical in the automation and CI/CD of Fabric workloads.

With the release of these APIs, a new method has been added to the mssparkutils library to simplify working with lakehouses. In this blog, I will explore the available options and provide examples. Please note that at the time of writing this blog, the information has not been published on the official documentation page, so keep an eye on the documentation for changes.

This looks to be quite useful for CI/CD work.

Fun with Tables in the Microsoft Fabric Lakehouse

Published 2023-11-03 by Kevin Feasel

Nikola Ilic dives into tables:

Probably the biggest confusion is: should I use a lakehouse or warehouse in Fabric? Or, what is the difference between Direct Lake and DirectQuery mode for Power BI reports?

And, while these two points mentioned above are of paramount importance to clarify, in this article I’ll focus on explaining another potential caveat, which is relevant when working with the lakehouse in Microsoft Fabric.

If only Nikola dove onto tables, I could make him an honorary Buffalo Bills fan.

An Overview of Microsoft Fabric’s OneLake

Published 2023-10-05 by Kevin Feasel

Reza Rad has a video and an article:

Microsoft Fabric is a complete set of technologies that provide Analytics as a service. Fabric uses a logical storage layer named OneLake. In this article, we will explore what OneLake is, why it is important, its essential features, and how it works with the rest of Fabric objects.

Check out the video or read through the article.

Exporting Dynamics 365 Data into Delta Lake via Synapse Link

Published 2023-09-27 by Kevin Feasel

Jose Mendes performs a data migration:

It’s fair to say there have been some considerable changes in the Azure landscape over recent years.

This blog will show you how to configure Synapse Link to export D365 data in the Delta Lake format – an open-source data and transaction storage file format used in Lakehouse implementations.

Before you start considering using this approach, you will need to ensure you meet the following prerequisites (Microsoft documentation).

Read on for those prerequisites as well as a step-by-step guide on how to do it.

Fixing Microsoft Fabric V-Order Optimization

Published 2023-09-21 by Kevin Feasel

Dennes Torres asks and answers a question:

I explained in a previous article how the Tables in a lakehouse are V-Order optimized. We noticed this configuration depends on our settings, which can be enabled or not.

One question remains: How could we check if the tables are V-Order optimized or not?

Read on for the answer, as well as a link containing more information on V-Order optimization.

Storing Log Analytics Data in the Microsoft Fabric Lakehouse

Published 2023-08-31 by Kevin Feasel

Gilbert Quevauvilliers needs a place to store this data:

Following on in my series, in this blog post I am going to use Dataflow Gen2 in Microsoft Fabric to load the data into a lakehouse table.

Doing this will allow me to store the data in a Delta Lake table.

In this series I am going to show you all the steps I took to achieve the successful outcome I had with my client.

Click through for links to the first two parts of the series, as well as a step-by-step guide for part 3.

Shortcuts in Microsoft Fabric

Published 2023-08-30 by Kevin Feasel

Adam Saxton explains the power of shortcuts:

We feel shortcuts are one of the most powerful capabilities within OneLake in Microsoft Fabric! Adam walks through what these are and how you can use them.

Click through for a video and a couple of Microsoft Learn links on the topic of shortcuts.

Accessing OneLake Files from Power BI Desktop

Published 2023-08-28 by Kevin Feasel

Marc Lelijveld reads a file:

Fabric content is all over the place by now. In Fabric, as a SaaS platform, most (if not all) services have interconnectivity. In a few clicks, you can connect your web-developed Power BI dataset to a lakehouse or warehouse to fetch data from OneLake. But what about Power BI Desktop? You might have uploaded some files to OneLake which you cannot access from Power BI Desktop.

In this blog I’ll explain how you can connect to OneLake data using Power BI Desktop!

This turns out to be a bit trickier than I would have expected. Hopefully the experience gets better over time.

Category: Data Lake