Press "Enter" to skip to content

Author: Kevin Feasel

All about Lakehouses in Microsoft Fabric

Tomaz Kastrun gives us the skinny with multiple posts in his Advent of Microsoft Fabric. Day 3 introduces the lakehouse:

Lakehouse is cost-effective and optimised storage, supporting all types of data and file formats, structured and unstructured data, and helps you govern the data, giving you better data governance. With optimised and concurrent reads and writes, it gives outstanding performance by also reducing data movement and minimising redundant copy operations. Furthermore, it gives you a user-friendly multitasking experience in UI with retaining your context, not losing your running operations and working on multiple things, without accidentally stopping others.

Day 4 covers Delta format:

Yesterday we looked into lakehouse and learned that Delta tables are the storing format. So, let’s explore what and how we can go around understanding and working with delta tables. But first we must understand delta lake.

Day 5 covers data ingest:

We have learned about delta lake and delta tables. But since we have uploaded the file directly, let’s explore, how we can also get the data into lakehouse.

Click through for all three posts.

Comments closed

Joining on Overlapping Date Ranges in T-SQL

Daniel Hutmacher crosses the streams:

You can get into a situation where you have two tables with values associated with date ranges. What’s worse, those date ranges don’t necessarily have to align, which can make joining them a seemingly complex task, but it is surprisingly simple when you learn how to think of overlapping date ranges, along with this relatively simple T-SQL join pattern.

This problem gets even more challenging if you have the possibility of multiple overlaps and you want to find the combination with the biggest overlap for each individual item.

Comments closed

Integrating Azure ML and Power BI

I have a new video:

In this video, I show off how easy it is to integrate Azure ML and Power BI, at least once you get past all of the trouble trying to integrate them.

I expected this to be easy. It turns out that the “make it look easy” depends on having several things in place already and using the correct (by which I mean “old”) deployment type.

Comments closed

Producing Messages with librdkafka

Jakub Korab dives into a Kafka library:

In a previous blog post (How To Survive an Apache Kafka® Outage) I outlined the effects on applications during partial or total Kafka cluster outages and proposed some architectural strategies to handle these types of service interruptions. The applications most heavily impacted by this type of outage are external interfaces that receive data, do not control request flow, and possibly perform some form of business transaction with the outside world before producing to Kafka. These applications are most commonly found in finance and written in languages other than Java—mostly C and C++. 

librdkafka is the main underlying client library used in non-JVM environments and has wrapper libraries for Python, .Net, Go, and an ever-expanding list of clients. It has not been written about to the same extent as the Java client, and it is worth examining as its interface and underlying mechanics are fundamentally different. 

This library is quite useful and versatile.

Comments closed

Finding the Local Port Number for Power BI Desktop

Soheil Bakhshi updates an older post:

In March 2018, I wrote a blogpost called Four Different Ways to Find Your Power BI Desktop Local Port Number. Last week, Zoe Doughlas from Microsoft left a comment reminding me of a fifth method to get the port which encouraged me to write this quick tip. Thanks to Zoe!

As the name suggests, the blog was about finding Power BI Desktop’s local port number. If you do not have any clue what I mean by local port number, I strongly suggest reading that blog.

Read on to see what that fifth method is.

Comments closed

Optimizing Sort Operators in Window Functions

Andy Brownsword talks about window function query tuning:

We’re on quite a roll with window functions these past few weeks. Last week we looked at the operators we’d see in execution plans when using a window function. This week I wanted to tackle one of the more troublesome ones specifically: the Sort operator.

We know that sort operators are expensive in our queries. To use a window function our data needs to be sorted. How about if we need multiple functions? What if we’d like the output sorted too? Can we optimise any of those out of the execution plan?

Read on for several tips.

Comments closed

Accessing PostgreSQL from Python

Semab Tariq connects to Postgres:

Psycopg2, a PostgreSQL adapter for Python, implements the Python Database API Specification v2.0, acting as a bridge between Python applications and PostgreSQL databases. It leverages the libpq library, the official PostgreSQL C interface, to facilitate efficient communication. Psycopg2 provides a robust set of features, including transaction management, and support for PostgreSQL-specific data types. Its implementation of the Python DB API ensures seamless integration, enabling developers to execute SQL queries and transactions with precision in Python applications.

Read on to see how it works using a variety of examples.

Comments closed

ANSI and SET Options in Stored Procedures

Erik Darling starts a new series:

This aligns my stored procedures with the necessary settings to accomplish a couple things:

  • Allow the optimizer to use indexed views, filtered indexes, and computed columns
  • Avoid errors when modifying tables involved with indexed views, filtered indexes, and computed columns

Click through to see what Erik sets by default. It’s a good list, though a bit more than I think to do. Probably because I haven’t been burned enough yet.

Comments closed

Building a Radar Chart in Power BI with SVG

Stephanie Bruno doesn’t need a built-in radar chart visual:

Radar (or spider) charts are a way to look at multiple metrics, perhaps with a different range of values for each metric, on a single chart. In this example, we’ll look at characteristics of Taylor Swift songs from a Spotify dataset (I have a daughter who still hasn’t forgiven me for not getting tickets to the Eras tour, so hopefully this will make up for it). A matrix with the radar SVG allows us to quickly compare these song characteristics (you can get the dataset and the descriptions of the characteristics here). There are existing radar/spider custom visual charts that are great, but none of them currently have a small multiple option, so we can’t use them to create the visual below, for example.

Click through to see the full example.

Comments closed

Visualizing a Power BI Refresh with the semantic-link Library

Phil Seamark builds a notebook:

A few blogs back I shared a technique using Power BI Profiler (or VS Code) to run and capture a trace over a refresh of a Power BI semantic model (the object formally known as a dataset).

I’ve since received a lot of positive feedback from people saying how useful it was to visualize each internal step within a problematic Power BI refresh. Naturally, in the age of Fabric, I’m keen to share how the same approach works using a Microsoft Fabric Notebook.

Click through to see how you can do it.

Comments closed