Press "Enter" to skip to content

Day: October 1, 2024

Managing Power BI Assets with semantic-link-labs

Kurt Buhler takes us through a Python library:

Thus far, the part of Microsoft Fabric that I’ve personally found the most interesting is not Copilot, Direct Lake, or its data warehousing capabilities, but a combination of notebooks and simple file/table storage via Lakehouses. Specifically, the library semantic link and its “expansion pack” semantic-link-labs, spearheaded by Michael Kovalsky. These tools help you build, manage, use, and audit the various items in Fabric from a Python notebook, including Power BI semantic models and reports.

Semantic-link-labs provide a lot of convenient functions that you can use to automate and streamline certain tasks during Power BI development; both of models and reports. For me, I’m particularly interested in the reporting functionalities, because this is where I typically find that I lose the most time, and because there is a drought of tools to address this area.

Read the whole thing.

Leave a Comment

The Internals of Data Updates in PostgreSQL

Cary Huang explains how update operations work:

In previous blogs, we talked about an overview of PostgreSQL’s table access method API here , how sequential scan is handled within this API here, and how data insertion is handled here. Today in this blog, we will look closely into how PostgreSQL handles update. A successful update in PostgreSQL can be viewed as “insert a new record” while “marking the old record as invisible” due to the MVCC technique that PostgreSQL employs. It sounds simple enough, but there are quite a lot of considerations in place to make a successful update. Let’s dive in.

There’s a lot going on behind the scenes, and Cary does a good job of explaining it all.

Leave a Comment

Techniques for Unpivoting Data in SQL Server

Jared Westover performs a technique showdown:

A few weeks ago, I helped someone combine multiple integer columns into a single column. The common term for this process is unpivoting or transposing. The table they worked with had millions of rows, so they needed the most efficient method to accomplish this task. If you search online, you’ll find several suggestions for doing this. But which performs best? Is one easier to maintain than the others?

Click through for a review of three separate techniques: using the UNPIVOT operator, using UNION ALL, and using CROSS APPLY. The dataset was relatively small, but even at that size, CROSS APPLY did a good job. But I won’t spoil too much here.

Leave a Comment

Query Store Not Capturing Runtime Stats for Unfinished Queries

Matt Changchien does a bit of testing:

It’s beneficial to understand how Query Store captures query runtime stats when they are used for performance troubleshooting. According to public documentation: upon the query execution, runtime statistics are sent to Query Store.

I am curious whether Query Store will still capture execution statistics under certain scenarios. Hence, in this blog post, I will be testing the two following scenarios with my Azure SQL Database:

  1. Whether Query Store captures runtime stats for query that has completed within an uncommitted transaction, then the transaction rolls back due to a scaling process.
  2. Whether Query Store captures runtime stats for query that is blocked and does not finish before a disconnect caused by a disruptive event (e.g., a service tier scale-up).

Read on to learn more about these two scenarios. The specific context is Azure SQL Database, though the same rules would apply on-premises as well.

Leave a Comment

Not all Window Function Calls are Deterministic

Chen Hirsh reminds us of an important fact:

One of the scariest things that can happen when you develop SQL code is getting indeterminate results. When you run your query multiple times and each time gets a different value.

How can that happen?

SQL functions or queries can be categorized into two groups – deterministic vs. non-deterministic.

I even have a demo where I show off the non-deterministic nature of ROW_NUMBER() versus, say, RANK() or DENSE_RANK(). But there’s nothing too scary about it: just ensure that your PARTITION BY criteria are guaranteed to be unique and you’re good to go. H/T Madeira Data Solutions blog.

Leave a Comment

Test those Backups

Grant Fritchey provides an important public service announcement:

Please, let me reiterate: The only valid test of a backup is a restore.

THE

ONLY

VALID

TEST

OF

A

BACKUP

IS

A

RESTORE.

Grant is absolutely correct here. And do you know how hard it is to test a SQL Server backup? With dbatools, it’s this hard: Test-DbaLastBackup. That one cmdlet restores a backup (under a different name so it doesn’t clobber your existing database), runs DBCC CHECKDB against the restored data, drops the test database after the fact, and even allows you to do all of that against a different SQL Server instance.

Incidentally, my next YouTube series just happens to be around backup and restoration in SQL Server, so stay tuned if you’re interested.

Leave a Comment