Press "Enter" to skip to content

Day: May 21, 2026

Adding Foreign Keys and Deadlocks

Michael J. Swart explains a challenge in adding a foreign key to an existing table:

Schema modification locks (SCH-M) are taken by DDL (Data Definition Language) statements like CREATE/ALTER/DROP.
Schema stability locks (SCH-S) are taken by DML (Data Manipulation Language) statements like INSERT/UPDATE/DELETE.

Those two types of locks are incompatible. Meaning, I can’t get a SCH-S lock on some table if you’ve already got a SCH-M lock on it (and vice versa).
Paul Randal describes the SCH-M lock as a super-table-X lock. It makes sense to me, if I’m half way through querying a table, I don’t want its definition to change.

Such a pessimistic lock can be awkward for a busy system. The SCH-M can cause a lot of blocking. For example, creating (and dropping) foreign keys requires a SCH-M lock not only on the parent table, but also on the referenced table which leads to trouble.

Click through for a demonstration of the problem. Michael also has some guidance on how to minimize the issue. I’d note the degenerative form of this guidance: understand your data model up-front and apply foreign key constraints at table creation time. That’s not always possible, sure, so when you can’t do that, Michael has some good advice.

Leave a Comment

Explaining the Fabric Ontology

James Serra takes us through a big word:

For years, most data conversations have started with tables. We ask where the data lives, what columns are available, how the joins work, and whether the data is in a warehouse, lakehouse, semantic model, or some other system. That makes sense, because tables are how most of us have worked with data for decades. But tables are not how the business thinks.

A business thinks in terms of customers, products, orders, shipments, assets, flights, runways, employees, policies, and actions. The problem is not usually a lack of data. The problem is a lack of shared meaning. Organizations often have the same business concept represented multiple ways across teams and systems, creating what I would call semantic drift. Sales may define a customer one way. Finance may define it another way. Operations may have yet another version in a different system with different keys, names, and assumptions. That is exactly where Fabric Ontology becomes important. It is designed to close the gap between physical data structures and business meaning.

Microsoft is a bit late to the ontology game and their current concept of an ontology shows. I can understand where they’re going but they still have a ways to go.

Leave a Comment

Working with PIVOT and UNPIVOT in SQL Server

Ed Pollack explains a pair of operators:

There are few operators in T-SQL that cause developers to scramble for documentation more than PIVOT and UNPIVOT. Beyond documentation, transforming columns into rows (and vice-versa) can often be confusing and frustrating for those of us tasked with reformatting data for use by an application.

This article walks through PIVOT and UNPIVOT, providing examples of simple use cases for both – as well as some more complex scenarios we can run into in real-world data. These can be extraordinarily useful ways to reformat data efficiently and quickly with less code than the alternatives. So, there is no need to fear them again!

Click through for Ed’s article. I definitely don’t fear either PIVOT or UNPIVOT and they can be quite useful. But if you locked me in a room and I couldn’t leave until I came up with the proper syntax for both from memory, well, I’d be in that room for a while.

Leave a Comment

The Pain of NULL

Louis Davidson explains the unknown:

There is no simpler topic in relational comparisons than three valued logic. I am being mostly facetious about this, but in reality, it seems so simple that people don’t think about how a NULL works, and make mistakes all of the time. I was reading a post about this the other day on LinkedIn (which by no means could one ever find again!) where one of the comments chastised the author of the post for not understanding “the fundamentals” of relational theory. The original poster wasn’t completely right (and my post may not be completely perfect either, though I will back most of what I write with code.)

In this post I want to point out a few of the key basics that one really should understand.

Click through for a primer on what NULL means and doesn’t mean. And by the time you’re done, I’d like to interest you in the power of 6th normal form, where you can effectively banish NULL into the abyss (at least until you join the bits back together).

Leave a Comment

Finding Bad Queries with sp_QuickieCache

Erik Darling is on a hunt for bad queries:

That’s the funny part. Alright. Cool. With that out of the way, let’s look at this new store[d] procedure. Uh, I think I have to go to Management Studio. Yeah, I remember what that looks like. Alright. Cool. So, uh, this is, this is it. SPQuickieCache. Pay no attention to the terrible red squiggly underlines.

I think Erik made his AI business partner angry, as it didn’t strip out any of the filler words from the transcript. But this does look like a neat stored procedure.

Leave a Comment

CLUSTER BY in Microsoft Fabric Data Warehouse

Nikola Ilic shows off a relatively new feature:

The first thing every Fabric architect reaches for in this situation is the usual playlist: check the query plan, look at the joins, validate the statistics, maybe scale up the capacity. All worth doing, but none of those things addressed what was actually happening: the warehouse was scanning the entire table for every filtered query, because there was no way to tell it which Parquet files actually contained the rows we cared about.

However, Microsoft shipped data clustering in preview at the end of November 2025, and the entire conversation changed.

In this article, I want to walk you through what data clustering is, how it works under the hood, and most importantly, I’ll show you a real demo on a 100-million-row clickstream table that you can run in your own warehouse. No abstractions, no marketing numbers, but actual T-SQL you can paste.

Some of the notes Nikola mentions remind me of some of the rules around making columnstore indexes work and for much of the same reason. But as Nikola’s demo shows, this is definitely a “You must be this tall to ride the ride” feature, and unless you’re talking about quite large fact tables with (at a minimum) billions of rows of data, the benefit mostly comes from reducing CUs rather than wall clock time improvements.

Leave a Comment