Press "Enter" to skip to content

Category: Performance Tuning

An Introduction to Memory Grants in SQL Server

Erik Darling has a new video for us:

Erik monitoring tool mogul darling here with Darling Data. In today’s video, much like I think I foreshadowed in yesterday’s office hours video, we are going to talk about memory grants. We’re going to do a somewhat gentle introduction to them and then in the next video we’ll talk a little bit more about where they get interesting. 

Erik shares his perfectly reasonable take on the nature of strings. I’d probably also get rid of dates as well—too much confusion there for my taste.

Leave a Comment

The Benfit of Disabling V-Order in Fabric Dataflows Gen2

Chris Webb covers a specific use case:

Quite a few new Dataflows Gen2 features were released recently without much fanfare, but that doesn’t mean they aren’t important. I will take a look at them all in my next few posts; in this first post I’ll look at the ability to disable V-Order on staged data.

As the (very detailed) documentation for this new feature describes, V-Order is a write-time optimisation for the parquet files that underpin the Delta tables that OneLake uses to store data. It slows down writing data to the tables but means that reading data from them, for example in Power BI Direct Lake mode, is much faster. 

Click through to see how disabling V-Order can make certain staging loads faster.

Leave a Comment

Comparative Query Analysis and Planning

Christophe Pettus has a two-parter. The first post covers how a half-dozen data platform technologies handle cost-based analysis:

PostgreSQL has ANALYZE. You run it (or autovacuum runs it for you), it draws a sample of 300 × default_statistics_target rows, and it writes a row per column into pg_statistic: a null fraction, an n-distinct estimate, a most-common-values list, an equi-depth histogram, and a physical-vs-logical correlation. The planner reads those numbers, multiplies selectivities together, costs a handful of join strategies, and picks one. Three join algorithms are on the menu: nested loop, merge join, hash join.

That is the entire shape of the problem, and every cost-based optimizer ever shipped solves the same one. They differ in three places, and only three: where the numbers come from, how stale the numbers are allowed to get, and which plan shapes are even legal to choose between. The algorithms are the boring part. Everybody hash-joins. The interesting part is the bookkeeping.

Then there’s how each of the systems generates a query plan:

Statistics are the input. Planning is what the database does with them: it takes a declarative query, which describes what you want and says nothing about how, and turns it into an executable plan, which is nothing but how. There are two jobs inside that. First, rewrite the query into a logically equivalent but more tractable shape, which is where subquery flattening, predicate pushdown, and view merging live. Second, search the space of physical plans (join orders, join algorithms, access paths) for the cheapest one the cost model can find. The second job is the hard one, because the number of possible join orders for a query grows faster than anyone wants to contemplate, and every database in this article is, underneath, a strategy for not enumerating all of them.

Two questions separate the six systems here. How does each one tame that search space? And once it has an answer, how much will it let you argue with the result? Those sound like the same question. They are not, and the most useful thing this comparison does is pull them apart. A database can search brilliantly and refuse you any override at all (Snowflake), search crudely and hand you a fistful of hints anyway (MySQL until recently), or search hard and expose every lever ever machined (Oracle). Sophistication of the search and generosity of the control surface are independent axes. Knowing where a system sits on each tells you most of what its planner feels like to live with.

Slightly odd is that there’s a section of DB2 but not on SQL Server. But it is a good cross-comparison of several of the top relational database options.

Leave a Comment

Optional Parameter Plan Optimization

Hugo Kornelis concludes a mini-series:

It’s time to finish the triptych on bad parameter sniffing, and how Microsoft tries (and fails) to fix this for you.

After first talking about bad parameter sniffing in general, I used my last video to explain Parameter Sensitive Plan Optimization, the feature that Microsoft released in SQL Server 2022 as an attempt to fix one of the three root causes for bad parameter sniffing. If you want to see why I consider that a failed attempt, go watch that video.

In SQL Server 2025, Microsoft then added Optional Parameter Plan Optimization (OPPO). Their attempt to fix the second of those three root causes. Put where PSPO in most cases simply doesn’t do anything, or does do something, but not enough, OPPO does too much. It fixes bad parameter sniffing due to optional parameters in demos that are carefully curated to showcase the feature, but it does more bad than good in more realistic scenarios.

Read on for Hugo’s overview, as well as a video.

Leave a Comment

The Limits of Parallelism with Fabric Dataflows Gen2

Chris Webb notes that parallelism does not mean “free performance improvements”:

To finish off my series of posts on concurrent evaluation in Fabric Dataflows Gen2 (see part 1 and part 2) I decided to do some more realistic tests to see how much parallelism I could get. To do this I uploaded 244 identical Excel files containing almost 542000 rows of data each to a SharePoint document library. Excel files are probably the worst-performing file format for dataflows (see here for some tests that show this), while SharePoint is probably the worst-performing place to store data for a dataflow and also has a reputation for throttling applications that make too many requests.

Click through for Chris’s test results.

Leave a Comment

Partitioned Compute with Fabric Dataflows Gen2

Chris Webb continues to test out Fabric Dataflow Gen2 performance:

In the first part of this series I showed how the Concurrency setting in a Fabric Dataflows Gen2 can affect refresh performance when there are multiple queries inside the dataflow. In this post I will show how, with Partitioned Compute, this setting can also affect the performance of a single query within a dataflow.

To test this I created a dataflow with one query, a modified version of the query that I used in this post from earlier this year which returns a table with ten rows and calls a function with a built-in delay of 60 seconds on each row.

This is a preview feature but Chris shows a simple but effective test to demonstrate how this capability works.

Comments closed

Filtered Indexes in SQL Server

Erik Darling has a new video:

Now, you just can’t talk about indexing in SQL Server really without talking about filtered indexes. They are a very, very important thing. Conceptually, they are just not that hard to figure out.

It’s an index with a where clause. It only indexes some of the data. It qualifies for the where clause. I don’t know. Like the benefits of that just seem rather apparent to me.

Benjamin Franklin highly encourages you to watch this video, even though filtered indexes are one of the most frustrating things in SQL Server. There are so many cases where I think they should work, and they actually work in approximately a third of those cases.

Comments closed

Polymorphic Associations in Postgres

Andrei Lepikhov has multiple types:

Planning such a query efficiently is no easy task — and in my experience, this is confirmed by user reports from the 1C world, since PostgreSQL is currently not rich in LEFT JOIN optimisations. At the same time, the properties of this pattern enable the development of various techniques to improve execution efficiency. I’ve managed to implement several straightforward optimisations of this template. But first, let’s understand what polymorphic references actually are, where they come from, and how common they really are. That’s the gap I’m trying to fill with this post.

Click through for the explanation. This isn’t the easiest problem to solve in the relational world, though I do tend to prefer the subclass/superclass solution, myself.

Comments closed

CLUSTER BY in Microsoft Fabric Data Warehouse

Nikola Ilic shows off a relatively new feature:

The first thing every Fabric architect reaches for in this situation is the usual playlist: check the query plan, look at the joins, validate the statistics, maybe scale up the capacity. All worth doing, but none of those things addressed what was actually happening: the warehouse was scanning the entire table for every filtered query, because there was no way to tell it which Parquet files actually contained the rows we cared about.

However, Microsoft shipped data clustering in preview at the end of November 2025, and the entire conversation changed.

In this article, I want to walk you through what data clustering is, how it works under the hood, and most importantly, I’ll show you a real demo on a 100-million-row clickstream table that you can run in your own warehouse. No abstractions, no marketing numbers, but actual T-SQL you can paste.

Some of the notes Nikola mentions remind me of some of the rules around making columnstore indexes work and for much of the same reason. But as Nikola’s demo shows, this is definitely a “You must be this tall to ride the ride” feature, and unless you’re talking about quite large fact tables with (at a minimum) billions of rows of data, the benefit mostly comes from reducing CUs rather than wall clock time improvements.

Comments closed

Partitioning and Columnstore Indexes

Erik Darling puts together a great combination for a very large dataset:

 So, today we’re going to talk about partitioning in columnstore because there are important differences between partitioned columnstore tables and partitioned rowstore tables. One of the sort of superpowers that columnstore has is the ability to use metadata about which row groups have which data in them, and it can skip entire segments that do not contain relevant data.

I agree with Erik’s point that you do need around 500 million or so rows before this capability really shines, but if you do pick the right partition key, you get one of those rare and coveted performance improvements from partitioning.

Comments closed