Press "Enter" to skip to content

Category: Performance Tuning

Share Groups and Sub-Optimal Performance

Jack Vanlightly creates a problem:

In this post we’re going to see how share.acquire.mode=record_limit combined with:

  • fewer consumers than partitions
  • and various cases of “partition skew”

…can result in subpar performance with share groups. 

I stumbled on these issues when running large sets of dimensional tests with Dimster’s explore-limits mode, which finds the highest sustainable throughput while staying within a target end-to-end latency target. There was a specific subset of the tests that explore-limits mode would consistently fail to complete, and they all happened to be with record_limit and a consumer count lower than the partition count. In this test, we’ll understand why Dimster had such a hard time with this combination.

Click through for the details, as well as how to mitigate this sort of scenario.

Leave a Comment

The Basics of Query Folding with Power BI

Andy Brownsword explains one performance improvement technique for Power BI data transformations:

As a database developer, when I started using Power BI, I was concerned about it retrieving reams of data only to perform transformations downstream. The Power Query editor misleads us into thinking the retrieval and transformations are applied sequentially.

Thanks to Query Folding, that’s not usually the case. And that gives us more power to extract performance from the database.

This only works in situations where there’s something downstream to perform that processing, like a relational database. And one of the areas where you can affect performance, either positively or negatively, is in organizing operations such that you have a stretch of foldable operations. That way, all of it can run as one operation in the database.

Leave a Comment

An Introduction to Memory Grants in SQL Server

Erik Darling has a new video for us:

Erik monitoring tool mogul darling here with Darling Data. In today’s video, much like I think I foreshadowed in yesterday’s office hours video, we are going to talk about memory grants. We’re going to do a somewhat gentle introduction to them and then in the next video we’ll talk a little bit more about where they get interesting. 

Erik shares his perfectly reasonable take on the nature of strings. I’d probably also get rid of dates as well—too much confusion there for my taste.

Leave a Comment

The Benfit of Disabling V-Order in Fabric Dataflows Gen2

Chris Webb covers a specific use case:

Quite a few new Dataflows Gen2 features were released recently without much fanfare, but that doesn’t mean they aren’t important. I will take a look at them all in my next few posts; in this first post I’ll look at the ability to disable V-Order on staged data.

As the (very detailed) documentation for this new feature describes, V-Order is a write-time optimisation for the parquet files that underpin the Delta tables that OneLake uses to store data. It slows down writing data to the tables but means that reading data from them, for example in Power BI Direct Lake mode, is much faster. 

Click through to see how disabling V-Order can make certain staging loads faster.

Leave a Comment

Comparative Query Analysis and Planning

Christophe Pettus has a two-parter. The first post covers how a half-dozen data platform technologies handle cost-based analysis:

PostgreSQL has ANALYZE. You run it (or autovacuum runs it for you), it draws a sample of 300 × default_statistics_target rows, and it writes a row per column into pg_statistic: a null fraction, an n-distinct estimate, a most-common-values list, an equi-depth histogram, and a physical-vs-logical correlation. The planner reads those numbers, multiplies selectivities together, costs a handful of join strategies, and picks one. Three join algorithms are on the menu: nested loop, merge join, hash join.

That is the entire shape of the problem, and every cost-based optimizer ever shipped solves the same one. They differ in three places, and only three: where the numbers come from, how stale the numbers are allowed to get, and which plan shapes are even legal to choose between. The algorithms are the boring part. Everybody hash-joins. The interesting part is the bookkeeping.

Then there’s how each of the systems generates a query plan:

Statistics are the input. Planning is what the database does with them: it takes a declarative query, which describes what you want and says nothing about how, and turns it into an executable plan, which is nothing but how. There are two jobs inside that. First, rewrite the query into a logically equivalent but more tractable shape, which is where subquery flattening, predicate pushdown, and view merging live. Second, search the space of physical plans (join orders, join algorithms, access paths) for the cheapest one the cost model can find. The second job is the hard one, because the number of possible join orders for a query grows faster than anyone wants to contemplate, and every database in this article is, underneath, a strategy for not enumerating all of them.

Two questions separate the six systems here. How does each one tame that search space? And once it has an answer, how much will it let you argue with the result? Those sound like the same question. They are not, and the most useful thing this comparison does is pull them apart. A database can search brilliantly and refuse you any override at all (Snowflake), search crudely and hand you a fistful of hints anyway (MySQL until recently), or search hard and expose every lever ever machined (Oracle). Sophistication of the search and generosity of the control surface are independent axes. Knowing where a system sits on each tells you most of what its planner feels like to live with.

Slightly odd is that there’s a section of DB2 but not on SQL Server. But it is a good cross-comparison of several of the top relational database options.

Comments closed

Optional Parameter Plan Optimization

Hugo Kornelis concludes a mini-series:

It’s time to finish the triptych on bad parameter sniffing, and how Microsoft tries (and fails) to fix this for you.

After first talking about bad parameter sniffing in general, I used my last video to explain Parameter Sensitive Plan Optimization, the feature that Microsoft released in SQL Server 2022 as an attempt to fix one of the three root causes for bad parameter sniffing. If you want to see why I consider that a failed attempt, go watch that video.

In SQL Server 2025, Microsoft then added Optional Parameter Plan Optimization (OPPO). Their attempt to fix the second of those three root causes. Put where PSPO in most cases simply doesn’t do anything, or does do something, but not enough, OPPO does too much. It fixes bad parameter sniffing due to optional parameters in demos that are carefully curated to showcase the feature, but it does more bad than good in more realistic scenarios.

Read on for Hugo’s overview, as well as a video.

Comments closed

The Limits of Parallelism with Fabric Dataflows Gen2

Chris Webb notes that parallelism does not mean “free performance improvements”:

To finish off my series of posts on concurrent evaluation in Fabric Dataflows Gen2 (see part 1 and part 2) I decided to do some more realistic tests to see how much parallelism I could get. To do this I uploaded 244 identical Excel files containing almost 542000 rows of data each to a SharePoint document library. Excel files are probably the worst-performing file format for dataflows (see here for some tests that show this), while SharePoint is probably the worst-performing place to store data for a dataflow and also has a reputation for throttling applications that make too many requests.

Click through for Chris’s test results.

Comments closed

Partitioned Compute with Fabric Dataflows Gen2

Chris Webb continues to test out Fabric Dataflow Gen2 performance:

In the first part of this series I showed how the Concurrency setting in a Fabric Dataflows Gen2 can affect refresh performance when there are multiple queries inside the dataflow. In this post I will show how, with Partitioned Compute, this setting can also affect the performance of a single query within a dataflow.

To test this I created a dataflow with one query, a modified version of the query that I used in this post from earlier this year which returns a table with ten rows and calls a function with a built-in delay of 60 seconds on each row.

This is a preview feature but Chris shows a simple but effective test to demonstrate how this capability works.

Comments closed

Filtered Indexes in SQL Server

Erik Darling has a new video:

Now, you just can’t talk about indexing in SQL Server really without talking about filtered indexes. They are a very, very important thing. Conceptually, they are just not that hard to figure out.

It’s an index with a where clause. It only indexes some of the data. It qualifies for the where clause. I don’t know. Like the benefits of that just seem rather apparent to me.

Benjamin Franklin highly encourages you to watch this video, even though filtered indexes are one of the most frustrating things in SQL Server. There are so many cases where I think they should work, and they actually work in approximately a third of those cases.

Comments closed

Polymorphic Associations in Postgres

Andrei Lepikhov has multiple types:

Planning such a query efficiently is no easy task — and in my experience, this is confirmed by user reports from the 1C world, since PostgreSQL is currently not rich in LEFT JOIN optimisations. At the same time, the properties of this pattern enable the development of various techniques to improve execution efficiency. I’ve managed to implement several straightforward optimisations of this template. But first, let’s understand what polymorphic references actually are, where they come from, and how common they really are. That’s the gap I’m trying to fill with this post.

Click through for the explanation. This isn’t the easiest problem to solve in the relational world, though I do tend to prefer the subclass/superclass solution, myself.

Comments closed