Press "Enter" to skip to content

Category: Query Tuning

Row Own-Goals

Hugo Kornelis didn’t come up with quite as good of a title:

In part 1 of this mini-series, I explained what a rowgoal is and how it works to optimize a query with a TOP or FETCH expression. Part 2 then showed a few less obvious other cases where the optimizer might introduce rowgoals. In all cases so far, those rowgoals were beneficial. They helped the optimizer come up with the best execution plan for the number of rows requested.

Click through for the video.

Leave a Comment

Table Statistics and Planning Slowdowns

Andrei Lepikhov digs into a performance issue:

A query executes in just 2 milliseconds, yet its planning phase takes 500 ms. The database is reasonably sized, the query involves 9 tables, and the default_statistics_target is set to only 500. Where does this discrepancy come from?

This question was recently raised on the pgsql-performance mailing list, and the investigation revealed a somewhat surprising culprit: the column statistics stored in PostgreSQL’s pg_statistic table.

Read on for Andrei’s analysis and some interesting thoughts on possible avenues for improvement.

Leave a Comment

Adaptive Joins and Large Memory Grants

Kendra Little re-creates a problem:

Adaptive joins let the optimizer choose between a Hash Join and a Nested Loop join at runtime, which can be fantastic for performance when row count estimates are variable. Recently, when Erik Darling taught two days on TSQL at PASS Community Data Summit, a student asked why a query plan where an adaptive join used a Nested Loop at runtime ended up with a large memory grant anyway.

I didn’t remember the answer to this, but the great thing about co-teaching is that Erik did: adaptive joins always start executing as Hash Joins, which means they have to get memory grants upfront. Even if the query ultimately switches to a Nested Loop at runtime, that memory grant was already allocated. This has real implications for memory usage, especially in high-concurrency environments.

Read on for a dive into adaptive joins, how they work, and the consequences when the database engine makes use of them.

Leave a Comment

When Wide Queries Become Slow in SQL Server

Kendra Little talks baggage:

I see this pattern repeatedly: a “wide” query that returns many columns and less than 100k rows runs slowly. SQL Server gets slow when it drags large amounts of baggage through the entire query plan, like a solo traveler struggling with massive suitcases in an airport instead of picking them up close to their destination.

SQL Server often minimizes data access by grabbing all the columns it needs early in query execution, then doing joins and filters. This means presentation columns get picked up early.

Read on to see the effects of this, as well as what you can do to mitigate the issue.

Comments closed

A PostgreSQL Query Plan that Changes without Data or Stats Changes

Frederic Yhuel troubleshoots an issue:

We recently encountered a strange optimizer behaviour, reported by one of our customers:

Customer: “Hi Dalibo, we have a query that is very slow on the first execution after a batch process, and then very fast. We initially suspected a caching effect, but then we noticed that the execution plan was different.”

Dalibo: “That’s a common issue. Autoanalyze didn’t have the opportunity to process the table after the batch job had finished, and before the first execution of the query. You should run the VACUUM ANALYZE command (or at least ANALYZE) immediately after your batch job.”

Customer: “Yes, it actually solves the problem, but… your hypothesis is wrong. We looked at pg_stat_user_tables, and are certain that the tables were not vacuumed or analyzed between the slow and fast executions. We don’t have a production problem, but we would like to understand.”

Dalibo: “That’s very surprising! we would also like to understand…”

So let’s dive in!

Read on for a description of the issue and what Frederic and team found.

Comments closed

How Rowgoals Work in SQL Server

Hugo Kornelis has a new video:

For my second vlog, I decided to talk about rowgoals. First an explanation of what they are, then an overview of some obvious and some not so obvious cases where the optimizer will use a rowgoal, and finally a warning about cases where this normally beneficial feature might hurt instead of help.

Click through for part one of a new video series.

Comments closed

Indexes and COUNT() in SQL Server

Louis Davidson does some testing:

A few weeks ago, there was a LinkedIn post (I can’t find it anymore) that covered something about how indexes were used by COUNT in SQL. I think it may have been based on SQL Server, but I am not sure (it is rare that one of the SQL posts on LinkedIn mentions a platform). At the time, I went and tried a few of the mentioned cases and realized this was an interesting question: how does the COUNT aggregate use indexes when you use various different expressions.

Louis has a series of test cases and I got most of them right, though I wasn’t sure about one particular optimization.

Comments closed

Parameter Sensitivity Training

Erik Darling finally gets HR involved. The sad and/or clever part is that I wrote that line before Erik made the joke.

This is Part 1 in a set of videos covering Erik’s talk for PASS Data Community Summit in 2025. As is usual, Erik’s videos are worth watching even though he doesn’t give me even one paragraph that I can copy and include as a graf here, meaning that I need to type in more in order to make it so that any RSS feed reader connected to Curated SQL doesn’t panic and force you to open the post because it’s too short, and thus causing me to write even longer run-on sentences than I normally would write, though I typically indulge myself in run-on sentences so the blame isn’t 100% on Erik; in short, Erik allows me to use semi-colons more often, and I appreciate it.

3 Comments

Diagnosing DirectQuery Performance Woes

Chris Webb digs into one type of DirectQuery performance slowdown:

One very common cause of Power BI performance problems is having a table with a large number of rows on your report. It’s a problem I wrote about here, and while I used an Import mode for my example in that post I also mentioned that this can be an even bigger problem in DirectQuery mode: while the DAX query for the table visual might have a TOPN filter that asks for 502 rows, the query going back to the DirectQuery data source (usually a SQL query) may not have that filter applied and could return a much larger number of rows, which could then take a long time for Power BI to read. I wrote about this in more detail here and showed how you can diagnose the problem in Performance Analyzer by looking at the Execute DirectQuery event and ActualQueryDuration, RowsRead and DataReadDuration metrics. But now I have a custom visual to display Performance Analyzer export data, what does this look like? Also, what can Execution Metrics tell us?

Read on to learn more.

Comments closed