Press "Enter" to skip to content

Category: Query Tuning

Table Statistics and Planning Slowdowns

Andrei Lepikhov digs into a performance issue:

A query executes in just 2 milliseconds, yet its planning phase takes 500 ms. The database is reasonably sized, the query involves 9 tables, and the default_statistics_target is set to only 500. Where does this discrepancy come from?

This question was recently raised on the pgsql-performance mailing list, and the investigation revealed a somewhat surprising culprit: the column statistics stored in PostgreSQL’s pg_statistic table.

Read on for Andrei’s analysis and some interesting thoughts on possible avenues for improvement.
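
To see where the two numbers come from, here is a minimal sketch of my own (not Andrei’s reproduction): EXPLAIN (ANALYZE) reports planning time and execution time separately, and pg_stats exposes the per-column statistics the planner has to load while building the plan.

-- A minimal sketch, not the case from the mailing list.
CREATE TABLE t (id int, payload text);
INSERT INTO t SELECT g, md5(g::text) FROM generate_series(1, 100000) g;
ANALYZE t;

-- "Planning Time" and "Execution Time" are reported separately, which is how
-- a 500 ms plan / 2 ms execution gap becomes visible.
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM t WHERE id < 1000;

-- The column statistics live in pg_statistic; pg_stats is the readable view.
-- Wide most-common-value and histogram arrays here are what the planner reads
-- at plan time.
SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 't';

SHOW default_statistics_target;   -- 500 in the reported case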


Adaptive Joins and Large Memory Grants

Kendra Little re-creates a problem:

Adaptive joins let the optimizer choose between a Hash Join and a Nested Loop join at runtime, which can be fantastic for performance when row count estimates are variable. Recently, when Erik Darling taught two days on T-SQL at PASS Data Community Summit, a student asked why a query plan where an adaptive join used a Nested Loop at runtime ended up with a large memory grant anyway.

I didn’t remember the answer to this, but the great thing about co-teaching is that Erik did: adaptive joins always start executing as Hash Joins, which means they have to get memory grants upfront. Even if the query ultimately switches to a Nested Loop at runtime, that memory grant was already allocated. This has real implications for memory usage, especially in high-concurrency environments.

Read on for a dive into adaptive joins, how they work, and the consequences when the database engine makes use of them.
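
If you want to poke at this yourself, here is a rough sketch with hypothetical tables (assuming SQL Server 2017 or later, with batch mode enabled via a nonclustered columnstore index); the memory grant DMV makes the upfront hash-sized grant visible even when the nested loop branch is the one that runs.

-- Hypothetical tables; adaptive joins require batch mode, which the
-- nonclustered columnstore index provides.
CREATE TABLE dbo.Customers (CustomerID int PRIMARY KEY, Region varchar(20));
CREATE TABLE dbo.Orders (OrderID int IDENTITY PRIMARY KEY, CustomerID int, Amount money);
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_Orders ON dbo.Orders (CustomerID, Amount);
GO

DECLARE @CustomerID int = 42;

-- If the optimizer chooses an Adaptive Join here, the memory grant is sized
-- for the hash build even when a low runtime row count sends execution down
-- the nested loop branch.
SELECT c.Region, SUM(o.Amount) AS Total
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
WHERE o.CustomerID = @CustomerID
GROUP BY c.Region;
GO

-- Requested vs. granted vs. used memory for currently executing queries:
SELECT session_id, requested_memory_kb, granted_memory_kb, used_memory_kb
FROM sys.dm_exec_query_memory_grants;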


When Wide Queries Become Slow in SQL Server

Kendra Little talks baggage:

I see this pattern repeatedly: a “wide” query that returns many columns and less than 100k rows runs slowly. SQL Server gets slow when it drags large amounts of baggage through the entire query plan, like a solo traveler struggling with massive suitcases in an airport instead of picking them up close to their destination.

SQL Server often minimizes the number of times it touches the data by grabbing all the columns it needs early in query execution, then doing joins and filters afterwards. This means presentation columns get picked up early.

Read on to see the effects of this, as well as what you can do to mitigate the issue.
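
The general shape of the mitigation is to keep the plan narrow while joining and filtering, and only re-join for the wide presentation columns at the very end. A rough sketch of that pattern, with hypothetical tables rather than Kendra’s demo:

-- Hypothetical tables for illustration.
CREATE TABLE dbo.Products (ProductID int PRIMARY KEY, Name nvarchar(100),
                           LongDescription nvarchar(max), ImageUrl nvarchar(400));
CREATE TABLE dbo.Sales (SaleID int IDENTITY PRIMARY KEY, ProductID int,
                        SaleDate date, Amount money);

-- Narrow first: only the key travels through the grouping, sorting, and TOP.
WITH keys AS (
    SELECT TOP (1000) s.ProductID
    FROM dbo.Sales AS s
    WHERE s.SaleDate >= '20250101'
    GROUP BY s.ProductID
    ORDER BY SUM(s.Amount) DESC
)
-- Wide last: the bulky presentation columns are picked up only for the rows
-- that survived.
SELECT p.ProductID, p.Name, p.LongDescription, p.ImageUrl
FROM keys AS k
JOIN dbo.Products AS p ON p.ProductID = k.ProductID;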


A PostgreSQL Query Plan that Changes without Data or Stats Changes

Frederic Yhuel troubleshoots an issue:

We recently encountered a strange optimizer behaviour, reported by one of our customers:

Customer: “Hi Dalibo, we have a query that is very slow on the first execution after a batch process, and then very fast. We initially suspected a caching effect, but then we noticed that the execution plan was different.”

Dalibo: “That’s a common issue. Autoanalyze didn’t have the opportunity to process the table after the batch job had finished, and before the first execution of the query. You should run the VACUUM ANALYZE command (or at least ANALYZE) immediately after your batch job.”

Customer: “Yes, it actually solves the problem, but… your hypothesis is wrong. We looked at pg_stat_user_tables, and are certain that the tables were not vacuumed or analyzed between the slow and fast executions. We don’t have a production problem, but we would like to understand.”

Dalibo: “That’s very surprising! We would also like to understand…”

So let’s dive in!

Read on for a description of the issue and what Frederic and team found.
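
For reference, the check the customer describes is easy to run; a small sketch of it is below (the VACUUM ANALYZE line uses a placeholder table name).

-- pg_stat_user_tables records when each table was last vacuumed or analyzed,
-- whether manually or by autovacuum/autoanalyze.
SELECT relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY greatest(last_analyze, last_autoanalyze) DESC NULLS LAST;

-- The usual advice from the exchange: analyze right after the batch job
-- ('batch_table' is a placeholder).
-- VACUUM ANALYZE batch_table;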


Indexes and COUNT() in SQL Server

Louis Davidson does some testing:

A few weeks ago, there was a LinkedIn post (I can’t find it anymore) that covered something about how indexes were used by COUNT in SQL. I think it may have been based on SQL Server, but I am not sure (it is rare that one of the SQL posts on LinkedIn mentions a platform). At the time, I went and tried a few of the mentioned cases and realized this was an interesting question: how does the COUNT aggregate use indexes when you use various expressions?

Louis has a series of test cases and I got most of them right, though I wasn’t sure about one particular optimization.
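
If you want to play along before reading, here is a rough sketch of the kinds of cases in question (my own table, not Louis’s tests); which index each variant can use comes down to nullability and whether the counted expression is covered.

-- Hypothetical table: val is nullable and has its own narrow index.
CREATE TABLE dbo.CountDemo (id int IDENTITY PRIMARY KEY, val int NULL, other varchar(50));
CREATE INDEX ix_CountDemo_val ON dbo.CountDemo (val);

INSERT INTO dbo.CountDemo (val, other)
SELECT CASE WHEN n % 10 = 0 THEN NULL ELSE n END, 'x'
FROM (SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
      FROM sys.all_objects AS a CROSS JOIN sys.all_objects AS b) AS t;

SELECT COUNT(*)            FROM dbo.CountDemo;  -- counts rows; any index will do
SELECT COUNT(val)          FROM dbo.CountDemo;  -- skips NULLs; ix_CountDemo_val covers it
SELECT COUNT(DISTINCT val) FROM dbo.CountDemo;  -- distinct non-NULL values of val
SELECT COUNT(other)        FROM dbo.CountDemo;  -- other is not in the index, so a wider scan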


Parameter Sensitivity Training

Erik Darling finally gets HR involved. The sad and/or clever part is that I wrote that line before Erik made the joke.

This is Part 1 in a set of videos covering Erik’s talk for PASS Data Community Summit in 2025. As is usual, Erik’s videos are worth watching even though he doesn’t give me even one paragraph that I can copy and include as a graf here, meaning that I need to type in more in order to make it so that any RSS feed reader connected to Curated SQL doesn’t panic and force you to open the post because it’s too short, and thus causing me to write even longer run-on sentences than I normally would write, though I typically indulge myself in run-on sentences so the blame isn’t 100% on Erik; in short, Erik allows me to use semi-colons more often, and I appreciate it.


Diagnosing DirectQuery Performance Woes

Chris Webb digs into one type of DirectQuery performance slowdown:

One very common cause of Power BI performance problems is having a table with a large number of rows on your report. It’s a problem I wrote about here, and while I used Import mode for my example in that post I also mentioned that this can be an even bigger problem in DirectQuery mode: while the DAX query for the table visual might have a TOPN filter that asks for 502 rows, the query going back to the DirectQuery data source (usually a SQL query) may not have that filter applied and could return a much larger number of rows, which could then take a long time for Power BI to read. I wrote about this in more detail here and showed how you can diagnose the problem in Performance Analyzer by looking at the Execute DirectQuery event and ActualQueryDuration, RowsRead and DataReadDuration metrics. But now that I have a custom visual to display Performance Analyzer export data, what does this look like? Also, what can Execution Metrics tell us?

Read on to learn more.
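
To make the mechanism concrete, here is an illustrative sketch (a hypothetical table, and not the exact SQL Power BI generates): the visual’s DAX carries a TOPN over 502 rows, but the DirectQuery SQL sent to the source may have no equivalent row limit.

-- Hypothetical source table.
CREATE TABLE dbo.FactSales (CustomerName varchar(100), SalesAmount money);

-- Roughly the shape the visual needs: bounded at 502 rows.
SELECT TOP (502) CustomerName, SUM(SalesAmount) AS Sales
FROM dbo.FactSales
GROUP BY CustomerName
ORDER BY Sales DESC;

-- Roughly the shape that can end up being sent: unbounded, so every grouped
-- row comes back and Power BI spends its time reading them.
SELECT CustomerName, SUM(SalesAmount) AS Sales
FROM dbo.FactSales
GROUP BY CustomerName;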


Query Estimates and Tooling in Oracle

David Fitzjarrell lays out a comparison:

Depending upon which tool is used, query plans can change. There are two provided by Oracle, SQL*Plus and SQL Developer, and how they treat bind variables can alter execution plans. Let’s look into that and see what may be the cause.

SQL*Plus has been around for quite a while and is a very mature product. SQL Developer also has a long history, not quite as long as SQL*Plus, but it is well beyond the early phases of development. Both are excellent tools, returning reliable and repeatable results, but SQL Developer may take some liberties SQL*Plus doesn’t, especially where bind variables are involved. SQL*Plus and PL/SQL allow the developer to define data types for bind variables, and will pass those values through unchanged. SQL Developer, however, appears to pass such values using a character data type regardless of whether the value is a string or numeric, allowing Oracle to ‘decide’ how to optimize the query. This can sometimes produce ‘unexplained’ results with estimates and execution plans.

Read on for an example in which the choice of tool can add a considerable percentage to the query’s expected duration. The tricky part here is that this doesn’t mean the query actually takes longer, but that expectations will differ.
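
As a point of reference, here is a minimal sketch of the SQL*Plus side (a classic EMP-style table with hypothetical values): a bind declared this way reaches the optimizer as a NUMBER, whereas a tool that binds everything as character data leaves Oracle to convert implicitly, which is where the estimate differences can creep in.

-- Hypothetical table; in SQL*Plus the bind variable carries an explicit type.
CREATE TABLE emp (empno NUMBER PRIMARY KEY, ename VARCHAR2(30));

VARIABLE emp_id NUMBER
EXEC :emp_id := 7839

-- The optimizer sees a NUMBER bind compared against a NUMBER column.
SELECT ename FROM emp WHERE empno = :emp_id;

-- If a tool passes the same value as character data instead, the comparison
-- relies on implicit conversion of the bind, and the estimates (and the plan)
-- can come out differently from the typed case.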
