Press "Enter" to skip to content

Category: Query Tuning

Performance Studio

Erik Darling has a new free tool:

Stop clicking through SSMS execution plans like it’s 2005.
Performance Studio is a free, open-source plan analyzer that tells you what’s wrong,
where it’s wrong, and how bad it is — from the command line, a desktop GUI,
an SSMS extension, or an AI assistant.

Built by someone who has stared at more execution plans than any reasonable person should.

Click through for some of its capabilities, as well as how to get your hands on a copy.

Comments closed

Query Tuning and Premature Optimization

Denny Cherry shares some advice:

This runs about as inconsistently as you would expect, given that it’s the same plan every time, no matter what values are being passed in. Getting this to perform better and consistanly requires some dynamic SQL changes that look similar to the following.

Denny’s scenario is a very common one: as developers, we don’t know which access paths users will take, so we try to develop generic solutions that can cover a wide variety of scenarios. In practice, users land on a certain set of access patterns, and now we have actual queries we can ensure work as well as possible. Except for the parts where we painted ourselves into a corner with the original generic design. But hey, that’s what the imagined rebuild that will never happen can solve.

Comments closed

JSONB Data in Postgres and Performance Due to TOAST

Paul Ramsey lays out the facts and the data:

Working with APIs and arrays in the jsonb type has become increasingly popular recently, and storing pieces of application data using jsonb has become a common design pattern.

But why shred a JSON object into rows and columns and then rehydrate it later to send it back to the client?

The answer is efficiency. Postgres is most efficient when working with rows and columns, and hiding data structure inside JSON makes it difficult for the engine to go as fast as it might.

Read on to learn how Postgres manages to store arbitrary-sized JSONB data within the limitations of 8KB pages, and the performance implications of doing so.

Comments closed

Performance Tuning Dependent SQL Queries in DirectQuery Mode

Chris Webb tries a change:

As I described here, Power BI can send SQL queries in parallel in DirectQuery mode and you can see from the Timeline column there is some parallelism happening here – the last two SQL queries generated by the DAX query run at the same time – but everything has to wait for that first SQL query to complete. Why? Can this be tuned?

Click through for an example. I was thinking about how challenging it would be to improve this performance at the SQL query level and if you could build a single query that operates over all three sets of data—distinct customers, distinct customers on Mondays, distinct customers in Januaries–while still performing acceptably. I’m not sure that the variants I sketched out in my head would actually perform faster, thanks to the “distinct” requirements.

Comments closed

Read Efficiency in PostgreSQL Queries

Michael Christofides explains what’s happening under the covers:

A lot of the time in database land, our queries are I/O constrained. As such, performance work often involves reducing the number of page reads. Indexes are a prime example, but they don’t solve every issue (a couple of which we’ll now explore).

The way Postgres handles consistency while serving concurrent queries is by maintaining multiple row versions in both the main part of a table (the “heap”) as well as in the indexes (docs). Old row versions take up space, at least until they are no longer needed, and the space can be reused. This extra space is commonly referred to as “bloat”. Below we’ll look into both heap bloat and index bloat, how they can affect query performance, and what you can do to both prevent and respond to issues.

Read on for a detailed explanation.

Comments closed

Row Own-Goals

Hugo Kornelis didn’t come up with quite as good of a title:

In part 1 of this mini-series, I explained what a rowgoal is and how it works to optimize a query with a TOP or FETCH expression. Part 2 then showed a few less obvious other cases where the optimizer might introduce rowgoals. In all cases so far, those rowgoals were beneficial. They helped the optimizer come up with the best execution plan for the number of rows requested.

Click through for the video.

Comments closed

Table Statistics and Planning Slowdowns

Andrei Lepikhov digs into a performance issue:

A query executes in just 2 milliseconds, yet its planning phase takes 500 ms. The database is reasonably sized, the query involves 9 tables, and the default_statistics_target is set to only 500. Where does this discrepancy come from?

This question was recently raised on the pgsql-performance mailing list, and the investigation revealed a somewhat surprising culprit: the column statistics stored in PostgreSQL’s pg_statistic table.

Read on for Andrei’s analysis and some interesting thoughts on possible avenues for improvement.

Comments closed

Adaptive Joins and Large Memory Grants

Kendra Little re-creates a problem:

Adaptive joins let the optimizer choose between a Hash Join and a Nested Loop join at runtime, which can be fantastic for performance when row count estimates are variable. Recently, when Erik Darling taught two days on TSQL at PASS Community Data Summit, a student asked why a query plan where an adaptive join used a Nested Loop at runtime ended up with a large memory grant anyway.

I didn’t remember the answer to this, but the great thing about co-teaching is that Erik did: adaptive joins always start executing as Hash Joins, which means they have to get memory grants upfront. Even if the query ultimately switches to a Nested Loop at runtime, that memory grant was already allocated. This has real implications for memory usage, especially in high-concurrency environments.

Read on for a dive into adaptive joins, how they work, and the consequences when the database engine makes use of them.

Comments closed

When Wide Queries Become Slow in SQL Server

Kendra Little talks baggage:

I see this pattern repeatedly: a “wide” query that returns many columns and less than 100k rows runs slowly. SQL Server gets slow when it drags large amounts of baggage through the entire query plan, like a solo traveler struggling with massive suitcases in an airport instead of picking them up close to their destination.

SQL Server often minimizes data access by grabbing all the columns it needs early in query execution, then doing joins and filters. This means presentation columns get picked up early.

Read on to see the effects of this, as well as what you can do to mitigate the issue.

Comments closed