Category: Query Tuning

Binary Search for Chronological Records in SQL Server

Andy Brownsword performs several probes:

Specifically we’ll use a binary search approach to narrow the search range. We abuse the correlation between the clustering key and timestamp to zero in on the records, using the key for navigation, and the timestamp to guide us.

We’ll start with the first and last records as boundaries, followed by checking the timestamp at the mid-point. Depending on whether the timestamp is before or after our target point in time, the appropriate boundary is moved. This halves the key space, and the search repeats until we’ve narrowed the range sufficiently to scan a very small portion of records.

It’s a neat idea, though do watch for Andy’s warning at the end.
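
To make the shape of the technique concrete, here is a minimal T-SQL sketch of the idea, not Andy's actual code: it assumes a hypothetical dbo.Events table clustered on an ever-increasing Id key, with a CreatedAt timestamp that rises roughly in step with it.

    DECLARE @Target datetime2 = '2025-01-01';
    DECLARE @Lo bigint, @Hi bigint, @Mid bigint, @MidTime datetime2;

    -- Boundaries: the first and last keys in the table
    SELECT @Lo = MIN(Id), @Hi = MAX(Id) FROM dbo.Events;

    -- Halve the key space until the remaining range is small enough to scan
    WHILE @Hi - @Lo > 1000
    BEGIN
        SET @Mid = (@Lo + @Hi) / 2;

        -- Probe: the timestamp at (or just after) the mid-point key
        SELECT TOP (1) @MidTime = CreatedAt
        FROM dbo.Events
        WHERE Id >= @Mid
        ORDER BY Id;

        IF @MidTime < @Target
            SET @Lo = @Mid;  -- target lies in the upper half
        ELSE
            SET @Hi = @Mid;  -- target lies in the lower half
    END;

    -- Final, cheap scan of the narrowed key range
    SELECT *
    FROM dbo.Events
    WHERE Id BETWEEN @Lo AND @Hi
      AND CreatedAt >= @Target;

Each probe is a single seek on the clustering key, so even a very large table needs only a few dozen iterations before the range is small enough to scan.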

Making Row-Level Security Faster

Brent Ozar speeds up some operations:

The official Azure SQL Dev’s Corner blog recently wrote about how to enable soft deletes in Azure SQL using row-level security, and it’s a nice, clean, short tutorial. I like posts like that because the feature is pretty cool and accomplishes a real business goal. It’s always tough deciding where to draw the line on how much to include in a blog post, so I forgive them for not including one vital caveat with this feature.

Click through for that caveat, as well as how you can mitigate its performance impact.
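
For reference, the basic shape of the pattern is a row-level security filter predicate that quietly hides soft-deleted rows. This is a sketch assuming a hypothetical dbo.Orders table with an IsDeleted flag, not the exact code from either post:

    -- Inline table-valued function used as the filter predicate
    CREATE FUNCTION dbo.fn_FilterSoftDeleted (@IsDeleted bit)
    RETURNS TABLE
    WITH SCHEMABINDING
    AS
    RETURN SELECT 1 AS allowed WHERE @IsDeleted = 0;
    GO

    -- The security policy applies the predicate to every query against the table
    CREATE SECURITY POLICY dbo.SoftDeletePolicy
        ADD FILTER PREDICATE dbo.fn_FilterSoftDeleted(IsDeleted) ON dbo.Orders
    WITH (STATE = ON);

Because the predicate applies to every query against the table, any cost it carries is paid on every read, not just the reads that care about deleted rows.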

Testing Implicit Conversion and Performance in SQL Server

Louis Davidson runs some tests:

If you have ever done any performance tuning of queries in SQL Server, no doubt one of the first things you have heard is that your search argument data types need to match the columns that you are querying. Not one thing in this blog is going to dispute that. Again, the BEST case is that if your column is an nvarchar, your search string matches that column datatype. But why is this? I will do my best to make this pretty clear, especially why it doesn’t always matter.

Read on as Louis lays out the explanation.
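
The direction of the conversion is the crux. A quick sketch against a hypothetical table (the full details of when it does and does not hurt are in Louis's post):

    -- dbo.Customers has Name nvarchar(100), indexed

    -- Exact type match: a clean index seek
    SELECT CustomerId FROM dbo.Customers WHERE Name = N'Smith';

    -- varchar literal against an nvarchar column: the LITERAL is converted
    -- up (varchar -> nvarchar), the column is untouched, and a seek is
    -- still possible
    SELECT CustomerId FROM dbo.Customers WHERE Name = 'Smith';

    -- The painful direction is the reverse: an nvarchar value against a
    -- varchar column converts the COLUMN, which can turn a seek into a scan
    -- (or, at best, a range seek under Windows collations)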

Performance Studio

Erik Darling has a new free tool:

Stop clicking through SSMS execution plans like it’s 2005.
Performance Studio is a free, open-source plan analyzer that tells you what’s wrong,
where it’s wrong, and how bad it is — from the command line, a desktop GUI,
an SSMS extension, or an AI assistant.

Built by someone who has stared at more execution plans than any reasonable person should.

Click through for some of its capabilities, as well as how to get your hands on a copy.

Query Tuning and Premature Optimization

Denny Cherry shares some advice:

This runs about as inconsistently as you would expect, given that it’s the same plan every time, no matter what values are being passed in. Getting this to perform better and consistently requires some dynamic SQL changes that look similar to the following.

Denny’s scenario is a very common one: as developers, we don’t know which access paths users will take, so we try to develop generic solutions that can cover a wide variety of scenarios. In practice, users land on a certain set of access patterns, and now we have actual queries we can ensure work as well as possible. Except for the parts where we painted ourselves into a corner with the original generic design. But hey, that’s what the imagined rebuild that will never happen can solve.
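
The usual escape hatch is the one Denny shows: conditionally built, parameterized dynamic SQL, so each combination of search criteria compiles to its own plan. A generic sketch of the pattern, not Denny's code, with hypothetical names:

    CREATE OR ALTER PROCEDURE dbo.SearchOrders
        @CustomerId int = NULL,
        @OrderDate  date = NULL
    AS
    BEGIN
        DECLARE @sql nvarchar(max) = N'
            SELECT OrderId, CustomerId, OrderDate
            FROM dbo.Orders
            WHERE 1 = 1';

        -- Append only the predicates the caller actually supplied
        IF @CustomerId IS NOT NULL
            SET @sql += N' AND CustomerId = @CustomerId';
        IF @OrderDate IS NOT NULL
            SET @sql += N' AND OrderDate = @OrderDate';

        -- Values stay parameterized, so plans are reused per search shape
        EXEC sys.sp_executesql @sql,
            N'@CustomerId int, @OrderDate date',
            @CustomerId = @CustomerId, @OrderDate = @OrderDate;
    END;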

JSONB Data in Postgres and Performance Due to TOAST

Paul Ramsey lays out the facts and the data:

Working with APIs and arrays in the jsonb type has become increasingly popular recently, and storing pieces of application data using jsonb has become a common design pattern.

But why shred a JSON object into rows and columns and then rehydrate it later to send it back to the client?

The answer is efficiency. Postgres is most efficient when working with rows and columns, and hiding data structure inside JSON makes it difficult for the engine to go as fast as it might.

Read on to learn how Postgres manages to store arbitrary-sized JSONB data within the limitations of 8KB pages, and the performance implications of doing so.
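
If you want to see TOAST at work on your own data, a couple of quick probes help, assuming a hypothetical docs table with a jsonb column named body:

    -- Stored size vs. logical size; values over roughly 2 KB get compressed
    -- and/or moved out of line into the TOAST table
    SELECT id,
           pg_column_size(body)     AS stored_bytes,
           octet_length(body::text) AS logical_bytes
    FROM docs
    ORDER BY stored_bytes DESC
    LIMIT 10;

    -- The hidden TOAST relation backing the table
    SELECT reltoastrelid::regclass FROM pg_class WHERE relname = 'docs';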

Performance Tuning Dependent SQL Queries in DirectQuery Mode

Chris Webb tries a change:

As I described here, Power BI can send SQL queries in parallel in DirectQuery mode and you can see from the Timeline column there is some parallelism happening here – the last two SQL queries generated by the DAX query run at the same time – but everything has to wait for that first SQL query to complete. Why? Can this be tuned?

Click through for an example. I was thinking about how challenging it would be to improve this performance at the SQL query level, and whether you could build a single query that operates over all three sets of data (distinct customers, distinct customers on Mondays, distinct customers in Januaries) while still performing acceptably. I’m not sure that the variants I sketched out in my head would actually perform faster, thanks to the “distinct” requirements.
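
For the record, the single-pass variant I had in mind was conditional aggregation over a hypothetical fact table, something like:

    SELECT
        COUNT(DISTINCT CustomerKey) AS AllCustomers,
        -- DATENAME is language-dependent; shown here for readability
        COUNT(DISTINCT CASE WHEN DATENAME(WEEKDAY, OrderDate) = 'Monday'
                            THEN CustomerKey END) AS MondayCustomers,
        COUNT(DISTINCT CASE WHEN MONTH(OrderDate) = 1
                            THEN CustomerKey END) AS JanuaryCustomers
    FROM dbo.FactSales;

Each DISTINCT still forces its own de-duplication of the key, which is why I doubt one pass would beat three well-parallelized queries.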

Read Efficiency in PostgreSQL Queries

Michael Christofides explains what’s happening under the covers:

A lot of the time in database land, our queries are I/O constrained. As such, performance work often involves reducing the number of page reads. Indexes are a prime example, but they don’t solve every issue (a couple of which we’ll now explore).

The way Postgres handles consistency while serving concurrent queries is by maintaining multiple row versions in both the main part of a table (the “heap”) as well as in the indexes (docs). Old row versions take up space, at least until they are no longer needed, and the space can be reused. This extra space is commonly referred to as “bloat”. Below we’ll look into both heap bloat and index bloat, how they can affect query performance, and what you can do to both prevent and respond to issues.

Read on for a detailed explanation.
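
Two quick ways to spot the problem Michael describes, assuming a hypothetical orders table:

    -- BUFFERS counts the 8 KB pages a query touched; bloat shows up as far
    -- more pages read than the returned row count would justify
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT * FROM orders WHERE customer_id = 42;

    -- Dead-tuple counts and vacuum history hint at heap bloat
    SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'orders';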

Row Own-Goals

Hugo Kornelis didn’t come up with quite as good a title:

In part 1 of this mini-series, I explained what a rowgoal is and how it works to optimize a query with a TOP or FETCH expression. Part 2 then showed a few less obvious other cases where the optimizer might introduce rowgoals. In all cases so far, those rowgoals were beneficial. They helped the optimizer come up with the best execution plan for the number of rows requested.

Click through for the video.
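
As a one-line refresher before the video: a TOP (or FETCH) sets a row goal, which tells the optimizer to cost the plan as if only those first rows will ever be needed, and when that assumption misfires it can be switched off per query. A sketch with hypothetical tables:

    -- The row goal: cost the plan as if only 10 rows will ever be fetched
    SELECT TOP (10) o.OrderId, c.Name
    FROM dbo.Orders o
    JOIN dbo.Customers c ON c.CustomerId = o.CustomerId
    ORDER BY o.OrderDate DESC;

    -- When the row-goal assumption backfires, disable it for the one query
    SELECT TOP (10) o.OrderId, c.Name
    FROM dbo.Orders o
    JOIN dbo.Customers c ON c.CustomerId = o.CustomerId
    ORDER BY o.OrderDate DESC
    OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL'));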

Table Statistics and Planning Slowdowns

Andrei Lepikhov digs into a performance issue:

A query executes in just 2 milliseconds, yet its planning phase takes 500 ms. The database is reasonably sized, the query involves 9 tables, and the default_statistics_target is set to only 500. Where does this discrepancy come from?

This question was recently raised on the pgsql-performance mailing list, and the investigation revealed a somewhat surprising culprit: the column statistics stored in PostgreSQL’s pg_statistic table.

Read on for Andrei’s analysis and some interesting thoughts on possible avenues for improvement.
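
Two handles on this class of problem, with hypothetical table names: EXPLAIN ANALYZE reports planning time separately from execution time, and the per-column statistics target controls how large the pg_statistic entries the planner must read can grow.

    -- Planning Time and Execution Time are reported separately, which is how
    -- a 2 ms query with 500 ms of planning reveals itself
    EXPLAIN (ANALYZE)
    SELECT o.*
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.created_at > now() - interval '1 day';

    -- Dialing a column's statistics target down shrinks its pg_statistic row
    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 100;
    ANALYZE orders;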
