Press "Enter" to skip to content

Category: Query Tuning

Helper Predicates And Multi-Column Filters

Rob Farley has an interesting post on optimizing a lookup when you have separate date and time columns:

Here we see a Seek Predicate that looks for OrderDate values between two values that have been worked out elsewhere in the plan, but creating a range in which the right values must exist. This isn’t >= 20110805 00:00 and < 20110806 00:00 (which is what I would’ve made it), it’s something else. The value for start of this range must be smaller than 20110805 00:00, because it’s >, not >=. All we can really say is that when someone within Microsoft implemented how the QO should respond to this kind of predicate, they gave it enough information to come up with what I call a “helper predicate.”

Now, I would love Microsoft to make more functions sargable, but that particular request was Closed long before they retired Connect.

But maybe what I mean is for them to make more helper predicates.

The problem with helper predicates is that they almost certainly read more rows than you want. But it’s still way better than looking through the whole index.

Read the whole thing.

Comments closed

Performance Concern: Anti-Join And Top

Paul White explains a scenario in which an innocent-looking execution plan can hide something sinister:

Not every execution plan containing an apply anti join with a Top (1) operator on its inner side will be problematic. Nevertheless, it is a pattern to recognise and one which almost always requires further investigation.

The four main elements to look out for are:

  • A correlated nested loops (apply) anti join

  • Top (1) operator immediately on the inner side

  • A significant number of rows on the outer input (so the inner side will be run many times)

  • potentially expensive subtree below the Top

Read the whole thing.  This is a great way to wrap up the series.

Comments closed

Character Columns And MAX Vs TOP+ORDER Differences

Kendra Little digs into a tricky performance problem:

Most of the time in SQL Server, the MAX() function and a TOP(1) ORDER BY DESC will behave very similarly.

If you give them a rowstore index leading on the column in question, they’re generally smart enough to go to the correct end of the index, and — BOOP! — just pluck out the data you need without doing a big scan.

I got an email recently about a case when SQL Server was not smart enough to do this with MAX() — but it was doing just fine with a TOP(1) ORDER BY DESC combo.

The question was: what’s the problem with this MAX?

It took me a while to figure it out, but I finally got to the bottom of the case of the slow MAX.

Great story and very interesting sleuthing work on Kendra’s part.

Comments closed

Row Goals And Anti-Joins

Paul White continues his row goals series:

The optimizer assumes that people write a semi join (indirectly e.g. using EXISTS) with the expectation that the row being searched for will be found. An apply semi join row goal is set by the optimizer to help find that expected matching row quickly.

For anti join (expressed e.g. using NOT EXISTS) the optimizer’s assumption is that a matching row will not be found. An apply anti join row goal is not set by the optimizer, because it expects to have to check all rows to confirm there is no match.

If there does turn out to be a matching row, the apply anti join might take longer to locate this row than it would if a row goal had been used. Nevertheless, the anti join will still terminate its search as soon as the (unexpected) match is encountered.

Another very interesting part of the series and well worth the time to read.

Comments closed

Row Goals On Anti-Joins

Paul White continues his row goals series:

The optimizer assumes that people write a semi join (indirectly e.g. using EXISTS) with the expectation that the row being searched for will be found. An apply semi join row goal is set by the optimizer to help find that expected matching row quickly.

For anti join (expressed e.g. using NOT EXISTS) the optimizer’s assumption is that a matching row will not be found. An apply anti join row goal is not set by the optimizer, because it expects to have to check all rows to confirm there is no match.

If there does turn out to be a matching row, the apply anti join might take longer to locate this row than it would if a row goal had been used. Nevertheless, the anti join will still terminate its search as soon as the (unexpected) match is encountered.

This is a shorter article but very useful in understanding row goals, along with the rest of his series.

Comments closed

When UNION ALL Can Beat OR

Bert Wagner compares a couple methods for writing a query:

Suddenly those key-lookups become too expensive for SQL Server and the query optimizer thinks it’ll be faster to just scan the entire clustered index.

In general this makes sense; SQL Server tries to pick plans that are good enough in most scenarios, and in general I think it chooses wisely.

However, sometimes SQL Server doesn’t pick great plans. Sometimes the plans it picks are downright terrible.

If that particular topic is interesting, I’ve a blog post from a few years back on a similar vein.

Comments closed

Automatic Tuning In SQL Server 2017

Arun Sirpal shows off one of the more interesting features in SQL Server 2017:

Before we begin any further let’s do a little recap. Automatic tuning in SQL Server 2017 notifies you whenever a potential performance issue is detected, and lets you apply corrective actions, or lets the Database Engine automatically fix performance problems, this is also available in Azure SQL Database.

There are 2 parts to it, automatic plan correction and automatic index management, for SQL Server 2017, automatic index management it IS NOT part of the product.

To switch automatic plan correction on you will need to run the following code against your database.

I’m looking forward to seeing this expand much further.

Comments closed

Units Of Measure In The ShowPlan Schema

Grant Fritchey shows off the ShowPlan Schema:

Because the showplan schema contains notes throughout stating what the units of measure are, what each of the values means. For instance, I can explain why SerialDesiredMemory, DesiredMemory, RequestedMemory are identical:

…Provide memory grant estimate as well as actual runtime memory grant information. Serial required/desired memory attributes are estimated during query compile time for serial execution. The rest of attributes provide estimates and counters for query execution time considering actual degree of parallelism. SerialRequiredMemory: Required memory in KB if the query runs in serial mode. The query will not start without this memory. SerialDesiredMemory: Memory estimated to fit intermediate results in KB if the query runs in serial mode. RequiredMemory: Required memory in KB for the chosen degree of parallelism. If the query runs in serial mode, this is the same as SerialRequiredMemory. …

That’s taken directly from the 2017 schema. The units of measure are KB.

I’d never seen this before, so that’s going on my to-read list.

Comments closed

When AT TIME ZONE Is Too Slow

Robert Davis troubleshoots a performance problem relating to time zones:

Time Zones were definitely being a drag today. I got an email from one of the developers at work asking about the performance difference between 2 queries. The only difference between the 2 queries is that one of them uses the AT TIME ZONE clause that was added in SQL Server 2016. I have not played around with this particular clause, but we do store quite a bit of data in the datetimeoffset data type. In the table in the developer’s queries, the data is all stored in the Eastern time zone, but they are considering storing it in additional time zones and will want to be able to display it in the Eastern time zone even if not stored that way. Thus, AT TIME ZONE.

When the developer was testing the conversion function, he noticed that the query slowed waaaayyyyy down when he added AT TIME ZONE. Before adding AT TIME ZONE to the query, STATISTICS TIME for the query was: CPU time: 145549 ms, elapsed time: 21693 ms.. It returned 8,996 rows, but if I removed the DISTINCT, it returned over 72M rows. That’s a lot of clams … er, data.

Read on for the rest of the story, including Robert’s solution.  Also check out his Connect item related to this.

Comments closed

When Join Order Matters

Bert Wagner takes a look at one of the lesser appreciated tricks in performance tuning:

I had a great question submitted to me (thank you Brandman!) that I thought would make for a good blog post:

…I’ve been wondering if it really matters from a performance standpoint where I start my queries. For example, if I join from A-B-C, would I be better off starting at table B and then going to A & C?

The short answer: Yes. And no.

One of my favorite query tuning books is SQL Tuning by Dan Tow.  Parts of it are rather dated at this point—like pretty much anything involving a rule-based optimizer—but the gist still works well.  What it comes down to is finding the best single table from which to drive your query (based on table size, filters, etc.) and selecting the appropriate join order afterward.  It’s fairly time-consuming effort, but for the 0.5-1% of queries which really need it, it can be the difference between an awful plan and a good one.

Comments closed