Press "Enter" to skip to content

Category: Query Tuning

Getting An Accurate Query Execution Time

Grant Fritchey shares some tips on accurately measuring query execution time:

Before we get into all the choices and compare them, let’s baseline on methodology and a query to use.

Not sure why, but many people give me blowback when I say “on average, this query runs in X amount of time.” The feedback goes “You can’t say that. What if it was just blocking or resources or…” I get it. Running a query one time, changing something, running that query again, and declaring the problem solved is not what I’m suggesting. Notice the key word and trick phrase “on average.” I don’t run the query once. I run it several times, capture them all, then get the average of the durations.

The observer effect is in full force with a couple of the techniques Grant shows, but the rest are generally stable, which is a good thing.
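
Grant compares several capture techniques in the post itself; as one minimal sketch of the run-it-several-times-and-average idea, you can read cumulative figures out of sys.dm_exec_query_stats after repeated executions (the LIKE filter is a placeholder you would adapt to identify your test query):

    -- A minimal sketch: after running the query under test several times,
    -- pull the cumulative stats and compute the average duration and CPU time.
    -- The LIKE filter is a placeholder for identifying your test query.
    SELECT
        st.text                                    AS query_text,
        qs.execution_count,
        qs.total_elapsed_time / qs.execution_count AS avg_elapsed_microseconds,
        qs.total_worker_time  / qs.execution_count AS avg_cpu_microseconds
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    WHERE st.text LIKE N'%YourTestQuery%';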


Ways To Check For Non-Existence

Brent Ozar shows two methods for finding records missing associated child records:

You’re writing a query, and you wanna check to see if rows exist in a table.

I’m using the free Stack Overflow database, and I wanna find all of the users who have not left a comment. The tables involved are:

  • In dbo.Users, the Id field uniquely identifies a user.
  • In dbo.Comments, there’s a UserId field that links to who left the comment.

A quick way to write it is:
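
(The code sample did not survive this excerpt; based on the description in the next paragraph of a correlated SELECT * FROM dbo.Comments check per user, the query is presumably along these lines:)

    -- Presumed shape of the query described below: a NOT EXISTS check
    -- against dbo.Comments for each row in dbo.Users.
    SELECT *
    FROM dbo.Users AS u
    WHERE NOT EXISTS
    (
        SELECT *
        FROM dbo.Comments AS c
        WHERE c.UserId = u.Id
    );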

And this works fine. When you read the query, you might think SQL Server would run that SELECT * FROM dbo.Comments query for every single row of the Users table – but it’s way smarter than that, bucko. It scans the Comments index first because it’s much larger, and then joins that to the Users table.

But this isn’t the only way to query these tables, and Brent shows how to tell which method works better.


Grouping And Aggregating: Optimizing The Optimizer

Itzik Ben-Gan shows an example of how you can nudge the SQL Server optimizer to the right answer by rewriting a query:

As you can see, the groups are obtained by scanning the index on the groups table, and the aggregate is obtained by applying a seek in the index on the main table. The higher the density of the grouping set, the more optimal this plan is compared to the default strategy for the grouped query.

Just like we did earlier for the default scan strategy, let’s estimate the number of logical reads and plan cost for the seeks strategy. The estimated number of logical reads is the number of reads for the single execution of the Index Scan operator that retrieves the groups, plus the reads for all of the executions of the Index Seek operator.

The estimated number of logical reads for the Index Scan operator is negligible compared to the seeks; still, it’s CEILING(1e0 * @numgroups / @rowsperpage). Take Query 4 as an example; say the index idx_sid fits about 600 rows per leaf page (actual number depends on actual shipperid values since the datatype is VARCHAR(5)). With 5 groups, all rows fit in a single leaf page. If you had 5,000 groups, they would fit in 9 pages.
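
The arithmetic quoted above works out as follows (the roughly 600 rows per leaf page is the assumption from the example, not a fixed property of the index):

    -- Logical reads for the Index Scan that retrieves the groups,
    -- per the CEILING formula above, with about 600 rows per leaf page.
    DECLARE @rowsperpage FLOAT = 600;

    SELECT CEILING(1e0 * 5    / @rowsperpage) AS pages_for_5_groups,      -- = 1
           CEILING(1e0 * 5000 / @rowsperpage) AS pages_for_5000_groups;   -- = 9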

Plus some love for the APPLY operator.  Read the whole thing.


Betteridge’s Law And Index Hints

Bert Wagner asks a question in his title, Should You Use Index Hints?  Those familiar with Betteridge’s Law of Headlines know the general answer already:

One way to “fix” a poor performing plan is to use an index hint.  While we normally have no control over how SQL Server retrieves the data we requested, an index hint forces the query optimizer to use the index specified in the hint to retrieve the data (hence, it’s really more of a “command” than a “hint”).

Sometimes when I feel like I’m losing control I like using an index hint to show SQL Server who’s boss.  I occasionally will also use index hints when debugging poor performing queries because it allows me to confirm whether using an alternate index would improve performance without having to overhaul my code or change any other settings.

About the only place I consistently use index hints is with filtered indexes, where the combination of parameter sniffing and inexactitude in filters will convince the optimizer that the filtered index isn’t helpful when it really is.
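
As a hypothetical sketch of that pattern (the table, index, and filter here are made up for illustration): create a filtered index, then hint it in queries whose predicate is known to satisfy the filter.

    -- Hypothetical filtered index that the optimizer may ignore when a
    -- parameter-sniffed plan has to remain valid for any Status value.
    CREATE INDEX IX_Orders_OpenOnly
        ON dbo.Orders (CustomerId, OrderDate)
        WHERE Status = 'Open';
    GO

    -- Forcing the filtered index with a hint when the predicate is known
    -- to match the index filter.
    DECLARE @CustomerId INT = 42;

    SELECT CustomerId, OrderDate
    FROM dbo.Orders WITH (INDEX (IX_Orders_OpenOnly))
    WHERE Status = 'Open'
      AND CustomerId = @CustomerId;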


When Using DBCC DROPCLEANBUFFERS

Dan Guzman shares words of wisdom with using DBCC DROPCLEANBUFFERS for testing query performance in SQL Server:

One can make the argument that DBCC DROPCLEANBUFFERS might not be particularly valuable for testing. First, the storage engine in SQL Server Enterprise Edition (or Developer Edition, which is often used when testing) behaves differently with a cold cache versus a warm one. With a warm cache, a page not already in cache (e.g. index seek by primary key) will be fetched from disk using a single 8K page IO request as one expects. However, when the cache isn’t fully warmed up (Buffer Manager’s Target Pages not yet met), the entire 64K extent (8 contiguous 8K pages) is read for the single page request regardless of whether the adjacent pages are actually needed by the query. This has the benefit of warming the cache much more quickly than would otherwise occur, but given that the normal steady state of a production SQL Server is a warm cache, testing with a cold cache isn’t a fair comparison of different plans. More data than normal will be transferred from storage so timings may not be indicative of actual performance.

I don’t think I agree 100% with that argument, but I am sympathetic to it.  Still, Dan has great advice in this post.
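
For reference, when you do run cold-cache tests on a dev or test box, the usual pattern is to flush dirty pages first so the subsequent drop actually empties the buffer pool:

    -- Test/dev environments only: write dirty pages to disk, then remove
    -- clean pages from the buffer pool so the next run starts with a cold cache.
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;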


Non-Blocking Aggregations

Daniel Hutmacher tilts at windmills:

It’s not entirely uncommon to want to group by a computed expression in an aggregation query. The trouble is, whenever you group by a computed expression, SQL Server considers the ordering of the data to be lost, and this will turn your buttery-smooth Stream Aggregate operation into a Hash Match (aggregate) or create a corrective Sort operation, both of which are blocking.

Is there anything we can do about this? Yes, sometimes, like when those computed expressions are YEAR() and MONTH(), there is. But you should probably get your nerd on for this one.

There are many ways to solve a problem, and sometimes the best method is indirect.
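
As a minimal illustration of the pattern Daniel is describing (this is the problem, not his rewrite, and the table is hypothetical): grouping by YEAR() and MONTH() hides the ordering of an index on the date column, so the plan falls back to a Hash Match (aggregate) or adds a Sort.

    -- Even with an index on SaleDate, grouping by computed expressions
    -- loses the ordering guarantee, so no Stream Aggregate without a Sort.
    SELECT YEAR(SaleDate)  AS SaleYear,
           MONTH(SaleDate) AS SaleMonth,
           SUM(Amount)     AS TotalAmount
    FROM dbo.Sales
    GROUP BY YEAR(SaleDate), MONTH(SaleDate);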


When Stream Aggregates Require Sorting

Itzik Ben-Gan continues his series on grouping and aggregation:

Let’s try and figure out the costing formula for the Sort operator. Remember, our focus is the estimated cost and scaling because our ultimate goal is to figure out optimization thresholds where the optimizer changes its choices from one strategy to another.

The I/O cost estimate seems to be fixed: 0.0112613. I get the same I/O cost irrespective of factors like number of rows, number of sort columns, data type, and so on. This is probably to account for some anticipated I/O work.

As for the CPU cost, alas, Microsoft doesn’t publicly expose the exact algorithms that they use for sorting. However, among the common algorithms used for sorting by database engines in general are different implementations of merge sort and quicksort. Thanks to efforts made by Paul White, who’s fond of looking at Windows debugger stack traces (not all of us have the stomach for this), we have a bit more insight into the topic, published in his series “Internals of the Seven SQL Server Sorts.” According to Paul’s findings, the general sort class (used in the above plan) uses merge sort (first internal, then transitioning to external). On average, this algorithm requires n log n comparisons to sort n items. With this in mind, it’s probably a safe bet as a starting point to assume that the CPU part of the operator’s cost is based on a formula such as:

Operator CPU cost = <startup cost> + @numrows * LOG(@numrows) * <comparison cost>

Of course, this could be an oversimplification of the actual costing formula that Microsoft uses, but absent any documentation on the matter, this is an initial best guess.
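
Plugging placeholder constants into that shape shows the scaling; the constants below are hypothetical, not the optimizer's real numbers.

    -- Sketch of the formula's shape only; @startup_cost and @comparison_cost
    -- are placeholder values, not Microsoft's actual costing constants.
    DECLARE @startup_cost    FLOAT = 0.000038;
    DECLARE @comparison_cost FLOAT = 0.0000011;

    SELECT v.n AS numrows,
           @startup_cost + v.n * LOG(v.n) * @comparison_cost AS estimated_cpu_cost
    FROM (VALUES (1e3), (1e4), (1e5), (1e6)) AS v(n);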

It’s interesting to see how the calculation changes in form with larger numbers of rows.


The Bitmap Operator

Hugo Kornelis explains what the Bitmap operator is in an execution plan:

As implied by its logical operation, “Create Bitmap”, the Bitmap operator creates a bitmap. (I assume that this is the only logical operation that the Bitmap operator supports, since I have never seen a Bitmap operator with a different logical operation). A bitmap is a structure that stores Boolean values for a consecutive range of values in a small amount of memory. E.g. the range from 1 to 8000 has 8000 possible values. These can be represented as 8000 bits in just 1000 bytes. For each value, the location of the corresponding bit can be computed by dividing the value by 8; the quotient is the location and the remainder determines which of the bits to use. The value of that specific bit can then be tested, or it can be set to false (zero) or true (one).

The bitmap is named Opt_Bitmap1005 in this case. This name is not exposed in the quick property popup, but you can find it in the full property sheet as shown here, in the Defined Values property. You will also note that this bitmap is not included in the Output List property. That’s because this bitmap is not created for each individual row; there is a single bitmap that accumulates information from all rows. Other operators in the execution plan can reference this bitmap by name, even though it is not passed to them in their input rows.
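
The location arithmetic from the first quoted paragraph boils down to integer division and a remainder; a quick sketch:

    -- For a value in the range described above, find which 8-bit slot holds
    -- its flag (the quotient) and which bit within that slot to test (the remainder).
    DECLARE @value INT = 1234;

    SELECT @value / 8 AS bit_location,
           @value % 8 AS bit_within_location;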

Hugo goes into great detail on the operator, so you’ll want to set aside some time to read this.
