Query Tuning – Page 35

One persistent idea is that tempdb is something to be avoided. Either because it was “slow” or to avoid contention.
Granted, if a query has been around long enough, these may have been valid concerns at some point. In general though, temp tables (the # kind, not the @ kind) can be quite useful when query tuning.

Erik is absolutely right in this post. Ceteris paribus, I’d rather not use tempdb directly, because I’d prefer one query over multiple queries. But once performance comes into question, working on smaller subsets of data one step at a time will typically give you at least an acceptable solution.

PFS Contention and Heaps

Published 2021-01-05 by Kevin Feasel

Uwe Ricken continues a series on heaps in SQL Server:

The PFS page “can” become a bottleneck for a heap if many data records are entered in the heap in the shortest possible time. How often the PFS page has to be updated depends mostly on the data record’s size to be saved.
This procedure does not apply to clustered indexes since data records in an index must ALWAYS be “sorted” into the data volume according to the defined index value. Therefore, the search for a “free” space is not carried out via the PFS page but via the value of the key attribute!

Read on for more detail.

When Self-Joins Beat Lookups

Published 2020-12-22 by Kevin Feasel

Erik Darling shows us an interesting scenario:

Whether the lookup is Key or RID depends on if the table has a clustered index, but that’s not entirely the point.
The point is that there’s no way for the optimizer to decide to defer the lookup until later in the plan, when it might be more opportune.

Read on for the situation and the solution.

MAX Type Variables in WHERE Clauses and Recompile

Published 2020-12-18 by Kevin Feasel

Erik Darling puts on his lab coat and goggles:

After blogging recently (maybe?) about filters, there was a Stack Exchange question about a performance issue when a variable was declared with a max type.
After looking at it for a minute, I realized that I had never actually checked to see if a recompile hint would allow the optimizer more freedom when dealing with them.

Read on for Erik’s findings.

The Merge Interval Operator

Published 2020-12-15 by Kevin Feasel

Hugo Kornelis looks at another execution plan operator:

The Merge Interval operator reads dynamic seek range specifications, checks to see if their specified ranges overlap, and if so combines the overlapping ranges into one new range.
One typical use case is for a query that uses multiple BETWEEN specifications, connected with OR. When these ranges overlap, they must be combined into a single range. This saves performance, but more important is that it prevents rows that satisfy both range specifications from being returned multiple times. When the boundaries of the BETWEEN are given as constants, the optimizer analyzes for overlaps and combines ranges if needed when compiling the query. But when the boundaries of the BETWEEN specifications are only known at run-time (variables, column references), the Merge Interval operator is used for this task.

Click through to see how it works.
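
The range-combining idea in the quote can be sketched in a few lines. This is only an illustration of merging overlapping ranges, not a description of how SQL Server implements the operator; the function name `merge_intervals`, the integer boundaries, and the treatment of ranges as inclusive (like BETWEEN) are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: an inclusive (low, high) pair,
# mirroring BETWEEN low AND high.
Range = Tuple[int, int]


def merge_intervals(ranges: List[Range]) -> List[Range]:
    """Combine overlapping inclusive ranges into non-overlapping ones."""
    # Drop empty ranges (low > high), which can match no rows.
    ordered = sorted(r for r in ranges if r[0] <= r[1])
    merged: List[Range] = []
    for low, high in ordered:
        if merged and low <= merged[-1][1]:
            # Overlaps the previous range: extend its upper bound if needed.
            prev_low, prev_high = merged[-1]
            merged[-1] = (prev_low, max(prev_high, high))
        else:
            merged.append((low, high))
    return merged


# Two overlapping ranges and one separate range, as might come from
# BETWEEN predicates joined by OR with boundaries known only at run time.
print(merge_intervals([(10, 20), (15, 30), (40, 50)]))
```

Because the first two ranges collapse into one, a row with a value of 17 would be located by a single seek range rather than by two, which is the duplicate-avoidance point the quote makes.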

Batch Mode with Window Functions and Parallelism

Published 2020-12-11 by Kevin Feasel

Erik Darling has a two-parter on how using batch mode processing when working with window functions can lead to better performance. Part 1 sets the stage:

If you ask people who tune queries why batch mode is often much more efficient with windowing functions, they’ll tell you about the window aggregate operator.
That’s all well and good, but there’s another, often sneaky limitation of fully row mode execution plans with windowing functions.
Let’s go take a look!

Part 2 identifies the culprit:

When queries go parallel, you want them to be fast. Sometimes they are, and it’s great.
Other times they’re slow, and you end up staring helplessly at a repartition streams operator.

Check out both of these posts.

The Magic of TOP(100)

Published 2020-12-08 by Kevin Feasel

Daniel Hutmacher shows an interesting line of demarcation:

But there’s a built-in (undocumented) shortcut optimization in SQL Server. If you’re returning no more than 100 rows, the internal behaviour of the Top N Sort operator changes.

Read the whole thing.

Batch Mode with Temp Tables

Published 2020-12-07 by Kevin Feasel

Erik Darling continues receiving big paydays from Big Temp Table:

When you have queries that need to process a lot of data, and probably do some aggregations over that lot-of-data, batch mode is usually the thing you want.
Originally introduced to accompany column store indexes, it works by allowing CPUs to apply instructions to up to 900 rows at a time.
It’s a great thing to have in your corner when you’re tuning queries that do a lot of work, especially if you find yourself dealing with pesky parallel exchanges.

Read on to see how you can create a temp table which triggers batch mode processing fairly easily.

Parallel Inserts into Temp Tables

Published 2020-12-02 by Kevin Feasel

Erik Darling explains the pre-conditions for parallel insertion into temporary tables:

If you have a workload that uses #temp tables to stage intermediate results, and you probably do because you’re smart, it might be worth taking advantage of being able to insert into the #temp table in parallel.
Remember that you can’t insert into @table variables in parallel, unless you’re extra sneaky. Don’t start.
If your code is already using the SELECT ... INTO #some_table pattern, you’re probably already getting parallel inserts. But if you’re following the INSERT ... SELECT ... pattern, you’re probably not, and, well, that could be holding you back.

There are enough pre-conditions that this becomes a deliberate decision rather than an automatic choice, especially if you’re dealing with temp tables with indexes and want to take advantage of temp table reuse, which I believe precludes changing the structure of the table (including adding indexes) after creation.

The Performance Cost of AT TIME ZONE

Published 2020-11-27 by Kevin Feasel

Erik Darling shows that AT TIME ZONE does not scale well when used in filters against columns:

Databases really do make you pay dearly for mistakes, and new linguistic functionality is not implemented with performance in mind.
I’ve written before about how to approach date math in where clauses: Where To Do Date Math In Your Where Clause
And it turns out that this lesson is brutally true if you need to pass time zones around, too.

Read the whole thing. In this respect, AT TIME ZONE behaves like pretty much every other date operator and function applied to a column in a filter.
