Query Tuning – Page 35

The Magic of TOP(100)

Published 2020-12-08 by Kevin Feasel

Daniel Hutmatcher shows an interesting line of demarcation:

But there’s a built-in (undocumented) shortcut optimization in SQL Server. If you’re returning no more than 100 rows, the internal behaviour of the Top N Sort operator changes.

Read the whole thing.

Comments closed

Batch Mode with Temp Tables

Published 2020-12-07 by Kevin Feasel

Erik Darling continues receiving big paydays from Big Temp Table:

When you have queries that need to process a lot of data, and probably do some aggregations over that lot-of-data, batch mode is usually the thing you want.
Originally introduced to accompany column store indexes, it works by allowing CPUs to apply instructions to up to 900 rows at a time.
It’s a great thing to have in your corner when you’re tuning queries that do a lot of work, especially if you find yourself dealing with pesky parallel exchanges.

Read on to see how you can create a temp table which triggers batch mode processing fairly easily.

Comments closed

Parallel Inserts into Temp Tables

Published 2020-12-02 by Kevin Feasel

Erik Darling explains the pre-conditions for parallel insertion into temporary tables:

If you have a workload that uses #temp tables to stage intermediate results, and you probably do because you’re smart, it might be worth taking advantage of being able to insert into the #temp table in parallel.
Remember that you can’t insert into @table variables in parallel, unless you’re extra sneaky. Don’t start.
If your code is already using the SELECT ... INTO #some_table pattern, you’re probably already getting parallel inserts. But if you’re following the INSERT ... SELECT ... pattern, you’re probably not, and, well, that could be holding you back.

There are enough pre-conditions that this becomes a decision rather than an automatic. Especially if you’re dealing with temp tables with indexes and want to take advantage of temp table reuse, which I believe precludes changing the structure of the table (including adding indexes) after creation.

Comments closed

The Performance Cost of AT TIME ZONE

Published 2020-11-27 by Kevin Feasel

Erik Darling shows that AT TIME ZONE does not scale well when used in filters against columns:

Databases really do make you pay dearly for mistakes, and new linguistic functionality is not implemented with performance in mind.
I’ve written before about how to approach date math in where clauses: Where To Do Date Math In Your Where Clause
And it turns out that this lesson is brutally true if you need to pass time zones around, too.

Read the whole thing. In this respect, AT TIME ZONE is similar to pretty much all other date operators and functions.

Comments closed

Identifying Straggler Tasks in Spark Applications

Published 2020-11-25 by Kevin Feasel

Ajay Gupta clues us in on a process:

What Is a Straggler in a Spark Application?
A straggler refers to a very very slow executing Task belonging to a particular stage of a Spark application (Every stage in Spark is composed of one or more Tasks, each one computing a single partition out of the total partitions designated for the stage). A straggler Task takes an exceptionally high time for completion as compared to the median or average time taken by other tasks belonging to the same stage. There could be multiple stragglers in a Spark Job being present either in the same stage or across multiple stages.

Read on to understand the consequences and causes of these straggler tasks, as well as what you can do about them.

Comments closed

Making Life Harder for Filter Operators

Published 2020-11-24 by Kevin Feasel

Erik Darling has a how-not-to guide for us:

We looked at a couple examples of when SQL Server might need to filter out rows later in the plan than we’d like, and why that can cause performance issues.
Now it’s time to look at a few more examples, because a lot of people find them surprising.

Read on for several examples. Also, ~~because the bribes came through~~ I don’t mind shilling for Erik, check out 25 hours of recorded content for $100. I think Erik’s knowledge is worth at least $4 an hour. Maybe even $5.

Comments closed

Filter Operators in a Query

Published 2020-11-20 by Kevin Feasel

Erik Darling shares some introductory thoughts on the filter operator:

When we write queries that need to filter data, we tend to want to have that filtering happen as far over to the right in a query plan as possible. Ideally, data is filtered when we access the index.
Whether it’s a seek or a scan, or if it has a residual predicate, and if that’s all appropriate isn’t really the question.
In general, those outcomes are preferable to what happens when SQL Server is unable to do any of them for various reasons. The further over to the right in a query plan we can reduce the number of rows we need to contend with, the better.

Read on to see where Erik takes this and stay tuned for part 2.

Comments closed

How Scalar UDFs Slow Down Queries

Published 2020-11-19 by Kevin Feasel

Brent Ozar takes us through one big problem around scalar user-defined queries:

When your query has a scalar user-defined function in it, SQL Server may not parallelize it and may hide the work that it’s doing in your execution plan.
To show it, I’ll run a simple query against the Users table in the Stack Overflow database.

Read on for the demo, as well as what you can do about it.

Comments closed

Visualizing Analysis Services Tasks with the Job Graph

Published 2020-11-16 by Kevin Feasel

Chris Webb is excited:

More details about it, and how it can be used, are in the samples here:
https://github.com/microsoft/Analysis-Services/tree/master/ASJobGraphEvents
The data returned by the Job Graph event isn’t intelligible if you look at the text it returns in Profiler. However if you save a .trc file with Job Graph event data to XML you can use the Python scripts in the GitHub repo to generate DGML diagrams that can be viewed in Visual Studio, plus Gantt charts embedded in HTML. Of course to do this you’ll need to have Python installed; you’ll also need to have Visual Studio and its DGML editor installed (see here for details).

Read on to see how it looks and Chris’s thoughts on the matter.

Comments closed

Fun with Function Rewrites

Published 2020-11-12 by Kevin Feasel

Erik Darling reminds me why I hate user-defined functions in SQL Server:

At 23 seconds, this is probably unacceptable. And this is on SQL Server 2019, too. The function inlining thing doesn’t quite help us, here.
One feature restriction is this, so we uh… Yeah.
The UDF does not contain aggregate functions being passed as parameters to a scalar UDF
But we’re probably good query tuners, and we know we can write inline functions.

Read the whole thing, as this is not always straightforward.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Category: Query Tuning