Query Tuning – Page 41

Fun with Function Rewrites

Published 2020-11-12 by Kevin Feasel

Erik Darling reminds me why I hate user-defined functions in SQL Server:

At 23 seconds, this is probably unacceptable. And this is on SQL Server 2019, too. The function inlining thing doesn’t quite help us, here.
One feature restriction is this, so we uh… Yeah.
The UDF does not contain aggregate functions being passed as parameters to a scalar UDF
But we’re probably good query tuners, and we know we can write inline functions.

Read the whole thing, as this is not always straightforward.

Adaptive Query Execution in Databricks

Published 2020-10-26 by Kevin Feasel

MaryAnn Xue and Allison Wang explain how Adaptive Query Execution works with Databricks:

One of the most important cost-based decisions made in the Spark optimizer is the selection of join strategies, which is based on the size estimation of the join relations. But since this estimation can go wrong in both directions, it can either result in a less efficient join strategy because of overestimation, or even worse, out-of-memory errors because of underestimation.
AQE offers a trouble-free solution here by switching to the faster broadcast hash join during execution time.

This is pretty similar to Adaptive Query Processing in SQL Server.
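
To make the idea concrete, here is a toy sketch in Python, not Spark's actual planner code. The function names are hypothetical, and the 10 MB threshold is an assumption chosen to match the documented default of spark.sql.autoBroadcastJoinThreshold. It shows a plan that starts as a sort-merge join because of an overestimate and flips to a broadcast hash join once the real size is known.

```python
# Assumption: 10 MB, chosen to mirror the documented default of
# spark.sql.autoBroadcastJoinThreshold. Everything else here is hypothetical.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


def choose_join_strategy(size_bytes: int,
                         threshold: int = BROADCAST_THRESHOLD_BYTES) -> str:
    """Pick a join strategy from the size of the smaller join relation."""
    if size_bytes <= threshold:
        return "broadcast_hash_join"
    return "sort_merge_join"


def replan(estimated_bytes: int, actual_bytes: int,
           threshold: int = BROADCAST_THRESHOLD_BYTES) -> tuple:
    """Return (strategy planned from the estimate, strategy after seeing real size)."""
    planned = choose_join_strategy(estimated_bytes, threshold)
    adapted = choose_join_strategy(actual_bytes, threshold)
    return planned, adapted


if __name__ == "__main__":
    # Overestimate: the optimizer guessed 200 MB, the relation was really 4 MB.
    print(replan(200 * 1024 * 1024, 4 * 1024 * 1024))
```

The same comparison run in the other direction (a small estimate, a large actual size) is the underestimation case the excerpt warns about.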

Optimizing Common Table Expressions

Published 2020-10-16 by Kevin Feasel

Itzik Ben-Gan continues a series on common table expressions:

If you’re wondering why not use a much simpler solution with a grouped query and a HAVING filter, it has to do with the density of the shipperid column. The Orders table has 1,000,000 orders, and the shipments of those orders were handled by five shippers, meaning that on average, each shipper handled 20% of the orders. The plan for a grouped query computing the maximum order date per shipper would scan all 1,000,000 rows, resulting in thousands of page reads. Indeed, if you highlight just the CTE’s inner query (we’ll call it Query 3) computing the maximum order date per shipper and check its execution plan, you will get the plan shown in Figure 3.

Read on for classic Itzik.

Finding the Most Costly Statement in a Stored Procedure

Published 2020-10-13 by Kevin Feasel

Grant Fritchey takes us through one method of figuring out which statement you’re waiting on when running a stored procedure:

A lot of stored procedures have multiple statements and determining the most costly statement in a given proc is a very common task. After all, you want to focus your time and efforts on fixing the things that cause you the most pain. You simply don’t have the time to tune every single statement in every single procedure. So, identifying the most costly statement is vital.
Happily, Extended Events are here to help.

Click through to see how you can use extended events to figure this out.

Diving Into the Window Spool Operator

Published 2020-10-08 by Kevin Feasel

Hugo Kornelis continues a series on execution plan operators:

The Window Spool operator is one of the four spool operators that SQL Server supports. Like other spool operators, it retains a copy of data it receives and can then return those rows as often as needed. The specific functionality of the Window Spool operator allows it to replay rows within a window, as defined in a ROWS or RANGE specification of an OVER clause.

Read on to see how these work, as well as a few differences from their spool brethren.
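
As a rough mental model only, the replay behavior can be imitated in a few lines of Python. This is not how SQL Server implements the operator; the function names are hypothetical, and the sketch covers just one frame shape, ROWS BETWEEN n PRECEDING AND CURRENT ROW. For every incoming row it keeps the rows of the current frame around and hands them back so an aggregate can be computed over them.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple


def replay_frames(rows: Iterable[int],
                  preceding: int) -> Iterator[Tuple[int, List[int]]]:
    """Yield each row with the rows in its frame:
    ROWS BETWEEN <preceding> PRECEDING AND CURRENT ROW."""
    # The deque plays the role of the spool: it retains the rows that
    # still belong to the frame and silently drops the ones that fell out.
    frame: deque = deque(maxlen=preceding + 1)
    for row in rows:
        frame.append(row)
        yield row, list(frame)


def moving_sum(rows: Iterable[int], preceding: int) -> List[int]:
    """Aggregate over each replayed frame, like SUM(...) OVER (ROWS ...)."""
    return [sum(frame) for _, frame in replay_frames(rows, preceding)]


if __name__ == "__main__":
    print(moving_sum([1, 2, 3, 4], preceding=2))
```

With the input 1, 2, 3, 4 and two preceding rows, the frames are [1], [1, 2], [1, 2, 3], and [2, 3, 4].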

Making Use of Sort Rewinds: Closest Match

Published 2020-10-08 by Kevin Feasel

Paul White follows up on an article:

In When Do SQL Server Sorts Rewind? I described how most sorts can only rewind when they contain at most one row. The exception is in-memory sorts, which can rewind at most 500 rows and 16KB of data.
These are certainly tight restrictions, but we can still make use of them on occasion.
To illustrate, I am going to reuse a demo Itzik Ben-Gan provided in part one of his Closest Match series, specifically solution 2 (modified value range and indexing).

Click through for the explanation.
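
The restriction quoted above is easy to misread, so here it is written out as a small Python predicate. This only encodes the sentence from the excerpt. It is not SQL Server logic; the function name and parameters are hypothetical, and I am assuming that "16KB" means 16 * 1024 bytes and that both limits must hold together.

```python
# Assumptions: 16KB is read as 16 * 1024 bytes, and the row and size
# limits for in-memory sorts apply together, as the excerpt words them.
MAX_IN_MEMORY_ROWS = 500
MAX_IN_MEMORY_BYTES = 16 * 1024


def sort_can_rewind(row_count: int, data_bytes: int, in_memory: bool) -> bool:
    """Return True if, per the quoted rule, the sort could rewind."""
    # Most sorts can only rewind when they hold at most one row.
    if row_count <= 1:
        return True
    # The exception: in-memory sorts, within the row and size limits.
    return (in_memory
            and row_count <= MAX_IN_MEMORY_ROWS
            and data_bytes <= MAX_IN_MEMORY_BYTES)


if __name__ == "__main__":
    print(sort_can_rewind(row_count=300, data_bytes=8 * 1024, in_memory=True))
    print(sort_can_rewind(row_count=300, data_bytes=8 * 1024, in_memory=False))
```

Note that "in-memory sort" is a specific kind of sort in Paul's article, so the in_memory flag stands in for whatever conditions he describes there.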

When SQL Server Sorts Can Rewind

Published 2020-10-07 by Kevin Feasel

Paul White turns back the hands of time:

Sorts use storage (memory and perhaps disk if they spill) so they do have a facility capable of storing rows between loop iterations. In particular, the sorted output can, in principle, be replayed (rewound).
Still, the short answer to the title question, “Do Sorts Rewind?” is:
Yes, but you won’t see it very often.

Read the whole thing.

Understanding MERGE Execution Plans

Published 2020-10-01 by Kevin Feasel

Hugo Kornelis walks us through the most interesting operator:

But first a word of warning. The MERGE statement, introduced in SQL Server 2008 as an easier alternative for “delete / update / insert” logic, turned out to have issues when it was released. And now, in 2020, many of those issues still exist. So I’ll just point you to Aaron Bertrand’s excellent overview, and leave you with the recommendation to be extremely wary before using MERGE in production code.
But here, we are not going to use MERGE in production. We are merely going to set up a simple test and look at how the elements in the execution plan cooperate to produce the expected results. This is interesting even if you never use MERGE, because many of the details explained below can also occur in other execution plans.

Read the whole thing, even if you avoid MERGE like the plague.

Capturing Execution Plans for Long-Running Queries

Published 2020-09-29 by Kevin Feasel

Grant Fritchey answers a question:

I love questions. Most of all, I love questions I can answer. I spotted this question recently: How can I use Profiler to capture execution plans for queries over a certain duration?
Oh, that’s easy. You don’t use Profiler. You use Extended Events.

Click through to see how.

Optimize for Unknown with Inline Table-Valued Functions

Published 2020-09-18 by Kevin Feasel

Koen Verbeeck hits on a strange case:

Turns out SQL Server used a plan with a hash join in the fast query, and a nested loops join in the slow query. Because SQL Server was also using wildly incorrect estimates, the nested loops join performs really poorly. This is quite similar to parameter sniffing with stored procedures. Erik Darling has written a great article about it: Inline Table Valued Functions: Parameter Snorting.
The thing is, in contrast to scalar functions or multi-statement table-valued functions, the iTVF should have better performance because it will be expanded into the calling query. This way, SQL Server can use “more correct” estimates and create a plan for each different parameter. Well, today was not that day.

Read on for details on how Koen performed troubleshooting and the solution.

Category: Query Tuning