
Category: Query Tuning

Tracking Performance of Queries which use RECOMPILE Hints

Brent Ozar has some tips if you use RECOMPILE hints frequently:

The first query’s plan stuck around in memory, so it now shows 2 executions, and 2 total rows returned. Its row metrics are correct through the life of the stored procedure’s time in cache.

However, the second query – the one with the recompile hint – has a brand new plan in the cache, and brand new metrics as well. You’re not just recompiling the execution plan; you’re also not accumulating query plan metrics here. (That’s fine, and that part I was also kinda aware of.)

But the part that I keep forgetting is that when I’m looking at the stored procedure’s totals in sp_BlitzCache, the total, min, and max values are useless:

If the plan cache isn’t going to help, what will? Brent tells you exactly what.
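
For context, here is a minimal sketch of the pattern in question (hypothetical procedure and table names) plus one cache-independent place that does keep history: Query Store, assuming it’s enabled on the database.

```sql
-- Hypothetical procedure with a statement-level recompile hint
CREATE OR ALTER PROCEDURE dbo.GetUserMemberships
    @UserId INT
AS
BEGIN
    SELECT m.MembershipId, m.StartDate
    FROM dbo.Memberships AS m
    WHERE m.UserId = @UserId
    OPTION (RECOMPILE);   -- the plan (and its cached metrics) gets thrown away
END;
GO

-- Query Store keeps aggregated runtime stats even when the plan cache does not
SELECT qt.query_sql_text,
       rs.count_executions,
       rs.avg_duration,
       rs.last_execution_time
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
WHERE qt.query_sql_text LIKE N'%Memberships%';
```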


ASYNC_NETWORK_IO and Execution Plans

Jonathan Kehayias dives into an interesting problem:

A few weeks ago, an interesting question was asked on the #SQLHelp hash tag on Twitter about the impact of execution plans on the ASYNC_NETWORK_IO wait type, and it generated some differing opinions and a lot of good discussion.

My immediate answer to this would be that someone is misinterpreting the cause and effect of this, since the ASYNC_NETWORK_IO wait type is encountered when the Engine has results to send over TDS to the client but there are no available TDS buffers on the connection to send them on. Generally speaking, this means that the client side is not consuming the results efficiently, but based on the ensuing discussion I became intrigued enough to do some testing of whether or not an execution plan would actually impact the ASYNC_NETWORK_IO waits significantly.

To summarize: focusing on ASYNC_NETWORK_IO waits alone as a tuning metric is a mistake. The faster a query executes, the more of this wait type it will likely accumulate, even if the client is consuming results as fast as possible. (Also see Greg’s recent post about focusing on waits alone in general.)

Click through for the things Jonathan tested.
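
If you want to reproduce this sort of test yourself, one rough approach (a sketch, assuming SQL Server 2016 or later) is to snapshot sys.dm_exec_session_wait_stats for your own session before and after the query you’re measuring:

```sql
-- Baseline this session's ASYNC_NETWORK_IO waits (SQL Server 2016+)
SELECT wait_type, waiting_tasks_count, wait_time_ms
INTO #baseline
FROM sys.dm_exec_session_wait_stats
WHERE session_id = @@SPID
  AND wait_type = N'ASYNC_NETWORK_IO';

-- ... run the query you want to measure here ...

-- Then compare how much the wait grew during the query
SELECT w.wait_type,
       w.waiting_tasks_count - ISNULL(b.waiting_tasks_count, 0) AS new_waits,
       w.wait_time_ms        - ISNULL(b.wait_time_ms, 0)        AS new_wait_ms
FROM sys.dm_exec_session_wait_stats AS w
LEFT JOIN #baseline AS b ON b.wait_type = w.wait_type
WHERE w.session_id = @@SPID
  AND w.wait_type = N'ASYNC_NETWORK_IO';
```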


Approximate Distinct Count with DAX

Gilbert Quevauvilliers runs some performance tests against the approximate distinct count formula in DAX:

I am currently running SQL Server Analysis Services (SSAS) 2019 Enterprise Edition. (This can also be applied to Power BI)

My Fact table has got roughly 950 Million rows stored in it.

And as mentioned previously it has got over 64 Million distinct users.

The data is queried from SQL Server into SSAS.

Gilbert first checks how close these are and then how much faster the approximate count is.
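
Gilbert’s tests are DAX against SSAS, but the same idea exists in the database engine: SQL Server 2019 added APPROX_COUNT_DISTINCT to T-SQL. A minimal sketch against a hypothetical fact table:

```sql
-- Hypothetical fact table; APPROX_COUNT_DISTINCT ships with SQL Server 2019
-- and trades a small, bounded error for much lower memory and CPU use
SELECT COUNT(DISTINCT UserId)        AS exact_distinct_users,
       APPROX_COUNT_DISTINCT(UserId) AS approx_distinct_users
FROM dbo.FactEvents;
```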


Multi-Statement TVFs and Time Logged

Erik Darling turns the seconds into minutes:

I’ve posted quite a bit about how cached plans can be misleading.

I’m gonna switch that up and talk about how an actual plan can be misleading, too.

In plans that include calling a multi-statement table valued function, no operator logs the time spent in the function. I’ve got a User Voice item for it here.

Click through for the demonstration. If that sounds like something you’d like fixed, vote up the User Voice item.
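
To see the effect for yourself, here is a minimal sketch (hypothetical names, not Erik’s demo) of a deliberately slow multi-statement table-valued function:

```sql
-- A deliberately slow multi-statement TVF (hypothetical); the time spent
-- inside it is not attributed to any operator in the caller's actual plan
CREATE OR ALTER FUNCTION dbo.SlowNumbers (@rows INT)
RETURNS @out TABLE (n INT NOT NULL)
AS
BEGIN
    DECLARE @i INT = 1;
    WHILE @i <= @rows
    BEGIN
        INSERT @out (n) VALUES (@i);
        SET @i += 1;
    END;
    RETURN;
END;
GO

-- Get the actual plan for this and note where the elapsed time (doesn't) show up
SELECT COUNT(*) FROM dbo.SlowNumbers(100000);
```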


Sort Keys and Join Types in Amazon Redshift

Derik Hammer takes us through query tuning a nasty job on Amazon Redshift:

My team built a process to load from a couple of base tables in our Amazon Redshift enterprise data warehouse into another table which would act as a data mart entity. The data was rolled up and it included some derived fields. The SQL query had some complexity to it.

This process ran daily and was being killed by our operations team after running for 22 hours.

I stepped in to assist with performance tuning and discovered that join choices, such as INNER vs. OUTER joins, have a big impact on whether Redshift can use its sort keys or not.

Click through for more details and what Derik ended up doing.
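
As a rough illustration of the moving parts (hypothetical tables in Redshift syntax, not Derik’s actual schema), you can compare the EXPLAIN output of an inner and an outer join against tables that share a sort key:

```sql
-- Hypothetical Redshift tables sorted (and distributed) on the join column
CREATE TABLE fact_sales (
    sale_date   DATE,
    customer_id BIGINT,
    amount      NUMERIC(12,2)
)
DISTKEY (customer_id)
SORTKEY (customer_id);

CREATE TABLE dim_customer (
    customer_id BIGINT,
    region      VARCHAR(50)
)
DISTKEY (customer_id)
SORTKEY (customer_id);

-- Compare the plans for an inner vs. an outer join; per the post,
-- the join type can change whether the sort keys get used
EXPLAIN
SELECT d.region, SUM(f.amount)
FROM fact_sales AS f
INNER JOIN dim_customer AS d ON d.customer_id = f.customer_id
GROUP BY d.region;

EXPLAIN
SELECT d.region, SUM(f.amount)
FROM fact_sales AS f
LEFT JOIN dim_customer AS d ON d.customer_id = f.customer_id
GROUP BY d.region;
```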


When the Optimizer Can Use Batch Mode on Row Store

Erik Darling looks at some internals for us:

Things like Accelerated Database Recovery, Optimize For Sequential Key, and In-Memory Tempdb Metadata are cool, but they’re server tuning. I love ’em, but they’re more helpful for tuning an entire workload than a specific query.

The thing with BMOR is that it’s not just one thing. Getting Batch Mode also allows Adaptive Joins and Memory Grant Feedback to kick in.

But they’re all separate heuristics.

Read on to see the extended events around batch mode to help you determine if it’s possible for the optimizer to use it for a given query.
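
The post itself digs into the extended events internals, but as a simpler companion sketch (assuming SQL Server 2019 and compatibility level 150, with a hypothetical table), you can toggle the feature to compare plans:

```sql
-- Batch mode on rowstore requires compatibility level 150 (SQL Server 2019)
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;

-- Toggle it at the database level...
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = ON;

-- ...or disallow it for a single query to compare plans side by side
SELECT COUNT_BIG(*)
FROM dbo.BigTable            -- hypothetical rowstore table
OPTION (USE HINT('DISALLOW_BATCH_MODE'));
```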


Fun with Filtering Between Start and End Dates

Brent Ozar shows why the StartDate + EndDate pattern is not great for filtering:

If all you need to do is look up the memberships for a specific UserId, and you know the UserId, then it’s a piece of cake. You put a nonclustered index on UserId, and call it a day.

But what if you frequently need to pull all of the memberships that were active on a specific date? That’s where performance tuning gets hard: when you don’t know the UserId, and even worse, you can’t predict the date/time you’re looking up, or if it’s always Right Now.

This is where I advocate pivoting to a series of event records, so instead of a start date and end date, you have an event type (started, expired, cancelled, etc.) and a date. There are other alternatives as well, but it’s a good thought exercise.
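
A minimal sketch of the two table shapes (hypothetical names) to make the contrast concrete:

```sql
-- The common design: one row per membership with a start and an end date
CREATE TABLE dbo.UsersMemberships (
    UserId    INT       NOT NULL,
    StartDate DATETIME2 NOT NULL,
    EndDate   DATETIME2 NULL        -- NULL = still active
);

-- "Who was active on @AsOf?" has to filter on both date columns at once,
-- which no single-column index handles well
DECLARE @AsOf DATETIME2 = '2019-06-01';
SELECT UserId
FROM dbo.UsersMemberships
WHERE StartDate <= @AsOf
  AND (EndDate >= @AsOf OR EndDate IS NULL);

-- The event-log alternative: one row per state change
CREATE TABLE dbo.MembershipEvents (
    UserId    INT         NOT NULL,
    EventType VARCHAR(20) NOT NULL,  -- started, expired, cancelled, ...
    EventDate DATETIME2   NOT NULL
);
```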


Slow Record Cleanup

Jared Poche investigates a slow record deletion process:

I encountered a curious issue recently, and immediately knew I needed to blog about it. Having already blogged about implicit conversions and how the TOP operator interacts with blocking operators, I found a problem that looked like the combination of the two.

I reviewed a garbage collection process that’s been in place for some time. The procedure populates a temp table with the key values for the table that is central to the GC. We use the temp table to delete from the related tables, then delete from the primary table. However, the query populating our temp table was taking far too long, 84 seconds when I tested it.

Read on to understand why.
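
For context, here is a stripped-down version of the general pattern being described (hypothetical names, not Jared’s actual code):

```sql
-- Collect the keys to purge once, then reuse them against every related table
CREATE TABLE #ids (OrderId BIGINT NOT NULL PRIMARY KEY);

-- This insert is the step that was unexpectedly slow in the post
INSERT #ids (OrderId)
SELECT TOP (10000) o.OrderId
FROM dbo.Orders AS o
WHERE o.CreatedDate < DATEADD(DAY, -90, SYSUTCDATETIME())
ORDER BY o.OrderId;

DELETE ol
FROM dbo.OrderLines AS ol
JOIN #ids AS i ON i.OrderId = ol.OrderId;

DELETE o
FROM dbo.Orders AS o
JOIN #ids AS i ON i.OrderId = o.OrderId;
```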


Finding the Query Used in DirectQuery Mode

Kasper de Jonge shows us how we can find which query ran in DirectQuery mode to populate a Power BI data set:

When you are optimizing your DirectQuery model and you have done all the optimizations on the model already, you might want to run the queries generated by Power BI by your DBA. He then might be able to do some index tuning or even suggest some model changes. But how do you capture them? There are a few simple ways that I will describe here.

Read on for 3 1/2 such methods.
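
As a companion sketch, and not necessarily one of Kasper’s methods: if the DirectQuery source happens to be SQL Server, one server-side option is an Extended Events session that captures incoming statements (the captured client_app_name lets you isolate the Power BI traffic afterwards):

```sql
-- One server-side option when the DirectQuery source is SQL Server:
-- capture the statements Power BI sends with an Extended Events session
CREATE EVENT SESSION CapturePowerBIQueries ON SERVER
ADD EVENT sqlserver.sql_batch_completed (
    ACTION (sqlserver.client_app_name, sqlserver.sql_text)
),
ADD EVENT sqlserver.rpc_completed (
    ACTION (sqlserver.client_app_name, sqlserver.sql_text)
)
ADD TARGET package0.event_file (SET filename = N'PowerBIQueries.xel');
GO
ALTER EVENT SESSION CapturePowerBIQueries ON SERVER STATE = START;
```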


Interleaved Execution with SQL Server

Milos Radivojevic takes us through improvements with interleaved execution in SQL Server:

As you might know, Interleaved Execution is a member of the Intelligent Query Processing family of features. It was introduced with SQL Server 2017 (as part of Adaptive Query Processing). It is designed to improve the performance of queries referencing multi-statement table-valued functions (MSTVFs). Currently it addresses only queries using MSTVFs, but it is hopefully designed for much more. The query optimizer usually has two issues with queries using MSTVFs:

MSTVF is a black-box for the optimizer; it does not know what’s inside, it cannot perform cross-statement optimization (as it is a case with inline TVFs) and it assumes it is a cheap and fast operation
MSTVF has a fixed cardinality of 100 (prior to SQL Server 2014, it was 1)

Interleaved execution does not improve the first issue (MSTVF is still a black-box for the optimizer), but solves the cardinality issue.

Read on to understand how this second aspect has changed for the better.
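
A quick way to see the difference (a sketch with hypothetical names, assuming SQL Server 2017 or later and compatibility level 140+) is to compare estimates with interleaved execution on and then disabled via a query hint:

```sql
-- Hypothetical MSTVF and table; compatibility level 140+ on SQL Server 2017 or later
DECLARE @param INT = 1;

-- With interleaved execution, the TVF is materialized first and the optimizer
-- uses its real row count for the rest of the plan
SELECT b.Id
FROM dbo.SomeMultiStatementTVF(@param) AS f
JOIN dbo.BigTable AS b ON b.Id = f.Id;

-- Old behaviour for comparison: a fixed estimate of 100 rows (1 before SQL Server 2014)
SELECT b.Id
FROM dbo.SomeMultiStatementTVF(@param) AS f
JOIN dbo.BigTable AS b ON b.Id = f.Id
OPTION (USE HINT('DISABLE_INTERLEAVED_EXECUTION_TVF'));
```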
