Query Tuning – Page 44

Speeding Up sp_helpdb

Published 2020-08-18 by Kevin Feasel

When I run this on my computer, it usually takes between 500 and 1000 MS. More on this later.
Now let’s take a look what is happening behind the scenes with sp_helpdb. The first step is to populate a temporary table, #spdbdesc with database name, owner, when it was created and compatibility level. The code for the first step is below.

Watch Dave speed this up a bit.

Comments closed

A Trillion-Row Operator

Published 2020-08-13 by Kevin Feasel

Joe Obbish sets up a challenge:

48 billion rows for a single operator is certainly a large number for most workloads. I ended up completely missing the point and started wondering how quickly a query could process a trillion rows through a single operator. Eventually that led to a SQL performance challenge: what is the fastest that you can get an actual plan with at least one operator processing a trillion rows? The following rules are in play:
1. Start with no user databases
2. Any query can run up to MAXDOP 8
3. Temp tables may be created and populated but all such work needs to finish in 30 seconds or less
4. A RECOMPILE query hint must be present in the final query
5. Undocumented features and behavior are fair game

Read on to see what Joe learned.

Comments closed

TOP and Ordering

Published 2020-08-12 by Kevin Feasel

Erik Darling is in the middle of a back-to-basics series on performance tuning:

And you see, once you set up a query to return the TOP N rows, there’s an expectation that users get to choose the order they start seeing rows in. As long as we stick to columns whose ordering is supported by an index, things will be pretty stable.
Once we go outside that, a TOP can be rough on a query.

Read on for an example of what happens when that type of thing goes wrong.

Comments closed

When Date Tables Go Bad

Published 2020-08-05 by Kevin Feasel

Brent Ozar walks through a scenario in which a calendar table (AKA, date dimension) makes a query perform quite a bit worse:

So why did the date table not perform as well as the old-school way?
SQL Server doesn’t understand the relationship between these two tables. It simply doesn’t know that all of the rows in the Users table will match up with rows in the calendar table. It assumes that the calendar table is doing some kind of filtering, especially given our CAST on the date. It doesn’t expect all of the rows to match.

My reaction was pretty much the same as Koen Verbeeck’s in the comments. Put in clearer terms, calendar tables work best when you’re joining a DATE type to a DATE type. Once you introduce times into the mix, the optimizer has to behave differently, not least because you have to do things like CAST() to coerce data types.

Comments closed

Benefits from Nonclustered Columnstore Indexes

Published 2020-08-05 by Kevin Feasel

Dave Mason shows off some places where non-clustered columnstore indexes can benefit you:

I tend to work mostly with OLTP environments. Many of them have questionable designs or serve reporting workloads. Not surprisingly, there are a lot of performance-sapping table scans and index scans. I’ve compensated for this somewhat by using row and page compression, which became available on all editions of SQL Server starting with SQL Server 2016 SP1. Could I get even better results with columnstore indexes? Lets look at one example.
Here are four individual query statements from a stored procedure used to get data for a dashboard. If you add up percentages for Estimated Cost (CPU + IO), Estimated CPU Cost, or Estimated IO Cost, you get a total of about 90% (give or take a few percent).

Read on for the queries and to see how adding a non-clustered columnstore index helped in Dave’s case. I haven’t had a great deal of success with non-clustered columnstore indexes, but have greatly enjoyed the use of clustered columnstore indexes for fact tables.

Comments closed

Aggregate Splitting in SQL Server 2019

Published 2020-08-04 by Kevin Feasel

Paul White takes us through a new trick the optimizer has learned:

The extended event query_optimizer_batch_mode_agg_split is provided to track when this new optimization is considered. The description of this event is:
Occurs when the query optimizer detects batch mode aggregation is likely to spill and tries to split it into multiple smaller aggregations.
Other than that, this new feature hasn’t been documented yet. This article is intended to help fill that gap.

Read on as Paul fills that gap.

Comments closed

Halloween Problem and Inserts

Published 2020-07-30 by Kevin Feasel

Jared Poche continues a dive into the Halloween Problem:

I would have expected us to scan the temp table, then have a LEFT JOIN to the base table. The Table Spool is the red flag that we have an issue with the plan, and is frequently seen with Halloween protections.
The index scan on the base table seems to be overkill since we’re joining on the primary key columns (the key lookup isn’t much of a concern). But we’re likely doing the scan because of the spool; it’s SQL Server’s way of getting all relevant records in one place at one time, breaking the normal flow of row mode operation, to make sure we don’t look up the same record multiple times.

Read on to see the execution plan as well as Jared’s fix.

Comments closed

Seeks are Better than Scans, Except when they Aren’t

Published 2020-07-28 by Kevin Feasel

Hugo Kornelis explains that both seeks and scans exist for a good reason:

Fact: You should never blindly trust anything you find on the internet. And right now, you are reading the internet. So why should you trust this?
You shouldn’t. At least, not blindly. You should verify. And what better way to verify then through demos!

There went my strategy of blindly trusting Hugo.

The rule of thumb I have heard and go by is, if you’re retrieving less than 1/2 of 1% of data, a seek is the best route. If you’re returning more than 20% of data, a scan is the best route. In between is the “it depends” zone, where either could potentially be better. But please do read Hugo’s post—it’s an important one for query tuners.

Comments closed

Getting the Last Query Plan Stats in SQL Server 2019

Published 2020-07-20 by Kevin Feasel

John Morehouse walks us through retrieving the actual query plan stats of the last run of an execution plan:

Currently, if you are not on SQL Server 2019 and wanted to see an execution plan, you would attempt to dive into the execution plan cache to retrieve an estimated plan. Keep in mind that this just an estimated plan and the actual plan, while the shape should be the same, will have runtime metrics. These Actual runtime metrics could be, different than what is shown in the estimated plan, so it is important to get the actual whenever possible.
With the introduction of lightweight statistics, SQL Server can retain the metrics of the actual execution plan if enabled. Note that this could introduce a slight increase in overhead, however, I haven’t yet seen it be determinantal. These metrics are vastly important when performing query tuning.

Read on to see the specific set of metrics you can pull and how to do it. This does require SQL Server 2019.

Comments closed

When Batch Mode on Rowstore Hurts Performance

Published 2020-07-16 by Kevin Feasel

Erik Darling walks us through a scenario where batch mode on rowstore can make performance of a query worse:

I’m not mad at 2019 or Batch Mode On Rowstore (BMOR) or anything.
But if I’m gonna get into it, I’m gonna document issues I run into so that hopefully they help you out, too.
One thing I ran into recently was where BMOR kicked in for a query and made it slow down.

Click through for the scenario, why it’s slower when using batch mode, and two ways you can improve the query.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Category: Query Tuning