Query Tuning – Page 34

Searching Text which Begins with a Wildcard

Published 2021-02-18 by Kevin Feasel

Searching for a value or group of values with a wildcard is more than just putting a % on both sides of a text string. If you know you’re looking for all strings in a name field that start with the name “Chad” then you are you really shooting yourself in the foot by using ‘%Chad%’ instead of ‘Chad%’ in your query. SQL Server is going to be scanning the table instead of being able to use an index to seek to the data. While that may work to get your result, it’s likely going to take longer and be more invasive than needed. I want to go through an example of how using a reversed column can improve SQL Server’s ability to build a better execution plan.

Read on to see how a computed reversal column can help. For more complex scenarios, an n-gram table (for example, a trigram) might help, though there’s a lot of setup involved there.

Comments closed

Join Algorithm Selection in Spark

Published 2021-02-17 by Kevin Feasel

The Hadoop in Real World team takes us through the selection criteria for join types:

There are several factors Spark takes into account before deciding on the type of join algorithm to use to join datasets at runtime.
Spark has the following 5 algorithms to choose from –
1. Broadcast Hash Join
2. Shuffle Hash Join
3. Shuffle Sort Merge Join
4. Broadcast Nested Loop Join
5. Cartesian Product Join (a.k.a Shuffle-and-Replicate Nested Loop Join)

Read on to learn which join types are supported in which circumstances, as well as rules of precedence.

Comments closed

Sorting Pre-Sorted Data

Published 2021-02-16 by Kevin Feasel

Daniel Hutmacher has an idea:

Whenever SQL Server needs to sort a data stream, it will use the Sort operator to reorder the rows of the stream. Sorting data is an expensive operation because it entails loading part or all of the data into memory and shifting that data back and forth a couple of times. The only time SQL Server doesn’t sort the data is when it already knows the data to be ordered correctly, like when it has already passed a Sort operator or it’s reading from an appropriately sorted index.
But what happens if the data is ordered correctly, but SQL Server doesn’t know about it? Let’s find out.

Click through for the answer.

Comments closed

More Number Series Generators

Published 2021-02-11 by Kevin Feasel

Itzik Ben-Gan continues a series:

This is the second part in a series about solutions to the number series generator challenge. Last month I covered solutions that generate the rows on the fly using a table value constructor with rows based on constants. There were no I/O operations involved in those solutions. This month I focus on solutions that query a physical base table that you pre-populate with rows. For this reason, beyond reporting the time profile of the solutions like I did last month, I’ll also report the I/O profile of the new solutions. Thanks again to Alan Burstein, Joe Obbish, Adam Machanic, Christopher Ford, Jeff Moden, Charlie, NoamGr, Kamil Kosno, Dave Mason, John Nelson #2 and Ed Wagner for sharing your ideas and comments.

Read on for three more solutions, as well as a re-evaluation of the solutions in the first article.

Comments closed

Join Hints in Spark 3

Published 2021-02-10 by Kevin Feasel

The Hadoop in Real World team shares some information on join hits in Spark SQL:

The join side with the hint will be broadcast regardless of the size limit specified in spark.sql.autoBroadcastJoinThreshold property. If both sides of the join have the broadcast hints, the one with the smaller size (based on stats) will be broadcast.

Click through for examples of the four categories of join hints.

Comments closed

Performance Comparison of ISNULL and COALESCE

Published 2021-02-01 by Kevin Feasel

Erik Darling notes the edge cases where ISNULL() can be faster than COALESCE():

Sometimes there are very good reasons to use either coalesce or isnull, owing to them having different capabilities, behaviors, and support across databases.
But isnull has some particular capabilities that are interesting, despite its limitations: only two arguments, specific to SQL Server, and uh… well, we can’t always get three reasons, as a wise man once said.
There is one thing that makes isnull interesting in certain scenarios. Let’s look at a couple.

Read the whole thing. You (hopefully) won’t be in this situation often, but if you do happen to find yourself in it for whatever reason (and you can’t re-write the query to something better), it’s good to keep this in mind.

Comments closed

Using Hints Instead of Trace Flags

Published 2021-01-28 by Kevin Feasel

Taiob Ali has a set of hints to use instead of setting trace flags:

Introduced in SQL Server 2016 (13.x) SP1 and Azure SQL Database is the USE HINT option. At the time of writing this article, there are 23 of those. You can see an up to date list by running Dynamic Management View sys.dm_exec_valid_use_hints.
With the introduction of this feature, we can replace some of the trace flags with hints. I personally like hints over trace flags (if I have to choose between two devils). That way someone reading the code have some idea what the hint is about instead of remembering the trace flag numbers.

I agree with Taiob on that sentiment.

Comments closed

The Performance of sys.partitions

Published 2021-01-28 by Kevin Feasel

Aaron Bertrand tries to disentangle a query:

This question was posted to #sqlhelp by Jake Manske, and it was brought to my attention by Erik Darling.
I don’t recall ever having a performance issue with sys.partitions. My initial thought (echoed by Joey D’Antoni) was that a filter on the data_compression column should avoid the redundant scan, and reduce query runtime by about half. However, this predicate doesn’t get pushed down, and the reason why takes a bit of unpacking.

Click through for the results of Aaron’s research.

Comments closed

Faster String Concatenation with SQL Server

Published 2021-01-27 by Kevin Feasel

Steve Stedman has some tips for people who need to combine strings:

What the previous show as that the longer the string gets the slower the concatenation is. So instead we declare a second VARCHAR(MAX) variable called @stringBuilder, and each time through the loop we concatenate to that, then every thousandth time through the loop we take the @stringBuilder variable and concatenate it on to the @bigString, then clear out the @stringBuilder variable. This keeps the @stringBuilder variable relatively short, and reduces the number of concatenations to the @bigString to roughly 1/1000th the original.

Click through for a demo of the process. I don’t think I’ve been in too many situations where string concatenation was a performance killer in SQL Server, but it’s good to know.

Comments closed

Using sp_prepare with Plan Guides

Published 2021-01-13 by Kevin Feasel

Aaron Bertrand tries something different:

There are features many of us shy away from, like cursors, triggers, and dynamic SQL. There is no question they each have their use cases, but when we see a trigger with a cursor inside dynamic SQL, it can make us cringe (triple whammy).
Plan guides and sp_prepare are in a similar boat: if you saw me using one of them, you’d raise an eyebrow; if you saw me using them together, you’d probably check my temperature. But, as with cursors, triggers, and dynamic SQL, they have their use cases. And I recently came across a scenario where using them together was beneficial.

Read on to see the method to this madness.

Comments closed

Category: Query Tuning