Press "Enter" to skip to content

Category: Query Tuning

Troubleshooting with sp_HumanEvents

Erik Darling shows off sp_HumanEvents:

With shorter procs you can probably just collect actual execution plans and slam F5 like a tall glass of gin at 6am.

But you don’t wanna do that with the larger procs, for a few practical reasons:

– Lots of little queries run quickly, and we don’t care about those

– Navigating through lots of plans in SSMS is tedious

– There’s no differentiation when other procedures, etc. are invoked

– You introduce a lot of overhead retrieving and rendering all those plans

– The full query text might not be captured, which is a limitation in many places

Let’s save the day with sp_HumanEvents, my stored procedure that makes using Extended Events really easy.

Read on to see how this all works.

Comments closed

Calculating the Adaptive Join Threshold

Paul White breaks out the math books:

One thing I want you to bear in mind throughout this piece is an adaptive join always starts executing as a batch mode hash join. This is true even if the execution plan indicates the adaptive join expects to run as a row mode apply.

Like any hash join, an adaptive join reads all rows available on its build input and copies the required data into a hash table. The batch mode flavour of hash join stores these rows in an optimized format, and partitions them using one or more hash functions. Once the build input has been consumed, the hash table is fully populated and partitioned, ready for the hash join to start checking probe-side rows for matches.

This is the point where an adaptive join makes the decision to proceed with the batch mode hash join or to transition to a row mode apply. If the number of rows in the hash table is less than the threshold value, the join switches to an apply; otherwise, the join continues as a hash join by starting to read rows from the probe input.

Read the whole thing and learn more about the cutoffs for adaptive joins.

Comments closed

Restrictions for Parallel Insertion

Erik Darling summarizes the main man:

I’d like to start this post off by thanking my co-blogger Joe Obbish for being lazy and not blogging about this when he first ran into it three years ago.

Now that we’re through with pleasantries, let’s talk turkey.

Over in this post, by Arvind Shyamsundar, which I’m sure Microsoft doesn’t consider official documentation since it lacks a GUID in the URL, there’s a list of… things about parallel inserts.

Read on for the summary of limitations.

Comments closed

More Efficient Pivoting

Dave Mason is on the hunt:

While working with some poorly performing code in T-SQL that used a PIVOT operator, I wondered if there was a more efficient way to get a result set of pivoted data. It may have been a fool’s errand, but I still wanted to try. It dawned on me that I could use the STRING_AGG() function to build a delimited list of pivoted column names and values. From there, I’d have to “shred” the delimited data to rows and columns. But how?

Read on to see how.

Comments closed

Comparing OPENJSON vs Other List-Parsing Methods in SQL Server

Aaron Bertrand splits a string:

I have long advocated avoiding splitting strings by using table-valued parameters (TVPs), but this is not always a valid option; for example, PHP drivers do not handle this feature yet. A new pattern I’ve seen emerging is to replace splitting comma-separated strings with the new OPENJSON functionality introduced in SQL Server 2016. I wanted to explore why this is not an improvement in the typical case, unless you are using a client application platform that doesn’t support TVPs and your application data already happens to be in JSON format.

Read on for the solution, which has a somewhat surprising outcome.

Comments closed

Optimizing Blob Storage Query Performance

Dennes Torres compares several strategies for querying data stored in Azure Blob Storage:

In the third part of the series Querying Blob Storage with SQL, I will focus on the performance behaviour of queries: What makes them faster, slower, and some syntax beyond the basics.

The performance tests in this article are repeated, and the best time of the queries is recorded. This doesn’t mean you will always achieve the same timing. Many architectural details will affect the timing, such as cache, first execution, and so on. The timing exposed on each query is only a reference pointing to the differences of the query methods that can affect the time and the usual result for better or worse performance.

Click through to see which patterns perform well and which don’t.

Comments closed

External Temp Tables and Plan Reuse

David Fowler has a warning about stored procedures which use temp tables created by other processes:

Here’s an interesting issue that recently came up. We were seeing very high compilations and recompilations on a server to the point that it started causing us some very serious issues (admittedly this wasn’t the sole issue but it was certainly a contributing factor, the other factors were also very interesting so I might look at those in another post).

After looking in the plan cache we could see a very high number of single use plans for a particular stored procedure. Now as you probably know, SQL will usually cache an execution plan and use it over and over whenever a particular query runs. SQL’s lazy and it doesn’t want to bother compiling queries unless it really has to.

So what was going on, why wasn’t SQL able to reuse the cached plan?

Read on for the solution, as well as the impact of the problem and ways to work around it.

Comments closed

Spill-Based tempdb Contention

Erik Darling wants you to use a coaster:

What was puzzling them was that nothing in the stored procedure did anything with temporary objects, cursors, or any of the other obvious things that go to el baño público. And the query plan itself was a pretty simple little loopy thing.

It wasn’t until I grabbed a quick sample of actual execution plans with Extended Events that the problem became more obvious.

Despite the fact that the query was only passing around 7 rows, we were getting a one row estimate, and there was a spilling Sort operator.

Click through for an example of what happened, as well as how you can fix it.

Comments closed

Optimizing for Mediocre

Erik Darling points out an issue with some approaches to preventing parameter sniffing problems in queries:

Despite the many metric tons of blog posts warning people about this stuff, I still see many local variables and optimize for unknown hints. As a solution to parameter sniffing, it’s probably the best choice 1/1000th of the time. I still end up having to fix the other 999/1000 times, though.

In this post, I want to show you how using either optimize for unknown or local variables makes my job — and the job of anyone trying to fix this stuff — harder than it should be.

Click through for two methods, both of which end up being the wrong answer.

Comments closed