Press "Enter" to skip to content

Category: Performance Tuning

Performance Testing Scalar UDF Improvements

Wayne Sheffield tests out the scalar UDF performance improvements in SQL Server 2019:

This blog post will examine changes to the query plan and performance when Scalar UDF Inlining is occurring.

I have previously blogged about function performance – here and here. For a quick recap, the performance test ranks these function in duration. The order of the types of functions by duration is Inline TVF, Scalar UDF, and then finally a Multi-Statement TVF (MSTVF) – and the MSTVF is way behind the other two types of functions.

I’m using a Linux (Ubuntu) VM with SQL Server 2019 to perform these comparison performance tests. I use one database in the SQL 2019 compatibility level, and another one in the SQL 2017 compatibility level. I’m using the same performance test used in the previous blog posts.

Wayne finds a definite performance improvement, but not enough in my mind to start creating a bunch of these.

Comments closed

Wait Stats And Missing Indexes

Arthur Daniels explains that missing indexes can cause high wait stat counts to appear:

At first, this statement might sound a bit confusing. Usually, we expect wait statistics to show us what a query is waiting on, whether it’s waiting on memory, loading pages from disk, or any of the other numerous wait types.
Once you start collecting wait statistics, you’ll have a lot of data to sort through. You might find waits like CX_PACKET, CX_CONSUMER, and PAGEIOLATCH. Surprisingly, these could mean that your databases aren’t well indexed.

This makes sense. At its core, wait stats tell you where SQL Server is hurting: where is the bottleneck. But just like a person at the doctor, SQL Server can only be so specific in how it relates this pain to you, and that specificity generally boils down to hardware components. The solution might be “get more hardware,” but as Arthur points out, writing better queries and using better indexes can mitigate those pains too.

Comments closed

Monitoring Entity Framework

Grant Fritchey loves Entity Framework:

Yes, Entity Framework will improve your job quality and reduce stress in your life.

With one caveat, it gets used correctly.

That’s the hard part right? There is tons of technology that makes things better, if used correctly. There are all sorts of programs that make your life easier, if used correctly. Yet, all of these, used incorrectly, can make your life a hell.

One nit that I’ve always had with Entity Framework is that it’s very difficult to tell what part of the code the call was coming from. You really have no idea. So when my friend, Chris Woodruff, asked me on Twitter what would be the best way to monitor TagWith queries in Entity Framework, well, first, I had to go look up what TagWith was, then I got real excited, because, hey, here’s a solution.

That “I love Entity Framework” is the lead-in to a one-act play of mine with people with pitchforks, tar, and feathers. Nevertheless, Grant shows us how we can tag code in C# and capture that data in extended events. I’d read it but I’m too busy sharpening my pitchfork.

Comments closed

Early Thoughts On Scalar UDF Inlining

Aaron Bertrand shares some early thoughts on a SQL Server 2019 improvement:

Since, in spite of our best efforts since SQL Server 2000, we can’t effectively stop scalar UDFs from being used, wouldn’t it be great to make SQL Server simply handle them better?

SQL Server 2019 introduces a new feature called Scalar UDF Inlining. Instead of keeping the function separate, it is incorporated into the overall plan. This leads to a much better execution plan and, in turn, better runtime performance.

Read the whole thing. Especially the part about how this changes the way one of the DMV counters behaves.

Comments closed

Resource Semaphore Waits

Arthur Daniels explains what the RESOURCE_SEMAPHORE wait type is with an example:

So according to this DMV, there’s only 3 queries with memory grants, while the remaining 5 queries have to wait for space in this semaphore. This is where the wait type comes in. When a query is sitting as a waiter, it will display the wait type RESOURCE_SEMAPHORE.

This is one of the biggest performance-killing waits you can find, and there are several ways to tackle it in SQL Server, as well as adding more hardware.

Comments closed

Iterative Solutions To The Closest Match Problem

Itzik Ben-Gan has a follow-up article looking at row-by-row solutions to the closest match problem:

Last month, I covered a puzzle involving matching each row from one table with the closest match from another table. I got this puzzle from Karen Ly, a Jr. Fixed Income Analyst at RBC. I covered two main relational solutions that combined the APPLY operator with TOP-based subqueries. Solution 1 always had quadratic scaling. Solution 2 did quite well when provided with good supporting indexes, but without those indexes also had quadric scaling. In this article I cover iterative solutions, which despite being generally frowned upon by SQL pros, do provide much better scaling in our case even without optimal indexing.

Itzik has three separate solutions here, including one using the CLR.

Comments closed

Speeding Up The First Responder Power BI Interface

Kellyn Pot’vin-Gorman hits the Go Faster button:

The gist of this kit is that it is a database repository as part of the sp_BlitzFirst to collect monitoring alerting and performance metric data. Once you’ve set this up, then you can use a Power BI desktop dashboard as an interface for all that data.
Now this is an awesome way to introduce more DBAs to Power BI and it’s a great way to get more out of your metrics data. The challenge is, it’s a lot of data to be performing complete refreshes on and the natural life of a database like this is growth. The refresh on the static database I was sent by Tracy, once I connected my PBIX to the local db sources, took upwards of an hour to refresh. Keep in mind, I have 16G of memory, 32G of swap and have upped my data load options in Power BI quite high.

Kellyn walks through the things she did to improve performance as a starting point, so check it out and be aware that there’s even more that can be done.

Comments closed

Writing Vectorized Code In R

John Mount helps us understand writing R code like a native:

This sort of difference, scalar oriented C++ being so much faster than scalar oriented R, is often distorted into “R is slow.”
This is just not the case. If we adapt the algorithm to be vectorized we get an R algorithm with performance comparable to the C++implementation!
Not all algorithms can be vectorized, but this one can, and in an incredibly simple way. The original algorithm itself (xlin_fits_R()) is a bit complicated, but the vectorized version (xlin_fits_V()) is literally derived from the earlier one by crossing out the indices. That is: in this case we can move from working over very many scalars (slow in R) to working over a small number of vectors (fast in R).

This is akin to writing set-based SQL instead of cursor-based SQL: you’re thinking in terms which make it easier for the interpreter (or optimizer, in the case of a database engine) to operate quickly over your inputs. It’s also one of a few reasons why I think learning R makes a lot of sense when you have a SQL background.

Comments closed

Performance Impact With Nullable Columns

Daniel Hutmacher shows us situations in which nullable columns can lead to worse performance:

We expected the Semi Join to turn into an Anti Semi Join, but the plan now also contains a Nested Loop branch with a Row Count Spool – what’s that about? Turns out the Row Count Spool, along with its index seek, has to do with the NOT IN() and the fact that we’re looking at a nullable column. Remember that…
x NOT IN (y, z, NULL)
… always returns false, because the NULL value could represent essentially anything, including the x. And so it is with the inner table, if there happens to be a NULL value among those rows.

Definitely worth the read. There are ways around this performance hit but design decisions have consequences.

Comments closed

Returning NULL on NULL Input In UDFs

Jonathan Kehayias shows us a performance improvement you can get if your user-defined function is expected to return NULL if you pass in NULLs for inputs:

I was really curious about the RETURNS NULL ON NULL INPUT function option so I decided to do some testing. I was very surprised to find out that it’s actually a form of scalar UDF optimization that has been in the product since at least SQL Server 2008 R2.
It turns out that if you know that a scalar UDF will always return a NULL result when a NULL input is provided then the UDF should ALWAYS be created with the RETURNS NULL ON NULL INPUT option, because then SQL Server doesn’t even run the function definition at all for any rows where the input is NULL – short-circuiting it in effect and avoiding the wasted execution of the function body.

The more often you pass in NULL to that function, the better your performance will be relative to the default case.

Comments closed