Press "Enter" to skip to content

Category: Performance Tuning

See The Pernicious Effects Of Your UDFs

Pedro Lopes announces an improvement to SQL Server execution plan results in 2017 CU3:

As I mentioned on yesterday’s post, with the recent release of SQL Server 2017 CU3, we released yet more showplan enhancements: you can see other posts related to showplan enhancements here.

In this article I’ll talk about the second showplan improvement we worked on, to assist in the discoverability of UDF usage impact on query execution.

The scenario is that if a query uses one or more user-defined scalar functions (such as T-SQL functions or CLR functions), a significant portion of query execution time may be spent inside those functions, depending on the function definition. And this may be not immediately obvious by looking at the query execution plan.

Recently, we added information on overall query CPU and elapsed time tracking for statistics showplan xml (both in ms), found in the root node of an actual plan (on which I blogged about here). We now added two new attributes: UdfCpuTime and UdfElapsedTime. These provide the total CPU and elapsed time (again, both in ms) that is spent inside all scalar user-defined functions, during the execution of a query.

I love it.  UDFs have historically been silent query killers, as the execution plan would gleefully think that the function call is practically free because it’d only show a single iteration.

Comments closed

The Downside Of Nested Views

Randolph West doesn’t mince words:

Nested views are bad. Let’s get that out of the way.

What is a nested view anyway? Imagine that you have a SELECT statement you tend to use all over the place (a very common practice when checking user permissions). There are five base tables in the join, but it’s fast enough.

Instead of copying and pasting the code wherever you need it, or using a stored procedure (because for some reason you’re allergic to stored procedures), you decide to simplify your code and re-use that SELECT in the form of a database view. Now whenever you need to run that complicated query, you can instead query the view directly.

What harm could this do?

Spoilers:  a lot.

Comments closed

Batch Mode Memory Fractions

Joe Obbish explains what memory fractions are and how incorrect calculations can lead to tempdb spills:

There’s very little information out there about memory fractions. I would define them as information in the query plan that can give you clues about each operator’s share of the total query memory grant. This is naturally more complicated for query plans that insert into tables with columnstore indexes but that won’t be covered here. Most references will tell you not to worry about memory fractions or that they aren’t useful most of the time. Out of thousands of queries that I’ve tuned I can only think of a few for which memory fractions were relevant. Sometimes queries spill to tempdb even though SQL Server reports that a lot of query memory was unused. In these situations I generally hope for a poor cardinality estimate which leads to a memory fraction which is too low for the spilling operator. If fixing the cardinality estimate doesn’t prevent the spill then things can get a lot more complicated, assuming that you don’t just give up.

Extremely interesting post.

Comments closed

GROUP BY vs DISTINCT

Rob Farley looks at how GROUP BY and DISTINCT and lead you down different execution plan paths:

What I want to explore in this post is the particular example that we both used… to bring an important point that could be missed because of the similarity of our examples.

You see, we both happened to use a FOR XML concatenation query, looking back at the same table. We did this to simulate a practical GROUP BY – somewhere that you might feel like GROUP BY is useful, but you know that you’re not using an aggregate function like SUM or MAX, but there isn’t one available. Ok, for Aaron he could’ve used the really new STRING_AGG, but for the old-timer like me, having to use SQL Server 2005, that wasn’t available.

In this post, Rob looks at a different sort of example and sees a more complicated scenario unfold.

Comments closed

Understanding Row Goals

Paul White has an excellent article on row goals:

In my experience, it is very common for people to miss the impact of row goals in execution plans.

No wonder, since before SQL Server 2017 CU3, there was no documented way to see information about row goals in execution plans! That update includes:

This enhancement adds the EstimateRowsWithoutRowGoal attribute to each plan operator affected by a row goal.

The new attribute is visible in all the usual places (e.g. DMVs, plan cache) but not yet in SQL Server Management Studio graphical plans (up to and including SSMS version 17.4). The SQL Server does send the new attribute, but SSMS graphical plans strip bits out that do not match its local version of the xml plan schema.

Version 17.5 of SSMS is expected to ship with an updated xml schema and UI elements, making the new attribute visible. SentryOne Plan Explorer shows the new attribute in the raw xml view already, since it doesn’t have any reason to strip the information out, but an update will be required to incorporate the new row goal information into other places (such as the Plan Diagram). I would expect it to make its way onto tooltips first.

Definitely worth a careful read.

Comments closed

Getting SQL Server CPU Usage By Session

Manu Punna has a script to get CPU utilization by session over a relatively short timeframe:

Troubleshooting high CPU usage on a SQL Server Database is an art, but there is a defined methodology to follow to find the root cause of high CPU. This can involve breaking down the overall server CPU usage to a more granular level, first discovering that it’s SQL Server that’s the problem (because way too often it’s something else!), down to exploring specific plan operators in a particular problematic query. Finding that problematic query, identifying the high CPU consumer, means identifying the CPU usage by session.

sys.dm_exec_requests shows the CPU time, but it’s cumulative – it doesn’t give the CPU consumption by each session at the current time. You can see how much CPU usage a session has had since it started, but it doesn’t show you what’s going on right now. To explore that, we need to query sys.dm_exec_requests repeatedly, and look for the differences. We need to collect the CPU usage for a time interval to identify the high CPU consumers.

Click through for the script.

Comments closed

Ways To Hinder Indexes

Raul Gonzalez shows that even when you have a good index, “clever” developers and fate can find ways to conspire against it:

he benefits of having an index are well known, you can get the same results by reading a smaller amount of data so the improvement in performance can be from several minutes to seconds or even less.

That sounds awesome and it certainly is and there are people out there making a living of it, so it’s a huge deal for sure.

But it’s not always like that, and things can go wrong very easily and make all these shiny indexes just a pile of useless burden.

Let me show you some examples, where we can see our indexes in use, but also how they can be ignored by the query processor and become totally useless. I’m going to use the Microsoft sample database [WideWorldImporters] so you can follow along if you want.

Read on to learn more.

Comments closed

Performance Problems With ADO.Net’s AddWithValue

Dan Guzman inveighs against using AddWithValue in ADO.Net:

The nastiness with AddWithValue is that ADO.NET infers the parameter definition from the supplied object value. Parameters in SQL Server are inherently strongly-typed, including the SQL Server data type, length, precision, and scale. Types in .NET don’t always map precisely to SQL Server types, and are sometimes ambiguous, so AddWithValue has to makes guesses about the intended parameter type.

The guesses AddWithValue makes can have huge implications when wrong because SQL Server uses well-defined data type precedence rules when expressions involve unlike data types; the value with the lower precedence is implicitly converted to the higher type. The implicit conversion itself isn’t particularly costly but is a major performance concern when it is the column value rather than the parameter value must be converted, especially in a WHERE or JOIN clause predicate. The implicit column value conversion can prevent indexes on the column from being used with an index seek (i.e. non-sargable expression), resulting in a full scan of every row in the table or index.

Read the whole thing.

Comments closed

Inlining Scalar Functions

Brent Ozar covers a Microsoft paper which looks exciting:

Froid replaces the scalar UDF operator in the calling query with the newly constructed relational expression as a scalar sub-query.

That one statement comes with a lot of interesting gotchas that they discuss throughout the paper:

  • You won’t see the scalar function in the plan (just like we don’t currently see single-statement inline table-valued functions by name in the plan – we just see their effects, kinda like views)

  • The function and the underlying tables might have different permissions (you could have permissions on the function but not the tables, or vice versa, which makes compilation & execution a little trickier)

  • Code that doesn’t get used can just get removed outright (just like SQL Server can do join elimination)

  • Costs and row estimates are now useful inside the plan

To the extent that this works (and I hope it does), it helps fulfill the promise of SQL Server 2000, with encapsulation of code.  Today, one of the easiest big performance gains I can give is to strip something out of a user-defined function and inline it.

Comments closed

Avoid Scalar Functions In Computed Columns

Daniel Hutmacher shows why you should not include scalar functions inside computed column definitions:

Scalar functions can be a real headache when you’re performance tuning. For one, they don’t parallelize. In fact, if you use a scalar function in a computed column, it will prevent any query that uses that table from going parallel – even if you don’t reference that column at all!

Read on for a demonstration.

Comments closed