Press "Enter" to skip to content

Category: Performance Tuning

Batch Mode Memory Fractions

Joe Obbish explains what memory fractions are and how incorrect calculations can lead to tempdb spills:

There’s very little information out there about memory fractions. I would define them as information in the query plan that can give you clues about each operator’s share of the total query memory grant. This is naturally more complicated for query plans that insert into tables with columnstore indexes but that won’t be covered here. Most references will tell you not to worry about memory fractions or that they aren’t useful most of the time. Out of thousands of queries that I’ve tuned I can only think of a few for which memory fractions were relevant. Sometimes queries spill to tempdb even though SQL Server reports that a lot of query memory was unused. In these situations I generally hope for a poor cardinality estimate which leads to a memory fraction which is too low for the spilling operator. If fixing the cardinality estimate doesn’t prevent the spill then things can get a lot more complicated, assuming that you don’t just give up.

Extremely interesting post.
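
For anyone who wants to see those fractions without cracking open the graphical plan, they live in the showplan XML as a MemoryFractions element (with Input and Output attributes) on each memory-consuming operator. A rough sketch for pulling them out of the plan cache:

    -- Sketch: list operators in cached plans that carry memory fractions.
    WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT TOP (50)
        qp.query_plan.value('(//StmtSimple/@StatementText)[1]', 'nvarchar(200)') AS statement_text,
        op.value('@NodeId', 'int')                          AS node_id,
        op.value('@PhysicalOp', 'nvarchar(128)')            AS physical_op,
        op.value('(MemoryFractions/@Input)[1]', 'float')    AS memory_fraction_input,
        op.value('(MemoryFractions/@Output)[1]', 'float')   AS memory_fraction_output
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
    -- Only operators that actually have a MemoryFractions element.
    CROSS APPLY qp.query_plan.nodes('//RelOp[MemoryFractions]') AS r(op);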

GROUP BY vs DISTINCT

Rob Farley looks at how GROUP BY and DISTINCT can lead you down different execution plan paths:

What I want to explore in this post is the particular example that we both used… to bring an important point that could be missed because of the similarity of our examples.

You see, we both happened to use a FOR XML concatenation query, looking back at the same table. We did this to simulate a practical GROUP BY – somewhere that you might feel like GROUP BY is useful, but where you're not using an aggregate function like SUM or MAX, because there isn't one available that does what you need. Ok, Aaron could've used the really new STRING_AGG, but for an old-timer like me, having to use SQL Server 2005, that wasn't available.

In this post, Rob looks at a different sort of example and sees a more complicated scenario unfold.
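
If you haven't seen the pattern they both use, the two variants look roughly like this – a sketch against hypothetical dbo.Orders and dbo.OrderItems tables, not the tables from either post. With DISTINCT the concatenation can be evaluated for every row and de-duplicated afterwards; with GROUP BY the optimizer can group first and concatenate once per group:

    -- DISTINCT: concatenate per row, then remove the duplicates.
    SELECT DISTINCT
        o.CustomerID,
        STUFF((SELECT N',' + oi.ProductName
               FROM dbo.OrderItems AS oi
               JOIN dbo.Orders AS o2 ON o2.OrderID = oi.OrderID
               WHERE o2.CustomerID = o.CustomerID
               FOR XML PATH('')), 1, 1, N'') AS ProductList
    FROM dbo.Orders AS o;

    -- GROUP BY: collapse to one row per CustomerID first, then concatenate once per group.
    SELECT
        o.CustomerID,
        STUFF((SELECT N',' + oi.ProductName
               FROM dbo.OrderItems AS oi
               JOIN dbo.Orders AS o2 ON o2.OrderID = oi.OrderID
               WHERE o2.CustomerID = o.CustomerID
               FOR XML PATH('')), 1, 1, N'') AS ProductList
    FROM dbo.Orders AS o
    GROUP BY o.CustomerID;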

Understanding Row Goals

Paul White has an excellent article on row goals:

In my experience, it is very common for people to miss the impact of row goals in execution plans.

No wonder, since before SQL Server 2017 CU3, there was no documented way to see information about row goals in execution plans! That update includes:

This enhancement adds the EstimateRowsWithoutRowGoal attribute to each plan operator affected by a row goal.

The new attribute is visible in all the usual places (e.g. DMVs, plan cache) but not yet in SQL Server Management Studio graphical plans (up to and including SSMS version 17.4). SQL Server does send the new attribute, but SSMS graphical plans strip out bits that do not match its local version of the xml plan schema.

Version 17.5 of SSMS is expected to ship with an updated xml schema and UI elements, making the new attribute visible. SentryOne Plan Explorer shows the new attribute in the raw xml view already, since it doesn’t have any reason to strip the information out, but an update will be required to incorporate the new row goal information into other places (such as the Plan Diagram). I would expect it to make its way onto tooltips first.

Definitely worth a careful read.
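
Until SSMS catches up, one way to see the attribute is to read it straight out of the plan XML. A rough sketch against the plan cache (the attribute only appears on builds with the enhancement, i.e. SQL Server 2017 CU3 and later):

    WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT TOP (50)
        op.value('@PhysicalOp', 'nvarchar(128)')          AS physical_op,
        op.value('@EstimateRows', 'float')                AS estimate_rows,
        op.value('@EstimateRowsWithoutRowGoal', 'float')  AS estimate_rows_without_row_goal
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
    -- Only operators where a row goal actually changed the estimate.
    CROSS APPLY qp.query_plan.nodes('//RelOp[@EstimateRowsWithoutRowGoal]') AS r(op);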

Getting SQL Server CPU Usage By Session

Manu Punna has a script to get CPU utilization by session over a relatively short timeframe:

Troubleshooting high CPU usage on a SQL Server Database is an art, but there is a defined methodology to follow to find the root cause of high CPU. This can involve breaking down the overall server CPU usage to a more granular level, first discovering that it’s SQL Server that’s the problem (because way too often it’s something else!), down to exploring specific plan operators in a particular problematic query. Finding that problematic query, identifying the high CPU consumer, means identifying the CPU usage by session.

sys.dm_exec_requests shows the CPU time, but it’s cumulative – it doesn’t give the CPU consumption by each session at the current time. You can see how much CPU usage a session has had since it started, but it doesn’t show you what’s going on right now. To explore that, we need to query sys.dm_exec_requests repeatedly, and look for the differences. We need to collect the CPU usage for a time interval to identify the high CPU consumers.

Click through for the script.
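
Manu's script does the real work; as a minimal sketch of the sampling idea (the interval and session filter here are arbitrary), you snapshot cpu_time per session, wait, snapshot again, and diff:

    DECLARE @first  TABLE (session_id int PRIMARY KEY, cpu_time int);
    DECLARE @second TABLE (session_id int PRIMARY KEY, cpu_time int);

    -- First snapshot of CPU time per session (user sessions only).
    INSERT @first (session_id, cpu_time)
    SELECT session_id, SUM(cpu_time)
    FROM sys.dm_exec_requests
    WHERE session_id > 50
    GROUP BY session_id;

    WAITFOR DELAY '00:00:10';   -- sampling interval

    -- Second snapshot.
    INSERT @second (session_id, cpu_time)
    SELECT session_id, SUM(cpu_time)
    FROM sys.dm_exec_requests
    WHERE session_id > 50
    GROUP BY session_id;

    -- CPU consumed during the interval, highest consumers first.
    SELECT s.session_id,
           s.cpu_time - ISNULL(f.cpu_time, 0) AS cpu_ms_in_interval
    FROM @second AS s
    LEFT JOIN @first AS f ON f.session_id = s.session_id
    ORDER BY cpu_ms_in_interval DESC;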

Ways To Hinder Indexes

Raul Gonzalez shows that even when you have a good index, “clever” developers and fate can find ways to conspire against it:

The benefits of having an index are well known: you can get the same results by reading a smaller amount of data, so the improvement in performance can be from several minutes to seconds or even less.

That sounds awesome, and it certainly is; there are people out there making a living off of it, so it's a huge deal for sure.

But it’s not always like that, and things can go wrong very easily and make all these shiny indexes just a pile of useless burden.

Let me show you some examples, where we can see our indexes in use, but also how they can be ignored by the query processor and become totally useless. I’m going to use the Microsoft sample database [WideWorldImporters] so you can follow along if you want.

Read on to learn more.
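
As a taste of the kind of thing Raul demonstrates (this is a hedged illustration, not his actual examples, and it assumes WideWorldImporters has an index on the predicate column): wrap the column in a function and the predicate stops being sargable.

    -- Sargable: the predicate is on the bare column, so an index on OrderDate can be seeked.
    SELECT OrderID
    FROM Sales.Orders
    WHERE OrderDate >= '20160101' AND OrderDate < '20160201';

    -- Not sargable: the functions wrapped around the column force a scan of every row,
    -- even though both queries return exactly the same results.
    SELECT OrderID
    FROM Sales.Orders
    WHERE YEAR(OrderDate) = 2016 AND MONTH(OrderDate) = 1;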

Performance Problems With ADO.Net’s AddWithValue

Dan Guzman inveighs against using AddWithValue in ADO.Net:

The nastiness with AddWithValue is that ADO.NET infers the parameter definition from the supplied object value. Parameters in SQL Server are inherently strongly-typed, including the SQL Server data type, length, precision, and scale. Types in .NET don’t always map precisely to SQL Server types, and are sometimes ambiguous, so AddWithValue has to make guesses about the intended parameter type.

The guesses AddWithValue makes can have huge implications when wrong because SQL Server uses well-defined data type precedence rules when expressions involve unlike data types; the value with the lower precedence is implicitly converted to the higher type. The implicit conversion itself isn’t particularly costly but is a major performance concern when it is the column value rather than the parameter value that must be converted, especially in a WHERE or JOIN clause predicate. The implicit column value conversion can prevent indexes on the column from being used with an index seek (i.e. non-sargable expression), resulting in a full scan of every row in the table or index.

Read the whole thing.
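
Here's a hedged T-SQL-side sketch of what that looks like, using a hypothetical dbo.Customers table: .NET strings are inferred as nvarchar, which outranks a varchar column in data type precedence, so the conversion lands on the column.

    CREATE TABLE dbo.Customers
    (
        CustomerID int IDENTITY PRIMARY KEY,
        LastName   varchar(100) NOT NULL
    );
    CREATE INDEX IX_Customers_LastName ON dbo.Customers (LastName);

    -- AddWithValue sends the string as nvarchar; the varchar column must be converted,
    -- which (on a SQL collation) turns the index seek into a scan.
    DECLARE @LastName_nvarchar nvarchar(4000) = N'Guzman';
    SELECT CustomerID FROM dbo.Customers WHERE LastName = @LastName_nvarchar;

    -- Explicitly typing the parameter to match the column keeps the predicate sargable.
    DECLARE @LastName_varchar varchar(100) = 'Guzman';
    SELECT CustomerID FROM dbo.Customers WHERE LastName = @LastName_varchar;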

Inlining Scalar Functions

Brent Ozar covers a Microsoft paper which looks exciting:

Froid replaces the scalar UDF operator in the calling query with the newly constructed relational expression as a scalar sub-query.

That one statement comes with a lot of interesting gotchas that they discuss throughout the paper:

  • You won’t see the scalar function in the plan (just like we don’t currently see single-statement inline table-valued functions by name in the plan – we just see their effects, kinda like views)

  • The function and the underlying tables might have different permissions (you could have permissions on the function but not the tables, or vice versa, which makes compilation & execution a little trickier)

  • Code that doesn’t get used can just get removed outright (just like SQL Server can do join elimination)

  • Costs and row estimates are now useful inside the plan

To the extent that this works (and I hope it does), it helps fulfill the promise of SQL Server 2000, with encapsulation of code.  Today, one of the easiest big performance gains I can give is to strip something out of a user-defined function and inline it.
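
For a sense of what "strip it out and inline it" means in practice, here's a hedged sketch with a hypothetical scalar UDF and dbo.Customers/dbo.Orders tables; the rewrite at the bottom is roughly the relational form Froid would construct automatically.

    -- The scalar UDF: runs row by row, invisible to the optimizer, and blocks parallelism.
    CREATE FUNCTION dbo.CustomerOrderCount (@CustomerID int)
    RETURNS int
    AS
    BEGIN
        RETURN (SELECT COUNT(*) FROM dbo.Orders WHERE CustomerID = @CustomerID);
    END;
    GO

    SELECT c.CustomerID, dbo.CustomerOrderCount(c.CustomerID) AS OrderCount
    FROM dbo.Customers AS c;

    -- Hand-inlined as a correlated scalar subquery: the optimizer can now see,
    -- cost, and parallelize the whole expression.
    SELECT c.CustomerID,
           (SELECT COUNT(*) FROM dbo.Orders AS o WHERE o.CustomerID = c.CustomerID) AS OrderCount
    FROM dbo.Customers AS c;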

Avoid Scalar Functions In Computed Columns

Daniel Hutmacher shows why you should not include scalar functions inside computed column definitions:

Scalar functions can be a real headache when you’re performance tuning. For one, they don’t parallelize. In fact, if you use a scalar function in a computed column, it will prevent any query that uses that table from going parallel – even if you don’t reference that column at all!

Read on for a demonstration.
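
A minimal sketch of the anti-pattern, with hypothetical names:

    -- Scalar UDF used in a computed column.
    CREATE FUNCTION dbo.LineTotal (@Price money, @Qty int)
    RETURNS money
    AS
    BEGIN
        RETURN @Price * @Qty;
    END;
    GO

    -- Every query that touches this table is now forced onto a serial plan,
    -- even queries that never reference the Total column.
    CREATE TABLE dbo.OrderLines
    (
        OrderLineID int IDENTITY PRIMARY KEY,
        Price       money NOT NULL,
        Qty         int   NOT NULL,
        Total       AS dbo.LineTotal(Price, Qty)
    );

    -- Writing the expression inline (Total AS Price * Qty) keeps parallelism available.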

Using Computed Columns For Pattern Matching

Erik Darling shows how you can use a computed column to speed up %something searches:

We were talking about computed columns, and one of our students mentioned that he uses computed columns that run the REVERSE() function on a column for easier back-searching.

What’s back-searching? It’s a word I just made up.

The easiest example to think about and demo is Social Security Numbers.

One security requirement is often to give the last four.

Obviously running this for a search WHERE ssn LIKE '%0000' would perform badly over large data sets.

One of the trickiest parts of performance tuning is understanding that the way people want to see data is not necessarily the way you should store the data.
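
A sketch of the trick with a hypothetical dbo.People table: store REVERSE(ssn) as a computed column, index it, and the leading wildcard becomes a trailing one.

    CREATE TABLE dbo.People
    (
        PersonID     int IDENTITY PRIMARY KEY,
        ssn          char(11) NOT NULL,
        ssn_reversed AS REVERSE(ssn)   -- deterministic, so it can be indexed
    );
    CREATE INDEX IX_People_ssn_reversed ON dbo.People (ssn_reversed);

    -- Leading wildcard: no seek possible, every row gets scanned.
    SELECT PersonID FROM dbo.People WHERE ssn LIKE '%0000';

    -- Reverse the search term too, and the wildcard moves to the end, so the index can be seeked.
    SELECT PersonID FROM dbo.People WHERE ssn_reversed LIKE REVERSE('0000') + '%';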

Using Powershell To Deploy Perfmon Collectors

Raul Gonzalez has a bonus post in his Perfmon data series:

As I said, when it’s time to deploy the solution explained in my previous posts to a number of servers, it might get very tedious, especially if we have servers running multiple instances, since each has different counter names because the instance name is part of that name; if we create one template, it won’t apply to all cases, so there would be a lot of manual intervention.

So I decided to do what I like the most and wrote some queries that, combined with some PowerShell, will do the job for me.

Read on for the script and more information.
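
As a small illustration of the naming problem (not Raul's actual script): the perfmon object prefix differs between default and named instances, so counter paths have to be generated per instance, along these lines:

    -- Default instance counters look like "\SQLServer:Buffer Manager\...",
    -- named instances like "\MSSQL$InstanceName:Buffer Manager\...".
    SELECT
        CASE WHEN SERVERPROPERTY('InstanceName') IS NULL
             THEN N'\SQLServer:Buffer Manager\Page life expectancy'
             ELSE N'\MSSQL$' + CAST(SERVERPROPERTY('InstanceName') AS nvarchar(128))
                  + N':Buffer Manager\Page life expectancy'
        END AS counter_path;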
