Press "Enter" to skip to content

Category: Query Tuning

When a Non-Clustered Index on Clustered Columns Makes Sense

Allen White gives us a scenario where adding a non-clustered index which is the same column as the clustered index can make sense:

Recently I was asked about adding a non-clustered index to a table (let’s call it Images) with just one column. It had been added in the development database and it improved performance dramatically. I looked at it and it had the same key as the clustered index on that table.

In reviewing the query I saw that Images was joined to the other tables in the query, but none of the columns were used, so Images was joined to ensure that values from the other tables had rows in Images. The query plan shows a significantly higher number of reads against Images without the new NCI (non-clustered index) than when it’s present.

I do agree that this can help—as we obviously see. The backseat query tuner in me wonders if maybe there’s another way to write the query to prevent the scan by using CROSS APPLY, but that’d only help if they were getting a small percentage of rows from the parent table expression built from the combination of the clustered index scan and index seek in the second example.

Comments closed

The Peril of Local Variables

Erik Darling dives into the tradeoffs you make when using local variables in stored procedures to avoid parameter sniffing:

In a stored procedure (and even in ad hoc queries or within dynamic SQL, like in the examples linked above), if you declare a variable within that code block and use it as a predicate later, you will get either a fixed guess for cardinality, or a less-confidence-inspiring guess than when the histogram is used.

The local variable effect discussed in the rest of this post produces the same behavior as the OPTIMIZE FOR UNKNOWN hint, or executing queries with sp_prepare. I have that emphasized here because I don’t want to keep qualifying it throughout the post.

This deserves a careful read-through.

Comments closed

Figuring out How a Plan was Forced

Erin Stellato wants to know whether a DBA forced a plan or SQL Server did automatically:

If you use Automatic Plan Correction, and thus also Query Store, you may wonder how was a plan forced: manually or automatically with APC?  The type of forced plan is tracked in sys.query_store_plan, and you can use this simple query to determine how a plan was forced:

Click through for a simple query, as well as a more complex form which gives you a bit more info.

Comments closed

Improving Queries at the Margin

Jared Poche has a story about improving a query which is already pretty fast:

In my last post, I spoke about optimizing a procedure that was being executed hundreds of millions of times per day, and yes, that is expected behavior.

The difficult thing about trying to optimize this procedure is that it only takes 2.5ms on average to run. Tuning this isn’t a matter of changing a scan to a seek; we’ll have to look hard to find the opportunities here. A one millisecond Improvement on a procedure running 100 million times a day would save 100,000 seconds every day.

Well, I’ve found a few more options since my last post, and wanted to share my findings.

Read on to see how Jared tries to tackle one specific case.

Comments closed

Tuning a Query Searching for a Substring in Text

Eddy Djaja gives us two methods for improving performance of a search for a fixed substring:

The reason substring function is used because the column ACCOUNTDISPLYVALUE has multiple values combined in one column. In this case, the query is searching for the Account Number which is the first six characters. The long running query is listed below:

set statistics io on
go
select sum(ACCOUNTINGCURRENCYAMOUNT)from [d365].[GeneralJournalAccountMultiCompanyEntries]where substring([ACCOUNTDISPLAYVALUE], 1, 6)  = '877601'

Eddy gives us two solutions. As a quick note, these solutions work because the query is looking for a specific stretch of characters after a specific starting point. For arbitrary text, things get a little trickier.

Comments closed

Quickly Finding Distinct Values with T-SQL

Paul White does some amazing things with T-SQL, news at 11:

I will be looking for distinct values in the BountyAmount column of the dbo.Votes table, presented in bounty amount order ascending. The Votes table has just under 53 million rows (52,928,720 to be exact). There are just 19 different bounty amounts, including NULL.

The Stack Overflow 2013 database comes without nonclustered indexes to minimize download time. There is a clustered primary key index on the Id column of the dbo.Votes table. It comes set to SQL Server 2008 compatibility (level 100), but we will start with a more modern setting of SQL Server 2017 (level 140):

Getting the query down from 10.5 seconds to 1ms is crazy.

Comments closed

Tracking Performance of Queries which use RECOMPILE Hints

Brent Ozar has some tips if you use RECOMPILE hints frequently:

The first query’s plan stuck around in memory, so it now shows 2 executions, and 2 total rows returned. Its row metrics are correct through the life of the stored procedure’s time in cache.

However, the second query – the one with the recompile hint – has a brand new plan in the cache, but also new metrics. You’re not just recompiling the execution plan, but you’re also not getting query plan metrics here. (That’s fine, and that part I was also kinda aware of.)

But the part that I keep forgetting is that when I’m looking at the stored procedure’s totals in sp_BlitzCache, the total, min, and max values are useless:

If the plan cache isn’t going to help, what will? Brent tells you exactly what.

Comments closed

ASYNC_NETWORK_IO and Execution Plans

Jonathan Kehayias dives into an interesting problem:

A few weeks ago, an interesting question was asked on the #SQLHelp hash tag on Twitter about the impact of execution plans on the ASYNC_NETWORK_IO wait type, and it generated some differing opinions and a lot of good discussion.

My immediate answer to this would be that someone is misinterpreting the cause and effect of this, since the ASYNC_NETWORK_IO wait type is encountered when the Engine has results to send over TDS to the client but there are no available TDS buffers on the connection to send them on. Generally speaking, this means that the client side is not consuming the results efficiently, but based on the ensuing discussion I became intrigued enough to do some testing of whether or not an execution plan would actually impact the ASYNC_NETWORK_IO waits significantly.

To summarize: Focusing on ASYNC_NETWORK_IO waits alone as a tuning metric is a mistake. The faster a query executes, the higher this wait type will likely accumulate, even if the client is consuming results as fast as possible. (Also see Greg’s recent post about focusing on waits alone in general.)

Click through for the things Jonathan tested.

Comments closed

Approximate Distinct Count with DAX

Gilbert Quevauvilliers runs some performance tests against the approximate distinct count formula in DAX:

I am currently running SQL Server Analysis Services (SSAS) 2019 Enterprise Edition. (This can also be applied to Power BI)

My Fact table has got roughly 950 Million rows stored in

And as mentioned previously it has got over 64 Million distinct users.

The data is queried from SQL Server into SSAS.

Gilbert first checks how close these are and then how much faster the approximate count is.

Comments closed