This optimisation is really silly, but hey, why not. If users write impossible predicates, then why even execute them? Here are some examples:-- "Obvious" SELECT * FROM actor WHERE 1 = 0 -- "Subtle" SELECT * FROM actor WHERE NULL = NULL
The first query should obviously never return any results, but the same is true for the second one, because while
NULL IS NULLyields
NULL = NULLevaluates to
NULL, which has the same effect as
FALSEaccording to three-valued logic.
This doesn’t need much explanation, so let’s immediately jump to see which databases optimise this:
I was a bit surprised at how well DB2 did in this set.
To access QPI, you simply need to click on the database you want to work with. Once you click on your database, scroll down in the portal to Query Performance Insight(QPI). Once QPI opens, you will see three options to sort on: CPU, DATA I/O, and LOG I/O. You can also set the timeframe to view, I set for 24 hours. Now, I have my timeline of 24 hours, and I am able to view which queries had the highest DATA I/O. I made a list of the top 3 from each category(CPU, DATA I/O, and LOG I/O) and presented it to my client. I presented the number of times it was executed, and the usage it utilized each time(all from the QPI information). The client then sent me 10 queries they wanted tuned and listed them in a prioritized list.
Well, by the end of tuning their 3 highest priority queries, we removed over 10 billion logical reads! Yep, 10 BILLION! The client was very happy with our results and is currently awaiting the preview Standard Elastic Pools to come out of Preview and become GA. I have provided a few screenshots of an AdventureWorksLT database on my personal instance just to show you how to access QPI, and change metrics.
Click through for a demo.
You look at the numbers again, and now you find that disk latency, which had previously been fine, is now completely in the tank during the business day, showing that I/O delays are through the roof.What happened?This demonstrates the concept of shifting bottleneck – while CPU use was through the roof, the engine so bogged down that it couldn’t generate that much I/O, but once the CPU issue was resolved queries started moving through more quickly until the next choke point was met at the I/O limit. Odds are once you resolve the I/O situation, you would find a new bottleneck.How do you ever defeat a bad guy that constantly moves around and frequently changes form?
Click through for some pointers on disk latency and trying to figure out when it becomes a problem.
Note that Query 1 uses a function around the column and this causes a scan which increases IO and CPU utilization. This is because every value in the column for the whole table must have the hyphen removed to see if it is a match.
In Query 2, we see a seek. This is because the value is modified instead of the column.
Click through for a few examples.
So where are we? Well, we found that Rowgroup Elimination is possible in stored procedures with ColumnStore indexes, but that the cached plan doesn’t change based on feedback from that elimination.
- Good news: elimination can occur with variables passed in.
- Bad news: that cached plan sticks with you like belly fat at a desk job
Remember our plan? It used a Stream Aggregate to process the
MAX. Stream Aggregates are preferred for small, and/or ordered sets.
Great post, Brent.
Lots of people are confused by how to deal with bad parameter sniffing when it occurs. In an effort to help with this, I’m going to try to make a decision flow chart to walk you through the process. This is a rough, quite rough, first draft.
I would love to hear any input. For this draft, I won’t address the things I think I’ve left out. I want to see what you think of the decision flow and what you think might need to be included. Click on it to embiggen.
I think it’s a great first step. I think a decision to add local variables and use them instead of parameters would be useful, particularly in contrast to using RECOMPILE and OPTIMIZE FOR UNKNOWN.
Coming in at number five and looking alive! Did you know that query plans can be different on busy servers? I bet not! And aside from that, your performance problem might not even be the query itself, it may be blocking, or a poison wait. This stuff may not show up in Dev, unless yous spend a lot of time and money engineering load tests.
This is what safety groups call “Situational Awareness”, and this is the kind of stuff that you really want a monitoring tool in place for. Sure, that query ran slowly, but if that’s all you know, and you can’t reproduce it, then you need to start digging deeper.
There are a number of tips here, and that number is five.
Back in 2007, we released the Microsoft SQL Server 2005 Performance Dashboard Reports, which were designed to provide fast insight into performance issues from some newly created system views – DMFs (Dynamic Management Views). These were updated for SQL Server 2008 and later to SQL Server 2012, and while being very helpful they had a significant drawback – required separate download and install. This meant that when needed, most probably they were not installed in a specific SQL Server, and therefore were unusable when they were needed the most.
With the new SSMS 17.2, we are releasing the Performance Dashboard embedded as a built-in Standard Report. This means that it is available for any SQL Server instance starting with SQL Server 2008, without any extra downloads or running any extra scripts. Just connect to your server in SSMS and open the Performance Dashboard.
Aside from making it built into Management Studio, they’ve also added a few helpful things to the product, so it is worth checking out.
Why? Because SQL is performing that implicit conversion to the numeric datatype for every single row in my table. Hence, it can’t seek using the index because it ends up having to scan the whole table to convert every record to a number first.
And this doesn’t only happen with numbers and string conversion. Microsoft has posted an entire chart detailing what types of data type comparisons will force an implicit conversion:
This is one of those things that can easily elude you because the query will often return results in line with what you expect, so until you have a performance problem, you might not even think to check.
There are two types of Collations in SQL Server: SQL Server Collations and Windows Collations:
SQL Server Collations (those starting with SQL_) are older Collations that were the only ones available prior to SQL Server 2000. These Collations use simplistic sort orders, and do not handle the great variety of linguistic rules defined by Unicode. In fact, they do not have any Unicode rules defined at all, so N[VAR]CHAR data in these Collations will actually use OS-level Collation rules.
Windows Collations (those not starting with SQL_) were introduced in SQL Server 2000. These Collations not only have the Unicode rules defined, but they also apply those same linguistic rules to [VAR]CHAR data. While this does come at a slight cost to performance, it also allows for consistency of behavior. And it is this consistency that helps out greatly when there is a mismatch of datatypes (as we will see in a moment).
The simple advice to avoid mixing NVARCHAR and VARCHAR data types is still sound, but do read the whole thing.