Making plan choices with IF branches like this plain doesn’t work.
The optimizer compiles a plan for both branches based on the initial compile value.
What you end up with is a stored proc that doesn’t do what it’s supposed to do, and parameter sniffing times two.
Read on to see an example of this. If you really, really want to use an IF block, you could separate the components out into individual stored procedures and call those stored procedures independently.
Figuring out exactly what causes slow performance for a stored procedure can sometimes feel like trying to unravel a ball of Clark Griswold’s Christmas lights. It’s not uncommon to see procedures with hundreds, even thousands of lines of code. You may have been told which stored procedure runs slow by a user or manager, or you might have found it by looking in SQL Server DMVs. Either way, once you have detected the offending procedure, where do you start?
If you’re running SQL Server 2016, one option is Query Store. Query Store captures individual queries, but it also captures the object_id, so you can find all the queries that are associated with an object to determine which ones are problematic.
This is quite useful when you have to tune a procedure you’ve never seen before, and as you go to open that procedure, the vertical scroll bar keeps getting smaller and smaller.
Recently Manish Kumar asked an interesting question, what do you do if your proc accesses multiple or even all the databases on the server?
So, instead of giving him a fuzzy answer in reply, I thought I’d write up exactly how you can deal with that sort of situation.
We’ve got two options and we’ll have a look at both of them (I’m not going to go into details about how signing procs works, please see the post mentioned earlier for a basic overview, here I’m going to look specifically at procs that access multiple databases).
Click through to see both solutions.
I then set up Extended Events to capture the query metrics and I executed each of the queries multiple times (also, just for the test, I discarded the results because I didn’t want that process mucking with my measurements). After executing both procedures 500 times, the results were quite simple. The average execution time with an explicit drop was 8,672 microseconds. Meanwhile, the average for not dropping the temporary table was 8,530 microseconds. That’s about a 1% difference across hundreds of executions. Reads were identical and so were writes.
In short, the behavior is the same.
What about the impact on the system? Could I see changes in memory or I/O as these different processes ran?
Grant didn’t notice any difference but check Allen White and Jay Robinson’s answers in the comments. Temp table reuse can happen (if you follow the rules) and can make a difference when a procedure is called frequently enough.
Table-valued parameters were introduced in SQL Server 2008. Before that, there were limited options to pass tabular data to stored procedures. Most developers used one of the following methods:
Data in multiple columns and rows was represented in the form of a series of parameters. However, the maximum number of parameters that can be passed to a SQL Server stored procedure is 2,100. Therefore, in the case of a large table, this method could not be used. Furthermore preprocessing is required on the server side in order to format the individual parameters into a tabular form.
Create multiple SQL statements that can affect multiple rows, such as UPDATE. The statements can be sent to the server individually or in the batched form. Even if they are sent in the batched form, the statements are executed individually on the server.
Another way is to use delimited strings or XML documents to bundle data from multiple rows and columns and then pass these text values to parameterized SQL statements or stored procedures. The drawback of this approach was that you needed to validate the data structure in order to unbundle the values.
The .NET framework then makes it easy to pass in an IEnumerable as a table-valued parameter.
Tip #1: Format Your Code
This one alone is great. Erik has several other tips as well.
We store the results in a temporary table first.
Don’t worry, that’s not the end of the post. That’s not even the point of this post. It is, however, what I was doing when I came across a weird error.
Let me show you!
INSERT EXEC failed because the stored procedure altered the schema of the target table.
He wasn’t really trying to alter the schema of that temp table, but read on to see the context and solution.
So where are we? Well, we found that Rowgroup Elimination is possible in stored procedures with ColumnStore indexes, but that the cached plan doesn’t change based on feedback from that elimination.
- Good news: elimination can occur with variables passed in.
- Bad news: that cached plan sticks with you like belly fat at a desk job
Remember our plan? It used a Stream Aggregate to process the
MAX. Stream Aggregates are preferred for small, and/or ordered sets.
Great post, Brent.
Startup procedures automatically execute whenever SQL Server is started. Where would you use this? One example is if you had an application that required the use of a global temporary table, you could use a startup procedure to create it. This would allow the table to be immediately accessible to any application that requires it.
Another possibility is that you could “warm up” the cache so that data is already cached when an application needs it. This would help prevent that initial I/O hit the first time the query is called, thus potentially improving initial performance.
Click through for details. I’ve created a couple of these, but they rarely come to mind and that’s one of the big problems: if there is an issue, there’s probably quite a bit of investigative work that would take place before somebody remembers that hey, these things exist.
But what if the caller wanted the date to be “empty” (i.e. 1900-01-01)? And what if a NULL is passed?
In our environment, we’ve disallowed NULLs from our table fields. We understand that NULL is actually information– it says that the data is unknown– but we believe that for most data fields, there are non-NULL values which just as effectively represent unknown. Typically, 0’s and empty strings (and the “blank” date 1900-01-01) serve that purpose. And those values are more forgiving during coding (they equal things; they don’t make everything else “unknown”), and we accept the risk of paying little attention to which parts of our logic touched “unknown” values.
It’s an interesting look at dealing with optional and default parameters within procedures.