Stored Procedure IF Branching and Performance

Erik Darling explains that the IF block in a stored procedure won’t help you with performance:

Making plan choices with IF branches like this plain doesn’t work.
The optimizer compiles a plan for both branches based on the initial compile value.
What you end up with is a stored proc that doesn’t do what it’s supposed to do, and parameter sniffing times two.

Read on to see an example of this. If you really, really want to use an IF block, you could separate the components out into individual stored procedures and call those stored procedures independently.
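
Here is a minimal sketch of that split. It assumes a hypothetical dbo.Orders table and hypothetical procedure names, none of which come from Erik's post, and it uses CREATE OR ALTER, which needs SQL Server 2016 SP1 or later. The idea is that each child procedure gets its own plan, compiled the first time that child actually runs, instead of being compiled up front with whatever values the parent first saw.

```sql
-- Hypothetical objects for illustration only; not taken from the linked post.
CREATE OR ALTER PROCEDURE dbo.GetOrders_ByCustomer
    @CustomerID int
AS
BEGIN
    SET NOCOUNT ON;

    SELECT o.OrderID, o.OrderDate
    FROM dbo.Orders AS o
    WHERE o.CustomerID = @CustomerID;
END;
GO

CREATE OR ALTER PROCEDURE dbo.GetOrders_ByDate
    @OrderDate date
AS
BEGIN
    SET NOCOUNT ON;

    SELECT o.OrderID, o.CustomerID
    FROM dbo.Orders AS o
    WHERE o.OrderDate = @OrderDate;
END;
GO

-- The wrapper only routes; the real queries live in the child procedures.
CREATE OR ALTER PROCEDURE dbo.GetOrders
    @CustomerID int = NULL,
    @OrderDate date = NULL
AS
BEGIN
    SET NOCOUNT ON;

    IF @CustomerID IS NOT NULL
        EXEC dbo.GetOrders_ByCustomer @CustomerID = @CustomerID;
    ELSE
        EXEC dbo.GetOrders_ByDate @OrderDate = @OrderDate;
END;
GO
```

Each child can still suffer from ordinary parameter sniffing, but at least it is sniffing a value that was really meant for it.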

Digging Into DBCC CHECKIDENT

Solomon Rutzky covers the four primary scenarios when running DBCC CHECKIDENT and specifying a new reseed value:

So, when specifying a “new_reseed_value”, the possible scenarios covered are:
1. Rows exist
2. No rows due to none inserted since the table was created
3. No rows due to TRUNCATE TABLE operation


What’s missing? The following scenario:
No rows due to DELETE operation!!

Click through to see how DBCC CHECKIDENT behaves differently depending upon the scenario.
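
If you want to try the DELETE and TRUNCATE TABLE scenarios yourself, something like the following would do it. This is a rough sketch against a hypothetical throwaway table, not Solomon's test script, and I have deliberately left out expected values so you can compare what you see against his write-up.

```sql
-- Hypothetical throwaway table for illustration only.
CREATE TABLE dbo.ReseedDemo
(
    ID  int IDENTITY(1, 1) NOT NULL,
    Val char(1) NOT NULL
);

-- Scenario: no rows due to a DELETE operation.
INSERT INTO dbo.ReseedDemo (Val) VALUES ('a'), ('b'), ('c');
DELETE FROM dbo.ReseedDemo;

DBCC CHECKIDENT (N'dbo.ReseedDemo', RESEED, 100);
INSERT INTO dbo.ReseedDemo (Val) VALUES ('d');
SELECT ID AS IdAfterDelete FROM dbo.ReseedDemo;

-- Scenario: no rows due to a TRUNCATE TABLE operation.
TRUNCATE TABLE dbo.ReseedDemo;

DBCC CHECKIDENT (N'dbo.ReseedDemo', RESEED, 100);
INSERT INTO dbo.ReseedDemo (Val) VALUES ('e');
SELECT ID AS IdAfterTruncate FROM dbo.ReseedDemo;

DROP TABLE dbo.ReseedDemo;
```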

Using SWITCHOFFSET

Doug Kline has a video and T-SQL script around date/time offsets and particularly the SWITCHOFFSET function:

-- so, before SWITCHOFFSET existed, …

SELECT SWITCHOFFSET(SYSDATETIMEOFFSET(), '-05:00') AS [EST the easy way],
       TODATETIMEOFFSET(DATEADD(HOUR, -5, SYSDATETIMEOFFSET()), '-05:00') AS [EST the hard way]

-- so, thinking of a DATETIMEOFFSET data type as a complex object
-- with many different parts: year, month, day, hour, time zone, etc.
-- it looks like SWITCHOFFSET changes two things: time zone and hour

This was an interesting video. I typically think entirely in UTC and let the calling application convert to time zones as needed, but if that’s not an option for you, knowing about SWITCHOFFSET() is valuable.

Finding Current Session Options In SQL Server

Kevin Feasel

2019-01-29

T-SQL

Max Vernon has a helpful script to tell you if, for example, ANSI_NULLS is on:

SQL Server can be configured to provide certain behavior to client sessions, via the SET OPTIONS= command, or via the sys.sp_configure 'user options' system stored procedure. The SET OPTIONS= command only affects the current session, whereas the sys.sp_configure 'user options' system stored procedure configures the default values for all future user sessions.

Click through for the script, which even contains a quick description of each option.
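
For a quick check of just a few options without the full script, you can test individual bits of @@OPTIONS. This is a small sketch rather than Max's script; the bit values (32 for ANSI_NULLS, 256 for QUOTED_IDENTIFIER, 512 for NOCOUNT) are the documented 'user options' values.

```sql
-- Test individual bits of the current session's option bitmask.
SELECT
    @@OPTIONS AS OptionsBitmask,
    CASE WHEN (@@OPTIONS & 32)  = 32  THEN 'ON' ELSE 'OFF' END AS AnsiNulls,
    CASE WHEN (@@OPTIONS & 256) = 256 THEN 'ON' ELSE 'OFF' END AS QuotedIdentifier,
    CASE WHEN (@@OPTIONS & 512) = 512 THEN 'ON' ELSE 'OFF' END AS NoCount;
```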

SESSION_CONTEXT Is Case-Sensitive

Kevin Feasel

2019-01-28

T-SQL

Steve Jones notes an issue people may have when trying out the SESSION_CONTEXT() function in SQL Server:

Notice a difference? In the first query, I have SupplierID, but the second is SupplierId, with a lower case “d”. These keys are determined when you use sp_set_session_context, which takes a sysname value for the key. These are going to be case sensitive, as each one is a different identifier.

Click through for the full story. I’m not a fan of case-sensitivity in general and especially not in a system where, by default, most things are case-insensitive.
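
A small sketch of the situation Steve describes, using a made-up value of 42 (SESSION_CONTEXT needs SQL Server 2016 or later). If the keys behave as he reports, the second column should come back NULL because no key with that exact casing was ever set.

```sql
-- Set one key, then read it back with two different casings.
EXEC sys.sp_set_session_context @key = N'SupplierID', @value = 42;

SELECT
    SESSION_CONTEXT(N'SupplierID') AS ExactCaseKey,
    SESSION_CONTEXT(N'SupplierId') AS DifferentCaseKey;
```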

Calculating Median In SQL Server 2019

Tomaz Kastrun shows that batch aggregation mode on window functions finally allows PERCENTILE_CONT to become useful:

Next query, for median calculation was a window function query.


SELECT DISTINCT
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY c1)
        OVER (PARTITION BY (SELECT 1)) AS MedianCont
FROM t1


To my surprise, the performance was even worse, and at this time, I have to say, I was running this on SQL Server 2017 with CU7. But luckily, I had a SQL Server 2019 CTP 2.0 also installed and here, with no further optimization the query ran little over 1 second.

I’ve warned people away from this function for all but tiny data sets because of how poorly it performs. With SQL Server 2019, I might be able to recommend it.

Generating SQL With Biml

Cathrine Wilhelmsen shows us you can do a lot more with Biml than just generating SSIS packages:

This actually happened to me in a previous job. We had a fairly complex ETL solution for the most critical part of our Data Warehouse. Many SSIS packages, views, and stored procedures queried the tables that were replicas of the source system tables. One day, we found out that the source system would be deploying a new version of their database the following day. In every single table, some columns were removed, others added, and many changed data types.
Oh.
There was no way that we could manually update all our SSIS packages, views, and stored procedures in less than a day. Thousands of users depended on our solution. It was too late to pause the source system changes.
Oh.

That story has a happy ending.

Rowcount Shenanigans When Deleting In Batches

Kevin Feasel

2019-01-09

T-SQL

Denis Gobo takes us through a few issues you might run into when deleting data in batches:

I have always used WHILE @@rowcount > 0 but you have to be careful because @@rowcount could be 0 when your while loop starts

Let’s take a look at an example. This is a simplified example without a where clause..but let’s say you have to delete several million rows from a table with many more millions of rows and the table is replicated… in that case you want to batch the deletes so that your log file doesn’t fill up, replication has a chance to catch up and in general the deletes should run faster

Click through for a couple of issues you might run into other than the obvious one of “I’m scanning the entire table with every delete” if you don’t have indexing set up right.
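
One way to sidestep the "@@rowcount could be 0 when your while loop starts" problem is to drive the loop from your own variable. This is a minimal sketch of that idea, not Denis's code; dbo.BigTable, the CreatedDate filter, and the batch size are all hypothetical.

```sql
DECLARE @BatchSize int = 5000;
DECLARE @RowsDeleted int = 1;   -- seeded non-zero so the loop always runs at least once

WHILE @RowsDeleted > 0
BEGIN
    -- Hypothetical table and filter; index the filter column to avoid repeated scans.
    DELETE TOP (@BatchSize)
    FROM dbo.BigTable
    WHERE CreatedDate < '20180101';

    -- Capture the count immediately, because later statements overwrite @@ROWCOUNT.
    SET @RowsDeleted = @@ROWCOUNT;
END;
```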

Finding Max Concurrent Operations With T-SQL

Kevin Feasel

2019-01-04

T-SQL

I have a post up showing how to calculate the maximum number of concurrent operations using T-SQL:

You can probably see by this point how the pieces are coming together:  each time frame has a starting point and an ending point.  If there were no overlap at all, we’d see in the fourth column a number followed by a NULL, followed by a number followed by a NULL, etc.  But we clearly don’t see that:  we see work item ordinals 3 and 4 share some overlap:  item 3 started at 3:06:15 PM and ended after item 4’s start of 3:07:20 PM.  This means that those two overlapped to some extent.  Then we see two NULL values, which means they both ended before 5 began.  So far so good for our developers!

Click through for a bunch of T-SQL scripts, images, and important advice about always having interns around to take the blame.
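
For comparison, here is a different and more compact technique than the one the post walks through: turn each start into a +1 event and each end into a -1 event, then take a running sum. The sample rows are made up for illustration, borrowing the two start times mentioned above.

```sql
-- Made-up sample data: items 2 and 3 overlap, item 1 stands alone.
DECLARE @WorkItems TABLE (ItemID int, StartTime datetime2(0), EndTime datetime2(0));

INSERT INTO @WorkItems VALUES
    (1, '2019-01-04 15:00:00', '2019-01-04 15:05:00'),
    (2, '2019-01-04 15:06:15', '2019-01-04 15:08:00'),
    (3, '2019-01-04 15:07:20', '2019-01-04 15:09:00');

WITH Events AS
(
    SELECT StartTime AS EventTime, 1 AS Delta FROM @WorkItems
    UNION ALL
    SELECT EndTime, -1 FROM @WorkItems
),
Running AS
(
    -- Ordering by Delta as a tiebreaker processes ends before starts at the same
    -- instant, so back-to-back items are not counted as overlapping.
    SELECT EventTime,
           SUM(Delta) OVER (ORDER BY EventTime, Delta
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Concurrent
    FROM Events
)
SELECT MAX(Concurrent) AS MaxConcurrent
FROM Running;
```

With these three rows the query returns 2, because items 2 and 3 are both running between 3:07:20 PM and 3:08:00 PM.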

Building Test Data Following A Normal Distribution In T-SQL

I (finally) have a technical blog post:

In order to show you the solution, I want to build up a reasonable sized sample.  Any solution looks great when reading five records, but let’s kick that up a notch.  Or, more specifically, a million notches:  I’m going to use a CTE tally table and load 5 million rows.
I want some realistic looking data, so I’ve adapted Dallas Snider’s strategy to build a data set which approximates a normal distribution.
Because this is a little complicated, I wanted to take the time and explain the data load process in detail in its own post, and then apply it in the follow-up post.  We’ll start with a relatively small number of records for this demonstration:  50,000.  The reason is that you can generate 50K records almost instantaneously but once you start getting a couple orders of magnitude larger, things slow down some.

If you do custom data generation for lower environments, I’d recommend checking this out. Your production data probably doesn’t follow a normal distribution exactly, but a normal distribution is probably closer to reality than the uniform distribution you get with functions like RAND().
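
As a rough illustration of the general idea, here is a sketch that generates 50,000 approximately normal values with the Box-Muller transform. This is not necessarily the approach the post takes; the mean of 100, standard deviation of 15, and the sys.all_columns cross join used as a row source are all my assumptions.

```sql
DECLARE @Mean float = 100.0, @StdDev float = 15.0;   -- assumed parameters

WITH Tally AS
(
    -- Row source: any sufficiently large set works; sys.all_columns is just convenient.
    SELECT TOP (50000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
    FROM sys.all_columns AS a
    CROSS JOIN sys.all_columns AS b
),
Uniforms AS
(
    -- RAND() alone is evaluated once per query, so seed it per row with NEWID().
    -- 1.0 - RAND(...) keeps U1 strictly above zero so LOG(U1) is always defined.
    SELECT N,
           1.0 - RAND(CHECKSUM(NEWID())) AS U1,
           RAND(CHECKSUM(NEWID())) AS U2
    FROM Tally
)
SELECT N,
       @Mean + @StdDev * SQRT(-2.0 * LOG(U1)) * COS(2.0 * PI() * U2) AS Value
FROM Uniforms;
```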
