Testing Scalar UDF Performance In SQL Server 2019 CTP 2.1

Brian Hansen takes a look at the scalar UDF performance improvements in SQL Server 2019:

In one of my sessions, Set Me Up: How to Think in Sets, I discuss a variety of performance-inhibiting query constructs, including scalar UDFs. I thought it would be interesting to take the simple scalar function that I use in the demo and see what kind of difference that scalar inlining might make.

First, I restored the CorpDB database that I use in the session to my SQL Server 2019 CTP 2.1 instance and initially set the compatibility level to 140. I also ran script 001 from the demo to create the needed database tables (no need to create the CLR objects for this test). I then ran script 030 to execute the scalar UDF test. In a nutshell, this script

  • creates a UDF

  • runs a query that calls the UDF about 13,000 times, capturing the time required to do so

  • repeated this test five times

  • discards the fastest and slowest tests

  • reports the average time for the remaining three tests

If I’m reading Brian’s notes right, it’s still slower than writing the set-based solution yourself, but a huge improvement over the prior scalar function performance.

Adding Constraints In The CREATE TABLE Statement

Steve Jones shows how you can add constraints in your CREATE TABLE statement:

A good habit to get into is to explicitly name your constraints. I try to do this when I create tables to be sure that a) I have a PK and b) it’s named the same for all environments.

I can create a PK inline, with a simple table like this:

CREATE TABLE Batting
   (
        BattingKey INT NOT NULL CONSTRAINT BattingPK PRIMARY KEY
        , PlayerID INT
        , BattingDate DATETIME
        , AB TINYINT
        , H TINYINT
        , HR tinyint
   )
;

This gives a primary key, named “BattingPK, that I can easily see inline with the column.

Steve also gives an alternative formulation which works well for composite keys.  You can additionally add constraints after the create statement, but if you are creating temp tables and want to take advantage of temp table reuse, constraints have to be created as part of the table (and cannot have names).  For additional fun, since SQL Server 2014, you can create indexes as part of the CREATE TABLE statement as well—that was needed to create memory-optimized tables as back in that edition, you couldn’t add new indexes after the fact.

A Case When CASE Isn’t The Right Case

Adrian Buckman notes differences in the two ways of using CASE statements:

It looks so clean compared to the first example! but it wasn’t until I tested the second method out that I realised that the behaviour of the two CASE expressions are different as outlined on books online

The CASE expression has two formats:
The simple CASE expression compares an expression to a set of simple expressions to determine the result.
The searched CASE expression evaluates a set of Boolean expressions to determine the result.
Both formats support an optional ELSE argument.

I put together some examples to illustrate the difference when evaluating Null using the two Case expressions, the query returns the column ‘Databasename’ from the derived list values clause, example 1 has a Null value and example 2 has a value of ‘SQLUndercover’ which you will see below:

Adrian looks into a scenario in which the two CASE expressions return different results, and digs into execution plans to find out why.

DATEDIFF

Randolph West continues a series on covering dates and times, looking at DATEDIFF and DATEDIFF_BIG:

The only functional difference between them is that the DATEDIFF_BIG() returns values as a BIGINT, for results that exceed the boundary of an INT. Keep this in mind when deciding which one to use. For example, the maximum number of seconds an INT can hold is 68 years, while a BIGINT can comfortably store the number of seconds in 10,000 years. This becomes especially important when dealing with microseconds and nanoseconds.

The rest of the post will use DATEDIFF() to refer to both functions.

I think this might be the first time I’d read about DATEDIFF_BIG()and I’m not aware of ever having used it.  But hey, it could make sense if you need to track more than 2 billion microseconds.

Casting Constants And POWER()

Steve Jones walks us through a case of an unexpected error:

I ran into an interesting problem while working with the POWER() function. I was trying to do some binary conversions and had a statement like this to process powers of 2.

SELECT POWER(2, n)

This was designed to take a value and return a power of 2. I then used a different value to determine if this was added to my conversion factor or not. In trying to work with some larger numbers, I ran into this error:

Msg 232, Level 16, State 3, Line 3
Arithmetic overflow error for type int, value = 2147483648.000000.

Click through for a description of the problem as well as Steve’s solution.

Converting Binary To Hex With T-SQL

Kevin Feasel

2018-11-15

T-SQL

Dave Mason uses STRING_SPLIT to convert binary values to their hex equivalents:

I started pondering it for a bit and began to wonder if I could use the new for SQL Server 2016 STRING_SPLIT function to convert a binary string to decimal. The thought process was to split the string into rows of CHAR(1) values, along with an in-string character position. Then I could take the “1” values, calculate the 2ᵡ value, and sum them up for a decimal value.

I quickly realized I’d need to reverse the binary string. I also discovered the STRING_SPLIT function eliminated 0’s adjacent to 1’s. I don’t have much use for this code at the moment, but maybe there’s something here that you can take away.

Given some of the issues Dave ran into, it seems that a tally table-based solution might be a better choice.

Faster Scalar Functions In SQL Server 2019

Brent Ozar looks at improvements the SQL Server team has made to scalar functions in 2019:

My database has to be in 2019 compat mode to enable Froid, the function-inlining magic. Run the same query again, and the metrics are wildly different:

  • Runtime: 4 seconds

  • CPU time: 4 seconds

  • Logical reads: 3,247,991 (which still sounds bad, but bear with me)

My bias tells me that I still want to avoid scalar functions, but it’s no longer the automatic deal-killer it once was.

Uncovering Complexity In SQL Objects

Kevin Feasel

2018-10-29

T-SQL

Michael J. Swart helps us uncover hidden complexity in database objects:

The other day, Erin Stellato asked a question on twitter about the value of nested SPs. Here’s how I weighed in:

Hidden complexity has given me many problems in the past. SQL Server really really likes things simple and so it’s nice to be able to uncover that complexity. Andy Yun has tackled this problem for nested views with his sp_helpexpandview.

Click through for a script which helps.

Using Table-Valued Parameters In SQL Server

Ben Richardson has a post showing how to create user-defined table types and pass them into stored procedures:

Table-valued parameters were introduced in SQL Server 2008. Before that, there were limited options to pass tabular data to stored procedures. Most developers used one of the following methods:

  1. Data in multiple columns and rows was represented in the form of a series of parameters. However, the maximum number of parameters that can be passed to a SQL Server stored procedure is 2,100. Therefore, in the case of a large table, this method could not be used. Furthermore preprocessing is required on the server side in order to format the individual parameters into a tabular form.

  2. Create multiple SQL statements that can affect multiple rows, such as UPDATE. The statements can be sent to the server individually or in the batched form. Even if they are sent in the batched form, the statements are executed individually on the server.

  3. Another way is to use delimited strings or XML documents to bundle data from multiple rows and columns and then pass these text values to parameterized SQL statements or stored procedures. The drawback of this approach was that you needed to validate the data structure in order to unbundle the values.

The .NET framework then makes it easy to pass in an IEnumerable as a table-valued parameter.

Understanding ANY And ALL In SQL

Doug Kline explains the ANY and ALL operators in SQL:

-- note that this creates a single column of values
-- which could be used in something like IN
-- for example
SELECT 1
WHERE 12 IN ( SELECT tempField FROM (VALUES(11),(12),(7)) tempTable(tempField))
-- I could rephrase this as:
SELECT 1
WHERE 12 = ANY ( SELECT tempField FROM (VALUES(11),(12),(7)) tempTable(tempField))

I rarely see these operators in the wild and might have used them in production code a couple of times if that.

Categories

February 2019
MTWTFSS
« Jan  
 123
45678910
11121314151617
18192021222324
25262728