There are times when I am working on a SQL script that really shouldn’t be executed all at once. Sometimes it’s a series of examples / demos for a presentation or forum answer. Other times it’s just a temporary need while I’m in the process of creating a complex script, but once the script is completed and tested then it should run all at once. In either case, I have accidentally hit F5 too many times when I thought that a certain section of code was highlighted (so only that section would execute) but in fact nothing was highlighted so the script started executing from the very top, and either ran until completion or until I was able to cancel it (if it ran long enough for me to have time to understand what was happening and hit the “cancel” button).
So I needed some way of ensuring that a script would not execute if no section was highlighted.
Read on to learn about
I’m starting a little series on some of the nice features/capabilities in Snowflake (the cloud data warehouse). In each part, I’ll highlight something that I think it’s interesting enough to share. It might be some SQL function that I’d really like to be in SQL Server, it might be something else.
In the first part: the GENERATOR function. In short, it lets you generate a virtual table with a specified number of rows, or in database lingo: a tally table (or numbers table). A tally table can help you solve a lot of problems in SQL, but the TL;DR version is that it replaces loops/cursors most of the time and allows you to tackle the issue in a true set-based manner.
Naturally, as I read the article, I got the Bad Religion song stuck in my head. That’s an occupational hazard, I suppose.
Now, looking at it a bit more closely you’ll see that this is a function call, not just a property. Now, in my research for this post I did find where I’d mentioned this function briefly in my somewhat comprehensive identity post. Technically I didn’t mention so much as it was mentioned to me in the comments so I added it to the list. I guess I either didn’t look at it closely enough at the time or it’s just one of those cases where I forgot. Either way, it’s worth highlighting now.
Click through to learn more.
There are times when we need to compare two tables and figure out if the data matches. I often see a number of ways of doing this suggested, most are quite slow and inefficient. I’d quite like to share a quick and slightly dirty way of doing this using the CHECKSUM and CHECKSUM_AGG functions.
Just a reminder that CHECKSUM() will generate a checksum for an entire row or selection of columns in the row.
Will generate a checksum for a dataset.
David then has a couple of examples showing these in action.
Since I was working on an example for my next performance class, I decided to use an example from that session. In the query below the qp.* part of the column list will return a column named query_plan which is an XML data type.
SELECT DISTINCT pa.,cp.,qp.*
FROM sys.dm_exec_cached_plans) cp
CROSS APPLY sys.dm_exec_query_plan (cp.plan_handle ) qp
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) As pa
WHERE is_cache_key = 1 AND pa.attribute = ‘set_options’
However, when I add the DISTINCT keyword I get the error below.
Read on for a workaround for this.
There are a whole bunch of limitations in creating indexed views. One of them is that you can’t base the query on DISTINCT.
Fair enough, but you can do GROUP BY.
And what’s pretty cool is that the optimizer can match a query written to find distinct values to an indexed view with a group by.
Click through for the best example ever.
A good habit to get into is to explicitly name your constraints. I try to do this when I create tables to be sure that a) I have a PK and b) it’s named the same for all environments.
I can create a PK inline, with a simple table like this:CREATE TABLE Batting ( BattingKey INT NOT NULL CONSTRAINT BattingPK PRIMARY KEY , PlayerID INT , BattingDate DATETIME , AB TINYINT , H TINYINT , HR tinyint ) ;
This gives a primary key, named “BattingPK, that I can easily see inline with the column.
Steve also gives an alternative formulation which works well for composite keys. You can additionally add constraints after the create statement, but if you are creating temp tables and want to take advantage of temp table reuse, constraints have to be created as part of the table (and cannot have names). For additional fun, since SQL Server 2014, you can create indexes as part of the CREATE TABLE statement as well—that was needed to create memory-optimized tables as back in that edition, you couldn’t add new indexes after the fact.
It looks so clean compared to the first example! but it wasn’t until I tested the second method out that I realised that the behaviour of the two CASE expressions are different as outlined on books online
The CASE expression has two formats:
The simple CASE expression compares an expression to a set of simple expressions to determine the result.
The searched CASE expression evaluates a set of Boolean expressions to determine the result.
Both formats support an optional ELSE argument.
I put together some examples to illustrate the difference when evaluating Null using the two Case expressions, the query returns the column ‘Databasename’ from the derived list values clause, example 1 has a Null value and example 2 has a value of ‘SQLUndercover’ which you will see below:
Adrian looks into a scenario in which the two CASE expressions return different results, and digs into execution plans to find out why.
The only functional difference between them is that the
DATEDIFF_BIG()returns values as a
BIGINT, for results that exceed the boundary of an
INT. Keep this in mind when deciding which one to use. For example, the maximum number of seconds an
INTcan hold is 68 years, while a
BIGINTcan comfortably store the number of seconds in 10,000 years. This becomes especially important when dealing with microseconds and nanoseconds.
The rest of the post will use
DATEDIFF()to refer to both functions.
I think this might be the first time I’d read about
DATEDIFF_BIG()and I’m not aware of ever having used it. But hey, it could make sense if you need to track more than 2 billion microseconds.
I ran into an interesting problem while working with the POWER() function. I was trying to do some binary conversions and had a statement like this to process powers of 2.SELECT POWER(2, n)
This was designed to take a value and return a power of 2. I then used a different value to determine if this was added to my conversion factor or not. In trying to work with some larger numbers, I ran into this error:
Msg 232, Level 16, State 3, Line 3
Arithmetic overflow error for type int, value = 2147483648.000000.
Click through for a description of the problem as well as Steve’s solution.