A good habit to get into is to explicitly name your constraints. I try to do this when I create tables to be sure that a) I have a PK and b) it’s named the same for all environments.
I can create a PK inline, with a simple table like this:CREATE TABLE Batting ( BattingKey INT NOT NULL CONSTRAINT BattingPK PRIMARY KEY , PlayerID INT , BattingDate DATETIME , AB TINYINT , H TINYINT , HR tinyint ) ;
This gives a primary key, named “BattingPK, that I can easily see inline with the column.
Steve also gives an alternative formulation which works well for composite keys. You can additionally add constraints after the create statement, but if you are creating temp tables and want to take advantage of temp table reuse, constraints have to be created as part of the table (and cannot have names). For additional fun, since SQL Server 2014, you can create indexes as part of the CREATE TABLE statement as well—that was needed to create memory-optimized tables as back in that edition, you couldn’t add new indexes after the fact.
It looks so clean compared to the first example! but it wasn’t until I tested the second method out that I realised that the behaviour of the two CASE expressions are different as outlined on books online
The CASE expression has two formats:
The simple CASE expression compares an expression to a set of simple expressions to determine the result.
The searched CASE expression evaluates a set of Boolean expressions to determine the result.
Both formats support an optional ELSE argument.
I put together some examples to illustrate the difference when evaluating Null using the two Case expressions, the query returns the column ‘Databasename’ from the derived list values clause, example 1 has a Null value and example 2 has a value of ‘SQLUndercover’ which you will see below:
Adrian looks into a scenario in which the two CASE expressions return different results, and digs into execution plans to find out why.
The only functional difference between them is that the
DATEDIFF_BIG()returns values as a
BIGINT, for results that exceed the boundary of an
INT. Keep this in mind when deciding which one to use. For example, the maximum number of seconds an
INTcan hold is 68 years, while a
BIGINTcan comfortably store the number of seconds in 10,000 years. This becomes especially important when dealing with microseconds and nanoseconds.
The rest of the post will use
DATEDIFF()to refer to both functions.
I think this might be the first time I’d read about
DATEDIFF_BIG()and I’m not aware of ever having used it. But hey, it could make sense if you need to track more than 2 billion microseconds.
I ran into an interesting problem while working with the POWER() function. I was trying to do some binary conversions and had a statement like this to process powers of 2.SELECT POWER(2, n)
This was designed to take a value and return a power of 2. I then used a different value to determine if this was added to my conversion factor or not. In trying to work with some larger numbers, I ran into this error:
Msg 232, Level 16, State 3, Line 3
Arithmetic overflow error for type int, value = 2147483648.000000.
Click through for a description of the problem as well as Steve’s solution.
In SQL Server, or PostgreSQL for that matter, the TRUNCATE command is allowed in a transaction and it will commit or rollback like any other DML operation. In all of the scripts, below, I will do the following.
Check my row counts
Begin a transaction
TRUNCATE the table
INSERT one row
Check my row counts from within the transaction
Check to see if my row checks revert or not
That’s an important behavioral difference when loading data using a truncate-and-reload pattern.
There are certain design patterns in T-SQL that give me pause. They may not be “code smells” per se, but when I encounter them I find myself thinking “there’s got to be a more sensible way to accomplish this”. WAITFOR DELAY is one example. I’ve used it a few times here and there, mostly in one-off scripts. If I wrote some code that used it regularly in production, I’d be paranoid about putting my SPID to sleep forever. Maybe a little bit of paranoia is a good thing. But I digress.
A recent task found its way to me, and I’ve decided to use WAITFOR DELAY as part of my solution. (It hasn’t been tested or implemented yet–perhaps more on this in another post.) My usage this time has been more complex than in the past. What I already knew is that you can use a string literal for the time_to_pass argument. For example, this will delay for 3½ seconds:
WAITFOR DELAY '00:00:03.500'
Click through for a bunch of testing.
-- note that this creates a single column of values -- which could be used in something like IN -- for example SELECT 1 WHERE 12 IN ( SELECT tempField FROM (VALUES(11),(12),(7)) tempTable(tempField)) -- I could rephrase this as: SELECT 1 WHERE 12 = ANY ( SELECT tempField FROM (VALUES(11),(12),(7)) tempTable(tempField))
I rarely see these operators in the wild and might have used them in production code a couple of times if that.
Aaron Bertrand gives us the highlights:
Certificate Management in Config Manager View and validate all of your certificates from a single interface, and manage and deploy certificate changes across all of the replicas in an Availability Group or all of the nodes in a Failover Cluster Instance.
Built-in data classification A new
ADD SENSITIVITY CLASSIFICATIONstatement helps you identify and automatically audit sensitive data, a huge step up from the previous SSMS wizard (which just used extended properties).
Aaron also digs into the engine a bit:
This new aggregate function is designed for data warehouse scenarios, and is an equivalent for
COUNT(DISTINCT()). Instead of performing expensive distinct sort operations to determine actual counts, it relies instead on statistics to get something relatively accurate. You should find that the margin of error is within 2% of the precise count, 97% of the time, which is usually fine for high-level analytics, values that populate a dashboard, or quick estimates.
On my system I created a table with integer columns ranging from 100 to 1,000,000 unique values, and string columns ranging from 100 to 100,000 unique values. There were no indexes other than a clustered primary key on the leading integer column. Here are the results of
APPROX_COUNT_DISTINCT()against those columns, so you can see where it is off by a bit (but always well within 2%):
By the way,
APPROX_COUNT_DISTINCT() is a really good idea, and I’m glad it’s here.
An application developer came to me with this question recently: “Can I use the same column twice in a SQL UPDATE statement?”
Yes and no.
It depends on what you mean by “use”.
Read on to see what Doug means.
The time zone name is taken from a list maintained in the following Windows registry hive:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones
Note: For SQL Server on Linux (and Docker containers), a registry shim performs the same function as the Windows registry by intercepting the API calls and returning the expected value(s).
We can also use a Transact-SQL (T-SQL) query against the system view
sys.time_zone_info, which uses the information from the registry hive. This is the recommended method if you do not have access to the registry hive.
Click through for a couple of examples.