Press "Enter" to skip to content

Category: T-SQL

Double-Join Syntax in T-SQL

Greg Dodd unleashes a monster:

We now have all of the data, but the question is, how do we stop the Person’s name coming back if the HireDate is after 1 Jan 2000?

We can’t simply put it in a where clause, because then the Course Title won’t come back. We could put it as part of the join condition to Instructor, but that will only limit the hire date coming back. We could use that to build a case statement around the FirstName and LastName fields, but what other options are there?

What we want to do is limit the join to Person to only return if Instructor returns, and then we could put the join condition on Instructor.

I’ll admit that whenever I see this syntax, I tend to remove it and replace with a subquery or APPLY operation. I don’t like multi-joins at all for the reasons Greg mentions at the end: it’s uncommon and difficult for a T-SQL developer to parse mentally.

Comments closed

Writing Check Constraints in SQL Sever Data Tools

Chris Johnson has a how-to guide:

That all looks ok, and everything published fine, and kept on publishing fine until I happened to check the generated script for some of the publishes (I was in the early stages of development at this point so deployments were just being done through Visual Studio direct to my machine).

When I did I noticed that every deployment I was getting code like this:

That’s dropping all my constraints, recreating them using WITH NOCHECK, and then using WITH CHECK to check the existing data. Obviously all the data will pass those checks, but that’s going to add some time to my deployments, and as the size of the data and the number of checks both increase, these checks will take more and more time.

Read on to understand what’s happening. I’d call this a fairly silly limitation on the part of SSDT.

Comments closed

OFFSET and FETCH in SQL Server

Steve Jones explains how the OFFSET-FETCH process works:

The other day I saw an article on the OFFSET clause in a SELECT. I had seen this come out and looked at it briefly in SQL Server 2012, but hadn’t done much with it.

NOTE: if you use this, be sure you read about potential performance problems and solutions.

Read on for more info. This is something I really wanted to work better than it does, as paging is really awkward in SQL Server. OFFSET-FETCH is great syntactically, but doesn’t do much for performance. The best solution I’ve seen is to take the results of a paging operation and store them in a dedicated paging table, allowing the user to query by page in that small table quickly. But at that point, OFFSET-FETCH isn’t really much less complicated than TOP(NumberOfRows) WHERE NumberOfRows >= NumberOfRows * (NumberOfPages – 1) ORDER BY NumberOfRows.

Comments closed

QUOTENAME and Dynamic SQL

Chris Johnson looks at one way to protect dynamic SQL statements:

I don’t have the usual aversion to dynamic SQL that you find a lot of developers have. I use it quite regularly as part of my ETL processes, usually to define how data will move from one stage to another. So, for instance, I might have a single Validate procedure that takes a number of parameters, including an import table name, and moves data from that import table to a valid table according to my validation rules. Or I might have a single SCDMerge procedure that takes a transformation view and merges that into a slowly changing dimension (although not using the actual MERGE function for various reasons). These procedures allow me to have confidence that data will always move from one stage to another in the same way, and saves me from writing essentially the same statement 50 times, and having to update it 50 times when I need to change the way we do something, and inevitably missing something and introducing some error.

This always feels like a pretty safe use of dynamic SQL to me, because it avoids some of the more common objections people like to raise to it:

Click through to see how QUOTENAME() can help sanitize user inputs. I personally prefer the route of using sp_executesql but QUOTENAME() can also do the trick.

Comments closed

Alternatives to GREATEST() and LEAST()

Mike Scalise has an alternative to using the GREATEST() and LEAST() functions in SQL Server:

As of 4/14/21, Microsoft has officially announced that the GREATEST and LEAST functions are in Azure SQL Database and SQL Managed Instance. Unofficially, it seems they had silently included these functions in at least (no pun intended) SQL Managed Instance several months prior. In any case, here we are today with official Microsoft documentation on GREATEST and LEAST. This is all great news. What’s also great is that, in their statement, Microsoft stated they would be including these two functions in the next version of SQL Server.

But what about all of us on SQL Server 2019 and prior? Fortunately, there’s a way to mimic these two functions in your queries using a correlated subquery in the SELECT clause.

Click through for examples. This is a bit different from getting the largest or smallest value in a window, which you can do with MIN(val) OVER () or MAX(val) OVER (). But I’m looking forward to seeing GREATEST() and LEAST() in the box product.

Comments closed

Another Number Series Generator Solution

Itzik Ben-Gan reviews Paul White’s solution to the number series generator challenge:

I love Paul White’s work. I keep being shocked by his discoveries, and wonder how the heck he figures out what he does. I also love his efficient and eloquent writing style. Often while reading his articles or posts, I shake my head and tell my wife, Lilach, that when I grow up, I want to be just like Paul.

When I originally posted the challenge, I was secretly hoping that Paul would post a solution. I knew that if he did, it would be a very special one. Well, he did, and it’s fascinating! It has excellent performance, and there’s quite a lot that you can learn from it. This article is dedicated to Paul’s solution.

This is an elite pair-up and well worth your time to review in detail.

Comments closed

Trickiness Around CASE Expressions

Edwin Sanchez takes us through a few issues you might run into with CASE expressions:

If you are trying to scan the subheadings to see what the issues are, I can’t blame you. Readers, including me, are impatient.

I trust that you already know the basics of SQL CASE, so, I won’t bore you with long introductions. Let’s dig into a deeper understanding of what’s happening under the hood.

Click through for the list.

Comments closed

Building a Better Random

David Fowler wants to generate a random number:

You see the problem? Well in the post that I mentioned, I adressed this issue and offered a solution using CHECKSUM() and NEWID().

That’s ok but can be a bit cumbersome. I recently stumbled across a function that I hadn’t realised was there before (and of course this is where you all tell me, we’ve known about that all the time), CRYPT_GEN_RANDOM().

My expectation with this was that CRYPT_GEN_RANDOM() would generate according to a uniform distribution. Performing a simple experiment with about 4 million entries against random numbers 0-255 (1 byte), I can confirm that this does approximate a uniform distribution.

Comments closed

Avoid the MERGE Operator in T-SQL

Cyndi Johnson has a warning for us:

The bugs alone are reasons enough to avoid using it, and also to refactor any existing MERGE statements you have in your code base. Those bugs include ones that have performance implications as well as ones that could lead to incorrect results. Long story short, Microsoft refuses to fix several of them, and some of the other major issues are “by design”!

If you still feel the desire to use them, please keep reading, and I hope by the end of this blog you will be convinced that

1. Less statements does not always lead to better code or better performance.
2. MERGE is like a VAMPIRE and once it is in your codebase, it’s just a matter of time before it drains your servers of their lifeblood.

The worst part about the second point is all the sparkling.

The worst part about the first point is that I was really excited with MERGE came out in 2008, as I wanted UPSERT for SQL Server. It, uh, didn’t work out so well.

Comments closed

COUNT(*) vs COUNT(1)

David Alcock tries an experiment:

The question whether to use COUNT(*) or COUNT(1) has been doing the rounds on the internet again, the question this time is whether or not COUNT(*) forces the optimiser to scan all rows or not.

I’m going to query Sales.Orders table of the WideWorldImporters sample database and use a where clause that should take advantage of a non-clustered index on CustomerID. 

David looks at COUNT(*) versus COUNT(1). The story’s slightly different with COUNT(ColumnName) when there are NULL values afoot.

Comments closed