Press "Enter" to skip to content

Category: T-SQL

QUOTENAME and Dynamic SQL

Chris Johnson looks at one way to protect dynamic SQL statements:

I don’t have the usual aversion to dynamic SQL that you find a lot of developers have. I use it quite regularly as part of my ETL processes, usually to define how data will move from one stage to another. So, for instance, I might have a single Validate procedure that takes a number of parameters, including an import table name, and moves data from that import table to a valid table according to my validation rules. Or I might have a single SCDMerge procedure that takes a transformation view and merges that into a slowly changing dimension (although not using the actual MERGE function for various reasons). These procedures allow me to have confidence that data will always move from one stage to another in the same way, and saves me from writing essentially the same statement 50 times, and having to update it 50 times when I need to change the way we do something, and inevitably missing something and introducing some error.

This always feels like a pretty safe use of dynamic SQL to me, because it avoids some of the more common objections people like to raise to it:

Click through to see how QUOTENAME() can help sanitize user inputs. I personally prefer the route of using sp_executesql but QUOTENAME() can also do the trick.

Comments closed

Alternatives to GREATEST() and LEAST()

Mike Scalise has an alternative to using the GREATEST() and LEAST() functions in SQL Server:

As of 4/14/21, Microsoft has officially announced that the GREATEST and LEAST functions are in Azure SQL Database and SQL Managed Instance. Unofficially, it seems they had silently included these functions in at least (no pun intended) SQL Managed Instance several months prior. In any case, here we are today with official Microsoft documentation on GREATEST and LEAST. This is all great news. What’s also great is that, in their statement, Microsoft stated they would be including these two functions in the next version of SQL Server.

But what about all of us on SQL Server 2019 and prior? Fortunately, there’s a way to mimic these two functions in your queries using a correlated subquery in the SELECT clause.

Click through for examples. This is a bit different from getting the largest or smallest value in a window, which you can do with MIN(val) OVER () or MAX(val) OVER (). But I’m looking forward to seeing GREATEST() and LEAST() in the box product.

Comments closed

Another Number Series Generator Solution

Itzik Ben-Gan reviews Paul White’s solution to the number series generator challenge:

I love Paul White’s work. I keep being shocked by his discoveries, and wonder how the heck he figures out what he does. I also love his efficient and eloquent writing style. Often while reading his articles or posts, I shake my head and tell my wife, Lilach, that when I grow up, I want to be just like Paul.

When I originally posted the challenge, I was secretly hoping that Paul would post a solution. I knew that if he did, it would be a very special one. Well, he did, and it’s fascinating! It has excellent performance, and there’s quite a lot that you can learn from it. This article is dedicated to Paul’s solution.

This is an elite pair-up and well worth your time to review in detail.

Comments closed

Trickiness Around CASE Expressions

Edwin Sanchez takes us through a few issues you might run into with CASE expressions:

If you are trying to scan the subheadings to see what the issues are, I can’t blame you. Readers, including me, are impatient.

I trust that you already know the basics of SQL CASE, so, I won’t bore you with long introductions. Let’s dig into a deeper understanding of what’s happening under the hood.

Click through for the list.

Comments closed

Building a Better Random

David Fowler wants to generate a random number:

You see the problem? Well in the post that I mentioned, I adressed this issue and offered a solution using CHECKSUM() and NEWID().

That’s ok but can be a bit cumbersome. I recently stumbled across a function that I hadn’t realised was there before (and of course this is where you all tell me, we’ve known about that all the time), CRYPT_GEN_RANDOM().

My expectation with this was that CRYPT_GEN_RANDOM() would generate according to a uniform distribution. Performing a simple experiment with about 4 million entries against random numbers 0-255 (1 byte), I can confirm that this does approximate a uniform distribution.

Comments closed

Avoid the MERGE Operator in T-SQL

Cyndi Johnson has a warning for us:

The bugs alone are reasons enough to avoid using it, and also to refactor any existing MERGE statements you have in your code base. Those bugs include ones that have performance implications as well as ones that could lead to incorrect results. Long story short, Microsoft refuses to fix several of them, and some of the other major issues are “by design”!

If you still feel the desire to use them, please keep reading, and I hope by the end of this blog you will be convinced that

1. Less statements does not always lead to better code or better performance.
2. MERGE is like a VAMPIRE and once it is in your codebase, it’s just a matter of time before it drains your servers of their lifeblood.

The worst part about the second point is all the sparkling.

The worst part about the first point is that I was really excited with MERGE came out in 2008, as I wanted UPSERT for SQL Server. It, uh, didn’t work out so well.

Comments closed

COUNT(*) vs COUNT(1)

David Alcock tries an experiment:

The question whether to use COUNT(*) or COUNT(1) has been doing the rounds on the internet again, the question this time is whether or not COUNT(*) forces the optimiser to scan all rows or not.

I’m going to query Sales.Orders table of the WideWorldImporters sample database and use a where clause that should take advantage of a non-clustered index on CustomerID. 

David looks at COUNT(*) versus COUNT(1). The story’s slightly different with COUNT(ColumnName) when there are NULL values afoot.

Comments closed

More Number Series Generator Solutions

Itzik Ben-Gan continues a series on generating rows quickly:

This is the third part in a series about solutions to the number series generator challenge. In Part 1 I covered solutions that generate the rows on the fly. In Part 2 I covered solutions that query a physical base table that you prepopulate with rows. This month I’m going to focus on a fascinating technique that can be used to handle our challenge, but that also has interesting applications well beyond it. I’m not aware of an official name for the technique, but it is somewhat similar in concept to horizontal partition elimination, so I’ll refer to it informally as the horizontal unit elimination technique. The technique can have interesting positive performance benefits, but there are also caveats that you need to be aware of, where under certain conditions it can incur a performance penalty.

This is a great post on a rather complex topic.

Comments closed

Avoiding Division by Zero

Chad Callihan has a few methods for us to avoid dividing by zero:

In the real world, everyone knows that if you divide by zero a wormhole will open up and swallow the universe. In SQL Server, it’s not good, but it’s not nearly as dramatic. I encountered the following error this week and thought it would make a good topic:

Msg 8134, Level 16, State 1, Line 6
Divide by zero error encountered.

There are multiple ways to handle this error message in SQL Server and some are better than others. Let’s take a look at a few.

Click through for those methods and try to keep this universe existent—it’s where I keep all my stuff.

Comments closed

From SQL to Functional

Shel Burkow has the start of an interesting series:

Neither loops nor branches were used in any of the T-SQL work, but that doesn’t mean they aren’t there. They are – visible underneath in the query plans, which describe the process steps (operators and data flow) the SQL Server database engine follows to produce results. A salient example of looping is the Nested Loops operator used in the T-SQL rewrite query plan to match each outer row to an inner row, i.e. perform the INTERSECT. This style of coding, in which the logic of the computation is expressed rather than the lower-level control flow detail, is declarative programming, and is a major theme of this series. (The latter being imperative programming.)

This is a concept I’m really big on: I think functional programming languages are easier for data platform developers to pick up than object-oriented or imperative languages for the reason that both are declarative, so many of the programming metaphors for one apply to the other.

Comments closed