Category: Syntax

The Power of COALESCE()

Lukas Vileikis shows off a bit of ANSI SQL syntax:

When the need to deal with NULL values arises, multiple queries come onto the scene. The SQL COALESCE function is one of them. In simple terms, the SQL COALESCE function is a ‘fallback’ mechanism for missing data. Its only task is to return the first non-NULL value from a list of values.

I used to be a big believer in COALESCE() all of the time, but it turns out that ISNULL() is faster if you only have two things to compare. Granted, it’s not a huge difference in speed, as I recall, but the difference is there.
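
As a minimal sketch of the fallback behavior described in the quote, the snippet below runs COALESCE() against an in-memory SQLite database through Python's standard library. SQLite is only a stand-in for whichever engine you use, and the contacts table and its columns are hypothetical. It does not test the ISNULL() speed difference, since ISNULL() is specific to SQL Server.

```python
import sqlite3

# SQLite is a stand-in engine here; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, mobile TEXT, landline TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        ("Ana", "555-0100", "555-0199"),  # both numbers present
        ("Ben", None, "555-0142"),        # mobile missing
        ("Cy", None, None),               # both missing
    ],
)

# COALESCE returns its first non-NULL argument, reading left to right.
rows = conn.execute(
    """
    SELECT name, COALESCE(mobile, landline, 'no phone on file') AS phone
    FROM contacts
    ORDER BY name
    """
).fetchall()

for name, phone in rows:
    print(name, phone)
```

Ana keeps her mobile number, Ben falls back to the landline, and Cy falls through to the literal at the end of the list.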

Avoid JOIN USING in SQL Scripts

Lukas Eder covers an esoteric bit of syntax:

Some SQL operators are as esoteric as they’re powerful. One of the oldest operators that you’ve likely hardly ever used in real world applications is NATURAL JOIN which is the default in relational algebra. We’ve covered a funky use-case for NATURAL JOIN earlier on this blog.

Click through for a similar example using JOIN USING. It’s best to be specific in your SQL queries, at least the permanent ones that you add to scripts you expect to re-run in the future or make part of applications.
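
To illustrate why being specific matters, here is a rough sketch of one well-known hazard of implicit join columns. It is not the example from the linked post: it uses NATURAL JOIN (the operator mentioned in the quote) rather than JOIN USING, it runs on SQLite via Python's standard library, and the author and book tables are hypothetical.

```python
import sqlite3

# Hypothetical schema; SQLite stands in for the engine of your choice.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (book_id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Ada'), (2, 'Bo');
    INSERT INTO book VALUES (10, 1, 'First'), (11, 2, 'Second');
    """
)

natural = "SELECT COUNT(*) FROM author NATURAL JOIN book"
explicit = "SELECT COUNT(*) FROM author a JOIN book b ON b.author_id = a.author_id"


def counts():
    """Return (natural join row count, explicit join row count)."""
    return (
        conn.execute(natural).fetchone()[0],
        conn.execute(explicit).fetchone()[0],
    )


before = counts()

# A later migration adds a same-named column to both tables.
conn.executescript(
    """
    ALTER TABLE author ADD COLUMN updated_at TEXT;
    ALTER TABLE book ADD COLUMN updated_at TEXT;
    UPDATE author SET updated_at = '2024-01-01';
    UPDATE book SET updated_at = '2024-06-01';
    """
)

# NATURAL JOIN now also matches on updated_at; the explicit join is unaffected.
after = counts()
print("before:", before, "after:", after)
```

The query text never changed, yet the implicit join starts returning no rows once the schema gains a shared column name, while the explicit ON clause keeps returning both books.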

PostgreSQL and Variables

Shaun Thomas shows off some functionality:

There’s been a kind of persistent myth regarding Postgres since I first started using it seriously over 20 years ago: “Postgres doesn’t support user variables.” This hasn’t really been true since version 8.0 way back in 2005. Part of this stems from the fact it doesn’t do things the same way as other common database engines.

Why don’t we spend a little time exploring the functionality that time forgot?

Reading through the post, I’m pretty happy with the way MySQL and SQL Server do it, even if SQL Server’s variables are batch-level and can be annoying if you want to maintain variable state across batches. In that case, you’d typically use a user-defined table type or temp table to store the values and re-instate them in the next batch.

Window Functions for Developers

Jamal Hansen provides an overview of window functions:

So let’s dive in. What problem do window functions solve? They seem to do aggregation-type activities, can’t GROUP BY do this?

The short answer is, not very well.

Window functions allow you to do things like calculate running totals, rankings, and moving averages, which tend to be very difficult to do otherwise because it requires a ‘window’ into a subset of the data.

Click through for the explanation, a primer on some of the types of window functions that are available, and several examples. I noticed that the code is impossible to read in light mode on the website, so either highlight it or go into dark mode, I guess.
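
For a quick feel of what the quote means by a 'window', the sketch below computes a running total and a ranking without collapsing rows the way GROUP BY would. It assumes a SQLite build with window function support (3.25 or later, which recent Python releases bundle), and the sales table is hypothetical rather than taken from the linked post.

```python
import sqlite3

# Hypothetical data; requires SQLite 3.25+ for window functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2024-01-01", 10), ("2024-01-02", 30), ("2024-01-03", 20)],
)

rows = conn.execute(
    """
    SELECT day,
           amount,
           -- Running total: sum of every row from the first day up to this one.
           SUM(amount) OVER (
               ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total,
           -- Rank by amount, largest first, computed over the whole result set.
           RANK() OVER (ORDER BY amount DESC) AS amount_rank
    FROM sales
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```

Every input row survives in the output, each carrying a value computed from its neighbors, which is the part GROUP BY cannot do on its own.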

Splitting to Table via STRING_SPLIT() and REGEXP_SPLIT_TO_TABLE()

Greg Low might have violated Betteridge’s Law of Headlines:

Using T-SQL it’s quite easy to build a table-valued function that can step through a string, character-by-character, and (based on a delimiter) output the delimited strings. The problem is that the performance of these functions is appalling.

When XML support appeared in SQL Server 2005, there were many attempts to create more efficient string-splitting functions. For many strings, these work quite well, but do have a few oddities that you need to cope with. Plus, most have limitations on the strings that you can split.

Ultimately, what was really needed was an efficient and native built-in function.  

Greg points out two mechanisms and contrasts them.

Optional SUBSTRING() Length in SQL Server 2025

Louis Davidson points out a neat update:

Sometimes along comes a feature that seems so obvious, so natural, that you wonder why it took so long for Microsoft to implement it. One of those features in SQL Server 2025 is the optional length parameter in the SUBSTRING function. It has long been one of those questions when you wrote a SUBSTRING expression when you wanted to go from the Nth character to the end of the string, how many characters do you want? And for the most part, it didn’t really matter.

But sometimes it did (especially when dealing with nvarchar(max) data).

I learned about this when putting together an update to my Teaching Old Dogs New Tricks presentation. This capability is pretty nifty and something I wish I had a while ago.

TOP(1) with Ties

Andy Brownsword can’t stop at one:

Having TOP (1) return multiple rows feels wrong… but that’s what WITH TIES can do.

For a long time I used patterns like this to get the first record in a group:

Andy goes on to explain how WITH TIES works in T-SQL, shows an alternative to using a common table expression + window function to narrow down to the first logical group, and digs into when you might not want to use that alternative.
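
For reference, the common table expression + window function pattern mentioned above looks roughly like the sketch below. It runs on SQLite through Python's standard library because WITH TIES is T-SQL syntax that SQLite does not support, so the T-SQL form appears only as an unexecuted comment. The orders table is hypothetical and this is not Andy's code.

```python
import sqlite3

# Hypothetical data; SQLite 3.25+ is assumed for ROW_NUMBER().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("acme", "2024-01-05", 100),
        ("acme", "2024-02-01", 250),
        ("zenith", "2024-01-20", 75),
        ("zenith", "2024-03-15", 40),
    ],
)

# CTE + window function: number each customer's orders newest-first, keep row 1.
rows = conn.execute(
    """
    WITH ranked AS (
        SELECT customer, order_date, total,
               ROW_NUMBER() OVER (
                   PARTITION BY customer ORDER BY order_date DESC
               ) AS rn
        FROM orders
    )
    SELECT customer, order_date, total
    FROM ranked
    WHERE rn = 1
    ORDER BY customer
    """
).fetchall()

# T-SQL form of the same idea using WITH TIES (not executed here):
#   SELECT TOP (1) WITH TIES customer, order_date, total
#   FROM orders
#   ORDER BY ROW_NUMBER() OVER (PARTITION BY customer ORDER BY order_date DESC);

for row in rows:
    print(row)
```

In the T-SQL variant, every customer's newest order gets row number 1, so they all tie for first place and WITH TIES returns one row per customer.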

Performance Testing DATE_BUCKET()

Louis Davidson runs some tests:

A month and a half ago, I wrote a blog on using DATE_BUCKET. It is a cool feature that makes doing some grouping really quite easy. It is here: Cool features in SQL Server I missed…DATE_BUCKET. One of the comments that came in was about performance of the DATE_BUCKET versus using things like DATEDIFF or a date table.

I started working on it then, but it got a bit involved (as performance comparison tests often do), so it took me a bit longer to get to than expected. But here it is, and the results are kind of what you would expect. The uses for DATE_BUCKET are really straightforward, and would rarely involve an index or a lot of filtering using the function. But over a large number of rows, if it takes more time (even a millisecond more) than another method, you would notice it pretty quickly adding up.

Read on to see how DATE_BUCKET() performs compared to other methods of solving the same problem.

Idempotence in SQL Scripts

Jared Westover lays out some solid advice:

Imagine you’ve spent weeks preparing a software release for production, only to deploy it late one night and receive an error that the table or column already exists. This occurs in production environments, even when you use migration-based deployment methods such as DbUp. How can you ensure or at least reduce the likelihood of an error like this in the future?

At a prior job, we needed to write idempotent scripts because the deploy-to-QA process would run every script for the sprint every time someone checked in a new script. This prevented a few classes of release error, and I’ve carried that practice with me to future engagements.

SQL Server 2016 then gave us several helpers like CREATE OR ALTER for stored procedures and views, or DROP IF EXISTS for views and tables. It’s not a complete set of everything I’d like the functionality to do, but it’s a lot more convenient than what we had to do in prior versions.
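
The same idea can be sketched outside SQL Server. The snippet below uses SQLite's IF NOT EXISTS and INSERT OR IGNORE guards as stand-ins for the T-SQL helpers above, runs a hypothetical migration several times, and then shows the unguarded version failing on re-run. The table, index, and seed row are made up for illustration.

```python
import sqlite3

# A hypothetical migration in which every statement is safe to repeat.
MIGRATION = """
CREATE TABLE IF NOT EXISTS customer (
    customer_id INTEGER PRIMARY KEY,
    email TEXT NOT NULL
);
CREATE UNIQUE INDEX IF NOT EXISTS ux_customer_email ON customer (email);
INSERT OR IGNORE INTO customer (customer_id, email)
VALUES (1, 'seed@example.com');
"""

conn = sqlite3.connect(":memory:")

# Re-running the script, as a deploy-every-script pipeline would, changes nothing.
for _ in range(3):
    conn.executescript(MIGRATION)

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

# The unguarded statement fails once the table already exists.
try:
    conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY)")
    naive_failed = False
except sqlite3.OperationalError:
    naive_failed = True

print("rows after three runs:", row_count, "| unguarded CREATE failed:", naive_failed)
```

The guarded script ends in the same state no matter how many times it runs, which is the property that made the deploy-to-QA process described above workable.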

Tracking the Last Sequence Value in SQL Server

Greg Low shares some queries and some history:

Sequences allow us to create a schema-bound object that is not associated with any specific table.

For example, if I have a Sales.HotelBookings table, a Sales.FlightBookings table, and a Sales.VehicleBookings table, I might want to have a common BookingID used as the key for each table. If more than the BookingID was involved, you could argue that there is a normalization problem with the tables, but we’ll leave that discussion for another day.

Another reason I like sequences is that they make it much easier to override the auto-generated value, without the need for code like SET IDENTITY_INSERT that we need with IDENTITY columns. This is particularly helpful if you ever need to do this across linked servers, where you’ll quickly find out that SET IDENTITY_INSERT doesn’t work.

Sequences let me avoid these types of issues: they perform identically to IDENTITY columns, and they also give me more control over the cache for available values.

Click through for some queries to find the latest value of a sequence, as well as how this functionality has changed over the years. One thing that I would point out is that, on busy systems, you might find that the value has changed between the time you run this query and the time you use the results.
