Press "Enter" to skip to content

Category: Syntax

Optional SUBSTRING() Length in SQL Server 2025

Louis Davidson points out a neat update:

Sometimes along comes a feature that seems so obvious, so natural, that you wonder why it took so long for Microsoft to implement it. One of those features in SQL Server 2005 is the optional length parameter in the SUBSTRING function. It has long been one of those questions when you wrote a SUBSTRING expression when you wanted to go from the Nth character to the end of the string, how many characters do you want? And for the most part, it didn’t really matter.

But sometimes it did (especially when dealing with nvarchar(max) data.

I learned about this when putting together an update to my Teaching Old Dogs New Tricks presentation. This capability is pretty nifty and something I wish I had a while ago.

Leave a Comment

TOP(1) with Ties

Andy Brownsword can’t stop at one:

Having TOP (1) return multiple rows feels wrong… but that’s what WITH TIES can do.

For a long time I used patterns like this to get the first record in a group:

Andy goes on to explain how WITH TIES works in T-SQL, shows an alternative to using a common table expression + window function to narrow down to the first logical group, and digs into when you might not want to use that alternative.

Leave a Comment

Performance Testing DATE_BUCKET()

Louis Davidson runs some tests:

A month and a half ago, I wrote a blog on using DATE_BUCKET. It is a cool feature thta makes doing some grouping really quite easy. It is here: Cool features in SQL Server I missed…DATE_BUCKET. One of the comments that came in was about performance of the DATE_BUCKET versus using things like DATEDIFF or a date table.

I started working on it then, but it got a bit involved (as performance comparison tests often do), so it took me a bit longer to get to than expected. But here it is, and the results are kind of what you would expect. The uses for DATE_BUCKET are really straightforward, and would rarely involve an index or a lot of filtering using the the function. But over a large number of rows, if it takes more time (even a millisecond more) than another method, you would notice it pretty quickly adding up.

Read on to see how DATE_BUCKET() performs compared to other methods of solving the same problem.

Leave a Comment

Idempotence in SQL Scripts

Jared Westover lays out some solid advice:

Imagine you’ve spent weeks preparing a software release for production, only to deploy it late one night and receive an error that the table or column already exists. This occurs in production environments, even when you use migration-based deployment methods such as DbUp. How can you ensure or at least reduce the likelihood of an error like this in the future?

At a prior job, we needed to write idempotent scripts because the deploy-to-QA process would run every script for the sprint every time someone checked in a new script. This prevented a few classes of release error, and I’ve carried that practice with me to future engagements.

SQL Server 2016 then gave us several helpers like CREATE OR ALTER for stored procedures and views, or DROP IF EXISTS for views and tables. It’s not a complete set of everything I’d like the functionality to do, but it’s a lot more convenient than what we had to do in prior versions.

Comments closed

Tracking the Last Sequence Value in SQL Server

Greg Low shares some queries and some history:

Sequences allow us to create a schema-bound object that is not associated with any specific table.

For example, if I have a Sales.HotelBookings table, a Sales.FlightBookings table, and a Sales.VehicleBookings table, I might want to have a common BookingID used as the key for each table. If more than the BookingID was involved, you could argue that there is a normalization problem with the tables, but we’ll leave that discussion for another day.

Another reason I like sequences is that they make it much easier to override the auto-generated value, without the need for code like SET IDENTITY_INSERT that we need with IDENTITY columns. This is particularly powerful if you ever need to do this across linked servers, as you’ll quickly find out that it doesn’t work.

Sequences let me avoid these types of issues: they perform identically to IDENTITY columns, and they also give me more control over the cache for available values.

Click through for some queries to find the latest value of a sequence, as well as how this functionality has changed over the years. One thing that I would point out is that, on busy systems, you might find that the value has changed between the time you run this query and the time you use the results.

Comments closed

Rolling Average Calculation via DATE_BUCKET()

Koen Verbeeck writes some code for SQL Server 2022 or later:

In the Microsoft Fabric Warehouse, a new T-SQL function was recently added: the DATE_BUCKET function. With this function, you can group dates into pre-defined buckets. This allows you to easily calculate aggregates that use the GROUP BY clause over these buckets, greatly simplifying the T-SQL statements for analytical use cases.

Click through for a demo. Koen mentions that this is also now available in the Microsoft Fabric Warehouse. Once you know how DATE_BUCKET() works, it’s pretty powerful. But I also think that the function is a bit confusing to use.

Comments closed

NTILE and Uneven Row Distribution

Jared Westover clarifies:

One of the simplest yet least-popular ranking functions in T-SQL is NTILE. It’s useful for dividing data into buckets or tiles. However, when your data isn’t evenly distributed across buckets, the results are confusing. Also, NTILE sometimes returns rows in a seemingly random order. What’s happening here?

There’s absolutely a pattern to how NTILE() works, as Jared describes.

Comments closed

The Real Ultimate Power of Omni-JOIN

Erik Darling coins a term:

You’ve got semi joins, and you’ve got anti-semi joins.

Everyone (looselyvery loosely, everyone) knows what those do: Find a match, or confirm there isn’t one. Lemon-squeezey.

The big question is, what do you call the style of join that requires a full enumeration on both sides? Erik tries out a series of ideas before landing on Omni-joins. I like the term.

1 Comment

Combining UNION and UNION ALL

Greg Low crosses the streams:

Until the other day though, I’d never stopped to think about what happens when you mix the two operations. I certainly wouldn’t write code like that myself but for example, without running the code (or reading further ahead yet), what would you expect the output of the following command to be? (Note: The real code read rows from a table but I’ve mocked it up with a VALUES clause to make it easier to see the outcome).

Read on to see what happens.

Comments closed

Creating Data from Literals in SQL Server

Louis Davidson has values. Many, many values:

Row Creators were introduced in SQL Server 2008, and allow you to create multiple rows in a single INSERT statement by using the VALUES clause. In this blog, I will demonstrate a few ways that we have created data in tables, and then show how you can do this with row constructors.

It’s not the only neat trick with VALUES(), either: you can also use CROSS APPLY and VALUES() to perform an efficient unpivot, turning a long virtual table into a wide virtual table.

Comments closed