Press "Enter" to skip to content

Category: T-SQL

Tracking the Last Sequence Value in SQL Server

Greg Low shares some queries and some history:

Sequences allow us to create a schema-bound object that is not associated with any specific table.

For example, if I have a Sales.HotelBookings table, a Sales.FlightBookings table, and a Sales.VehicleBookings table, I might want to have a common BookingID used as the key for each table. If more than the BookingID was involved, you could argue that there is a normalization problem with the tables, but we’ll leave that discussion for another day.

Another reason I like sequences is that they make it much easier to override the auto-generated value, without the need for code like SET IDENTITY_INSERT that we need with IDENTITY columns. This is particularly powerful if you ever need to do this across linked servers, as you’ll quickly find out that it doesn’t work.

Sequences let me avoid these types of issues: they perform identically to IDENTITY columns, and they also give me more control over the cache for available values.

Click through for some queries to find the latest value of a sequence, as well as how this functionality has changed over the years. One thing that I would point out is that, on busy systems, you might find that the value has changed between the time you run this query and the time you use the results.

Comments closed

Binary Search for Chronological Records in SQL Server

Andy Brownsword performs several probes:

Specifically we’ll use a binary search approach to narrow the search range. We abuse the correlation between the clustering key and timestamp to zero in on the records, using the key for navigation, and the timestamp to guide us.

We’ll start with the first and last records as boundaries, followed by checking the timestamp at the mid-point. Depending on whether the timestamp is before or after our target point in time, the appropriate boundary is moved. This halves the key space, and the search repeats until we’ve narrowed the range sufficiently to scan a very small portion of records.

It’s a neat idea, though do watch for Andy’s warning at the end.

Comments closed

Rolling Average Calculation via DATE_BUCKET()

Koen Verbeeck writes some code for SQL Server 2022 or later:

In the Microsoft Fabric Warehouse, a new T-SQL function was recently added: the DATE_BUCKET function. With this function, you can group dates into pre-defined buckets. This allows you to easily calculate aggregates that use the GROUP BY clause over these buckets, greatly simplifying the T-SQL statements for analytical use cases.

Click through for a demo. Koen mentions that this is also now available in the Microsoft Fabric Warehouse. Once you know how DATE_BUCKET() works, it’s pretty powerful. But I also think that the function is a bit confusing to use.

Comments closed

Testing Implicit Conversion and Performance in SQL Server

Louis Davidson runs some tests:

If you have ever done any performance tuning of queries in SQL Server, no doubt one of the first thing you have heard is that your search argument data types need to match the columns that you are querying. Not one thing in this blog is going to dispute that. Again, the BEST case is that if your column is an nvarchar, your search string matches that column datatype. But why is this? I will do my best to make this pretty clear, especially why it doesn’t always matter.

Read on as Louis lays out the explanation.

Comments closed

NTILE and Uneven Row Distribution

Jared Westover clarifies:

One of the simplest yet least-popular ranking functions in T-SQL is NTILE. It’s useful for dividing data into buckets or tiles. However, when your data isn’t evenly distributed across buckets, the results are confusing. Also, NTILE sometimes returns rows in a seemingly random order. What’s happening here?

There’s absolutely a pattern to how NTILE() works, as Jared describes.

Comments closed

Implementing Shamir’s Secret Sharing in SQL Server

Sebastiao Pereira implements an algorithm:

Shamir’s Secret Sharing is a cryptographic algorithm that allows a secret to be split into multiple components and shared among a group in such a way that the secret can only be revealed if a minimum number of components are combined. Is it possible to have this algorithm implemented in SQL Server without using external tools?

Click through for a T-SQL implementation, as well as one using CLR.

Comments closed

Combining UNION and UNION ALL

Greg Low crosses the streams:

Until the other day though, I’d never stopped to think about what happens when you mix the two operations. I certainly wouldn’t write code like that myself but for example, without running the code (or reading further ahead yet), what would you expect the output of the following command to be? (Note: The real code read rows from a table but I’ve mocked it up with a VALUES clause to make it easier to see the outcome).

Read on to see what happens.

Comments closed

Creating Data from Literals in SQL Server

Louis Davidson has values. Many, many values:

Row Creators were introduced in SQL Server 2008, and allow you to create multiple rows in a single INSERT statement by using the VALUES clause. In this blog, I will demonstrate a few ways that we have created data in tables, and then show how you can do this with row constructors.

It’s not the only neat trick with VALUES(), either: you can also use CROSS APPLY and VALUES() to perform an efficient unpivot, turning a long virtual table into a wide virtual table.

Comments closed

Running Totals over Arbitrary Date Ranges

Louis Davidson solves an interval problem:

Say you want to find the most recent 30-day period during which a person purchased some amount of products from your company. How you market to a customer might change if they have been active over a time period recently, or even in the past. But this also means that for each day going back in history, you need to sum historic data over and over, and the previous 29 days of activity. This is generally known as a rolling total. Doing this sort of calculation has been an interesting problem for many years.

When window functions came around, they became quite useful for such tasks, but they have one kind of complicated problem: gaps in source data patterns.

Funnily enough, there is a solution using window functions: range intervals. The ANSI SQL definition for RANGE (versus ROWS) for window functions does allow for the specification of a date range, like RANGE BETWEEN INTERVAL '30' DAY PRECEDING AND CURRENT ROW. Very impressive.

Unfortunately, SQL Server doesn’t support these. PostgreSQL does, but it’s an area I’ve agitated about for a few years and I do hope that someday, the SQL Server product team will support this functionality. In the meantime, Louis has a solution that works well for the task.

Comments closed

SELECT * in EXISTS Redux

Louis Davidson follows up from a prior post:

For example, it is often said that SELECT * makes your queries slower. In a nuanced way, this is often true, but only if changes occur in the database where columns are added. So many readers (myself included) see something that is demonstrably not 100% being treated as such, and they tune out.

There are plenty of other reasons you shouldn’t use that construct, no matter what.

In this post, I want to admit to having my mind changed, and I will go back and change the previous post.

One thing I really appreciate about Louis is his willingness to listen to new information, update his priors, and outright say “Hey, here’s what I thought before and now I believe this instead.” That’s a commendable trait.

Comments closed