Category: T-SQL

Text Features in SQL Server 2025

Tomaz Kastrun continues an advent of SQL Server 2025. Day 22 looks at the UNISTR() function:

The UNISTR() function is a new T-SQL function in SQL Server 2025. It helps with Unicode string literals (e.g., special characters, emoji, and language-specific alphabets) by letting you specify the Unicode encoding value of characters in the string.

The difference between NCHAR and UNISTR is that the latter provides more flexibility, handling multiple Unicode characters and even escape sequences. You can also define a custom escape character to perform the necessary conversion of Unicode values into a string character set.
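
To make the escape handling concrete, here is a rough Python emulation rather than T-SQL. It assumes the two escape forms UNISTR() is documented to accept, \xxxx for a UTF-16 code unit and \+xxxxxx for a full code point, and the helper name unistr is purely illustrative. It ignores any other rules the real function may apply.

```python
import re


def unistr(text, escape="\\"):
    """Illustrative stand-in for UNISTR(): decodes \\xxxx and \\+xxxxxx escapes."""
    esc = re.escape(escape)
    pattern = re.compile(esc + r"(?:\+([0-9A-Fa-f]{6})|([0-9A-Fa-f]{4}))")

    def decode(match):
        # Group 1 is the six-digit code point form, group 2 the four-digit form.
        digits = match.group(1) or match.group(2)
        return chr(int(digits, 16))

    decoded = pattern.sub(decode, text)
    # Two \xxxx escapes may form a surrogate pair; a UTF-16 round trip joins
    # them into the single character they represent.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")


greeting = unistr(r"Hello \D83D\DE00")      # surrogate pair form
smile = unistr(r"\+01F603")                 # code point form
summer = unistr("$00E9t$00E9", escape="$")  # custom escape character
```

NCHAR, by contrast, takes a single integer code, so building the same strings with it means concatenating one call per character.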

Day 23 looks at a new way of concatenating strings and performing compound assignment:

Two new features are available in SQL Server 2025 for string operations, both for string concatenation.

The || and ||= operators are basically + and += for strings, but they bring T-SQL into alignment with ANSI SQL. I’d still recommend using functions like CONCAT() for NULL-safety, or CONCAT_WS() for NULL-safety plus automatic separator addition, but the new operators do fix a long-standing pain point around platform compatibility.
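
The NULL-safety point can be modeled in a few lines of Python, with None standing in for NULL. This is a sketch of the semantics, not T-SQL: it assumes the default CONCAT_NULL_YIELDS_NULL behavior for +, and it assumes || follows the same rule, as the "basically + and +=" description suggests. The function names are illustrative.

```python
def plus(*parts):
    """Models the + operator (and, by assumption, ||): any NULL yields NULL."""
    if any(part is None for part in parts):
        return None
    return "".join(parts)


def concat(*parts):
    """Models CONCAT(): NULL arguments are treated as empty strings."""
    return "".join("" if part is None else part for part in parts)


def concat_ws(separator, *parts):
    """Models CONCAT_WS(): NULL arguments are skipped, so no stray separators."""
    return separator.join(part for part in parts if part is not None)


first, middle, last = "Ada", None, "Lovelace"
with_operator = plus(first, " ", middle, " ", last)   # None: the NULL wins
with_concat = concat(first, " ", middle, " ", last)   # double space remains
with_concat_ws = concat_ws(" ", first, middle, last)  # clean single separator
```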

The Impact of Sorting and Filters on Pagination

Aaron Bertrand continues digging into SQL Server pagination performance:

In my previous tip, Pagination Performance in SQL Server, I showed how to make SQL pagination more predictable – turning O(n) into O(1). I materialized and cached row numbers to page through instead of calculating them on every request. It wasn’t the whole story, though; real pagination queries rarely get to sort without filtering. Users always want more control, and filtering can threaten that predictability.

Read on for examples of how to handle a few different scenarios.

Consistent Pagination Performance in SQL Server

Aaron Bertrand takes life one page at a time:

Many web applications and APIs implement pagination to control how much data is returned or displayed. Many paging solutions suffer from the linear scaling problem (often referred to as O(n)), meaning the amount of work increases as you get into higher page numbers. If a user has ever clicked “next page” or “last page” and your CPUs caught on fire, you may have been a victim of linear scaling. Are there any creative solutions that will achieve constant-time performance (O(1))?

Aaron’s answer is interesting, particularly if you’re able to define the valid set of filters. At a prior job, I was responsible for filtering on arbitrary combinations of 30+ different columns across multiple dimensions and a fact table in a warehouse. That was a royal pain. The best we could do was run the query once, using ROW_NUMBER() to capture the sort order, store that ordering in a specialized table keyed by an identifier token that was a hash of the incoming session info, and cache that data for a pre-set amount of time (which, if I remember correctly, was 5 minutes). That is somewhat similar to what Aaron shows but much more ephemeral: the first load was consistently slower, while subsequent paging activities were much faster.
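
A minimal sketch of that cache-the-ordering idea, using Python with an in-memory SQLite database in place of SQL Server. Every table, column, and function name here is hypothetical, the data is synthetic, and it assumes a SQLite build with window function support. It is not Aaron's implementation or the one from that job.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE facts (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);
    CREATE TABLE page_cache (
        token TEXT, rn INTEGER, fact_id INTEGER,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (token, rn)
    );
""")
conn.executemany(
    "INSERT INTO facts (id, region, amount) VALUES (?, ?, ?)",
    [(i, "EU" if i % 2 else "US", (i * 37) % 101) for i in range(1, 201)],
)


def cache_ordering(session_info, region):
    """Runs the expensive sort once and stores row numbers under a hashed token."""
    token = hashlib.sha256(f"{session_info}|{region}".encode()).hexdigest()
    conn.execute("DELETE FROM page_cache WHERE token = ?", (token,))
    conn.execute(
        """
        INSERT INTO page_cache (token, rn, fact_id)
        SELECT ?, ROW_NUMBER() OVER (ORDER BY amount DESC, id), id
        FROM facts
        WHERE region = ?
        """,
        (token, region),
    )
    return token


def fetch_page(token, page, page_size=10):
    """Seeks directly to a row-number range instead of re-sorting."""
    first = (page - 1) * page_size + 1
    return conn.execute(
        """
        SELECT f.id, f.amount
        FROM page_cache AS c
        JOIN facts AS f ON f.id = c.fact_id
        WHERE c.token = ? AND c.rn BETWEEN ? AND ?
        ORDER BY c.rn
        """,
        (token, first, first + page_size - 1),
    ).fetchall()


def purge_expired():
    """Drops cached orderings older than five minutes."""
    conn.execute(
        "DELETE FROM page_cache WHERE created_at < datetime('now', '-5 minutes')"
    )
```

The first call to cache_ordering pays for the full sort. Every later fetch_page is a range lookup on the cache table's primary key, which is where the slower-first-load, faster-afterward trade-off comes from.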

Contrasting ISNULL() versus COALESCE() Performance

Andy Brownsword takes a peek:

When eliminating NULL values with SQL Server queries we typically reach for ISNULL or COALESCE to do the job. Generally speaking they’ll provide equivalent results, but they work in different ways. If you’re dealing with logic more complex than ISNULL(ColA, ColB) – for example using a function or subquery as part of the expression – then you might be in for a surprise.

The content of expressions when evaluating NULL values can have big implications on query performance. In this post we’ll look at how the functions work and the implications they can have when evaluating NULL values.
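
One documented difference that may be behind the surprise: COALESCE is shorthand for a CASE expression, so an expression that turns out to be non-NULL can be evaluated twice, while ISNULL evaluates its first argument once. The Python model below illustrates only that point. The function names are illustrative, and whether this is the effect Andy measures is an assumption.

```python
calls = {"count": 0}


def expensive_lookup():
    """Stand-in for a subquery or function call inside the expression."""
    calls["count"] += 1
    return 42


def isnull(check, replacement):
    """Models ISNULL(check, replacement): check is evaluated a single time."""
    value = check()
    return value if value is not None else replacement()


def coalesce_as_case(*expressions):
    """Models COALESCE expanded to CASE WHEN e IS NOT NULL THEN e ... ELSE eN END."""
    for expression in expressions[:-1]:
        if expression() is not None:  # first evaluation: the WHEN test
            return expression()       # second evaluation: the THEN result
    return expressions[-1]()


calls["count"] = 0
isnull_result = isnull(expensive_lookup, lambda: 0)
isnull_evaluations = calls["count"]

calls["count"] = 0
coalesce_result = coalesce_as_case(expensive_lookup, lambda: 0)
coalesce_evaluations = calls["count"]
```

Both calls return the same value, but the COALESCE model runs the lookup twice. When the lookup is a subquery, that means doing the work twice.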

Read on for the performance showdown.

DATE_BUCKET() Now GA in Fabric Data Warehouse

Jovan Popovic makes an announcement:

We have introduced a new DATE_BUCKET() function in the Fabric Data Warehouse SQL language that makes reporting and analytics even easier.

In this blog post, you’ll discover how it simplifies time-based reporting and makes grouping dates effortless.

My experience is that DATE_BUCKET() takes some getting used to, as it is not an intuitive function. That said, it can be really powerful for dealing with time series data. It is also available in SQL Server, as of SQL Server 2022.
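
The arithmetic becomes easier to reason about once written out: count how many whole bucket widths fit between an origin and the value, then add that many widths back to the origin. The Python sketch below emulates that for fixed-width units only (no months or years) and assumes the documented default origin of 1900-01-01. The function name mirrors the SQL one for readability but is just an illustration.

```python
from datetime import datetime, timedelta

DEFAULT_ORIGIN = datetime(1900, 1, 1)  # assumed default origin, a Monday


def date_bucket(width, value, origin=DEFAULT_ORIGIN):
    """Returns the start of the fixed-width bucket that contains value."""
    whole_buckets = (value - origin) // width  # floor division yields an int
    return origin + whole_buckets * width


reading = datetime(2025, 12, 24, 10, 37, 12)
quarter_hour = date_bucket(timedelta(minutes=15), reading)
week_start = date_bucket(timedelta(weeks=1), reading)
```

Here quarter_hour lands on 10:30 the same day, and week_start lands on Monday, December 22, because the weekly buckets are counted from an origin that is itself a Monday.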

Gaps in Identity Columns

Brent Ozar explains why there can be gaps in identity columns:

And you use that identity number for invoice numbers, or customer numbers, or whatever, and you expect that every single number will be taken up. For example, your accounting team might say, “We see order ID 1, 3, and 4 – what happened to order ID #2? Did someone take cash money from a customer, print up an invoice for them, and then delete the invoice and pocket the cash?”

Well, that might have happened. But there are all kinds of reasons why we’d have a gap in identities. One of the most common is failed or rolled-back transactions. To illustrate it, let’s start a transaction in one window:

I have a talk on applying forensic accounting techniques using SQL and Python (as well as an older version using R) and this is one of the things I bring up. In cases where you absolutely need contiguous numbers, the best I can do for you is no identity column and a stored procedure that runs at the SERIALIZABLE transaction isolation level. That procedure uses an app lock to prevent anybody else from calling it concurrently, takes a table lock on the relevant table prior to doing any real work, and hard blocks everybody else until your transaction either succeeds or fails. And I’m not even 100% sure that holds up if you have enough concurrency to matter.
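
As a toy model of the two behaviors, here is a Python sketch with no database involved. The class names are hypothetical, and a threading lock stands in for the app lock and table lock. The identity-style counter hands out a value and never takes it back. The gapless allocator serializes callers and advances only when the work succeeds.

```python
import threading


class IdentityLike:
    """Hands out the next value immediately; a rollback does not return it."""

    def __init__(self):
        self._next = 1

    def next_value(self):
        value = self._next
        self._next += 1
        return value


class GaplessAllocator:
    """Serializes callers and advances the counter only when the work commits."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for the app lock and table lock
        self._last = 0

    def insert(self, work):
        with self._lock:               # everybody else blocks here
            candidate = self._last + 1
            work(candidate)            # an exception here models a rollback
            self._last = candidate     # reached only on success
            return candidate


def failing(_):
    raise RuntimeError("simulated rollback")


identity = IdentityLike()
saved = []
for should_fail in (False, True, False):
    order_id = identity.next_value()   # value is consumed either way
    if not should_fail:
        saved.append(order_id)

gapless = GaplessAllocator()
kept = []
for work in (kept.append, failing, kept.append):
    try:
        gapless.insert(work)
    except RuntimeError:
        pass                           # counter untouched, next caller reuses it
```

The identity-style run ends with order IDs 1 and 3, the gapless run with 1 and 2. The price is that every caller waits on a single lock.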

Refactoring SQL Code

Steve Jones shares some thoughts:

I was thinking about this when I saw this article on strategies to refactor SQL code. The article seems written more for PostgreSQL, but there are items that relate to T-SQL as well. The main thrust of the article is about rewriting code to be DRY (don’t repeat yourself). The more changes you can make to shrink code, either to make it easier to read or avoid repeating those copy/paste items, the better off your team will be. It’s easy to think those copies aren’t a big deal, but it’s just as easy to update code in one place, because that solves the problem you were given, and forget to fix all the copies.

Strict refactoring—leaving the inputs and outputs alone and only modifying the structure of code beyond the scope of reformatting but without changing its behavior—is somewhat uncommon in T-SQL outside of performance tuning scenarios, at least in my experience. The problem I have with DRY, when it comes to T-SQL, is that you generally need to pay the performance piper. Yes, repeating the contents of a common function in a series of T-SQL queries is repetition and “wasteful” in that regard, but if it makes the queries run literally 3-9x faster just from making these changes, I don’t care. I’ll do it.

If T-SQL were an idealized implementation of a fourth-generation language, where all viable equivalent queries would have the same execution plan and thus the same performance, then we’d see a lot more code refactoring because the way we write the code would not have a direct impact on how it runs. But in practice, that’s not the case.

RegEx Performance in SQL Server 2025

Brent Ozar has an update:

Back in March 2025 when Microsoft first announced that REGEX support was coming to SQL Server 2025 and Azure SQL DB, I gave it a quick test, and the performance was horrific. It was bad in 3 different ways:

  1. The CPU usage was terrible, burning 60 seconds of CPU time to check a few million rows
  2. It refused to use an index
  3. The cardinality estimation was terrible, hard-coded to 30% of the table

Read on to see what has changed. It’s obviously not perfect, but just as obviously it is much better than what Brent saw in Azure SQL DB at the time.

Regular Expression Functions in SQL Server 2025

Tomaz Kastrun continues an advent of SQL Server 2025. Day 8 looks at a pair of regular expression-related functions:

Continuing with the SQL Server 2025 T-SQL functions for regular expressions, this time covering the in-string and count functionality.

And Day 9 hits two more:

The last two functions in the family of new T-SQL functions that shipped with RegEx are REGEXP_MATCHES() and REGEXP_SPLIT_TO_TABLE().

Read on to see how all four of these work.
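
For a feel of what the four operations do, here are rough analogues using Python's re module. This is not T-SQL. The Day 8 pair is presumably REGEXP_INSTR() and REGEXP_COUNT(), which is an assumption based on the "in string and count" description, and SQL Server's regex flavor and return shapes differ from Python's.

```python
import re

text = "order-17, order-204, invoice-9"
pattern = r"order-(\d+)"

# Count: how many times the pattern matches.
match_count = len(re.findall(pattern, text))

# In-string: 1-based position of the first match, 0 when there is none.
first = re.search(pattern, text)
first_position = first.start() + 1 if first else 0

# Matches: one entry per match, with the matched text and its 1-based position.
all_matches = [(m.group(0), m.start() + 1) for m in re.finditer(pattern, text)]

# Split: break the string apart wherever the delimiter pattern matches.
pieces = re.split(r",\s*", text)
```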

Generating Shape-Bound Random Points in SQL Server

Sebastiao Pereira generates some numbers:

Random number generation is vital in computer science, supporting fields like optimization, simulation, robotics, and gaming. The quality, speed, and sometimes security of the generator can directly affect an algorithm’s correctness, performance, and competitiveness. In Python, random number generation is well-supported and widely used. In this article, we will look at how we can use SQL to do this.

Click through for several examples.
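
As a point of comparison on the Python side, here is a minimal sketch of one shape-bound case: uniform random points inside a circle. The choice of a circle and the function name are assumptions, since the excerpt does not list the shapes covered. The square root on the radius is what keeps the points from clustering near the center.

```python
import math
import random


def random_point_in_circle(radius, rng):
    """Returns an (x, y) point uniformly distributed over a disk at the origin."""
    # Area grows with the square of the distance, so take the square root of a
    # uniform draw to spread points evenly rather than bunching them centrally.
    distance = radius * math.sqrt(rng.random())
    angle = 2 * math.pi * rng.random()
    return distance * math.cos(angle), distance * math.sin(angle)


rng = random.Random(2025)  # seeded so a run is repeatable
points = [random_point_in_circle(5.0, rng) for _ in range(1000)]
```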
