Category: T-SQL

Regular Expression-Based String Splitting in SQL Server 2025

Aaron Bertrand splits a string:

SQL Server users have been asking for native regular expression support for over two decades. There are third-party Common Language Runtime (CLR) modules that offer this functionality, but these can be complicated to install and simply aren’t possible in some environments. I want to split a string using a regular expression instead of a static string. Will that be possible in SQL Server 2025, without CLR?

Must not rant about CLR. Must not rant about CLR. Must not rant about CLR. (By the way, if you ever catch me in person, get me going about how CLR got the short end of the stick and how the ‘modern’ forms of the Common Language Runtime in SQL Server are not great.)

Aaron tries out a function built into SQL Server 2025 that splits a string into a result set using a regular expression as the delimiter, and shows off some of the more complicated scenarios it handles that a normal STRING_SPLIT() call cannot.
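As a rough sketch of the idea (not Aaron's code), here is how a split on mixed delimiters might look. It assumes a SQL Server 2025 instance, a database at compatibility level 170, and the REGEXP_SPLIT_TO_TABLE() function as documented for the preview builds. The variable name and sample string are made up for illustration.

```sql
-- Assumption: SQL Server 2025 (or Azure SQL), database compatibility level 170.
-- @list and its contents are hypothetical sample data.
DECLARE @list nvarchar(200) = N'apple, banana;cherry|date';

-- Split on a comma, semicolon, or pipe, swallowing any whitespace around it.
-- STRING_SPLIT() only accepts a single static separator, so it cannot do this in one pass.
SELECT s.value, s.ordinal
FROM REGEXP_SPLIT_TO_TABLE(@list, N'\s*[,;|]\s*') AS s
ORDER BY s.ordinal;
```

The ordinal column is what lets you put the pieces back in their original order, since a result set has no guaranteed order on its own.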

Randomization of Personally Identifiable Information

Rich Benner tries out a couple of techniques:

The main issue we see in dev environments is that people take a nice little version of their database, a few hundred rows of data per table, and develop on that. This is great for checking that your logic is correct, but not good when it comes to actually deploying the code to production. Suddenly, your nice, pretty code has to deal with millions of rows of data and grinds to a halt because you didn’t write it for big data sets. No, you wrote it for your development system, and it was fast on your machine. We see this a lot.

Rich shows a couple of techniques for data randomization. My biggest concern with this approach is that it destroys the real distribution of the data, which matters if you need that distribution. Using the telephone number example, if you have lookups or data analysis by area code, randomly generating values across every area code would be bad. Also, if your application is smart enough to distinguish valid from invalid area codes and exchanges (the middle three digits of the three-three-four phone number style in the US), generating arbitrary area codes or exchanges might prevent app developers from using the application properly, perhaps requiring them to fix phone numbers after viewing a data entry screen.

In short, there are easier and harder ways to do this, and several factors may push you into the harder way.
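To make the area code point concrete, here is one possible middle-ground sketch. This is not one of Rich's techniques. It assumes phone numbers are stored as char(12) in the US three-three-four format with dashes, and the temp table and sample rows are invented for the demo. It keeps each row's area code, so the per-area-code distribution survives, and scrambles the rest.

```sql
-- Hypothetical table and data; assumes 'AAA-EEE-NNNN' formatting.
CREATE TABLE #Customer
(
    CustomerID int IDENTITY(1,1) PRIMARY KEY,
    Phone      char(12) NOT NULL
);

INSERT INTO #Customer (Phone)
VALUES ('614-200-0001'), ('614-201-0002'), ('919-300-0003');

-- Keep the area code, force the fictional 555 exchange, randomize the last four digits.
-- CHECKSUM(NEWID()) yields a different pseudo-random int per row; taking the modulo
-- before ABS() avoids an overflow on the minimum int value.
UPDATE #Customer
SET Phone = LEFT(Phone, 3) + '-555-'
          + RIGHT('0000' + CAST(ABS(CHECKSUM(NEWID()) % 10000) AS varchar(5)), 4);

-- Row counts per area code are unchanged after masking.
SELECT LEFT(Phone, 3) AS AreaCode, COUNT(*) AS Customers
FROM #Customer
GROUP BY LEFT(Phone, 3);

DROP TABLE #Customer;
```

Whether preserving the area code leaks too much information is a judgment call for your own data, and is one of the factors that can push you toward the harder way.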

A Deep Dive into IDENTITY Columns

Vlad Drumea performs a deep dive:

In SQL Server, IDENTITY is a column-level property that is used to provide an auto-incremented value for every new row inserted.

All you have to do is provide a seed value and an increment value when defining said column, and SQL Server will handle it from there.

Unlike sequences, identity columns do not require additional objects like default constraints or triggers to ensure the column is populated.

I’m glad that Vlad made a demo showing how @@IDENTITY works and how it can give you unexpected outputs if you’re not aware of a trigger working with a separate identity column. That one tends to get people.
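For anyone who has not been bitten by this yet, a minimal reproduction looks something like the following. This is my own sketch rather than Vlad's demo, and the table and trigger names are hypothetical. Run it in a scratch database, since triggers cannot be created on temp tables.

```sql
-- Hypothetical objects for illustration; run in a throwaway database.
CREATE TABLE dbo.Orders
(
    OrderID int IDENTITY(1,1) PRIMARY KEY,
    Item    varchar(50) NOT NULL
);

CREATE TABLE dbo.OrderAudit
(
    AuditID int IDENTITY(1000,1) PRIMARY KEY,  -- different seed so the mix-up is obvious
    OrderID int NOT NULL
);
GO

CREATE TRIGGER dbo.trOrders_Insert ON dbo.Orders
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- This insert generates its own identity value in a different scope.
    INSERT INTO dbo.OrderAudit (OrderID)
    SELECT OrderID FROM inserted;
END;
GO

INSERT INTO dbo.Orders (Item) VALUES ('Widget');

-- SCOPE_IDENTITY() reports the value generated in this scope (Orders: 1).
-- @@IDENTITY reports the last value generated in the session, including
-- the trigger's insert (OrderAudit: 1000).
SELECT SCOPE_IDENTITY() AS ScopeIdentity, @@IDENTITY AS SessionIdentity;
GO

DROP TABLE dbo.Orders;      -- dropping the table also drops its trigger
DROP TABLE dbo.OrderAudit;
```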

The PRODUCT() Function in SQL Server 2025

Ed Pollack points out a new function:

With each version of SQL Server, there are always a few new features introduced that we applaud as we finally have access to a useful function that is already available elsewhere.

Introduced in SQL Server 2025 CTP 1.3, the PRODUCT() function acts similarly to SUM(), but multiplies values rather than adds them. It is an aggregate function in SQL Server and therefore operates on a data set, rather than on scalar values.

Ed notes that there are aggregate and window function versions of PRODUCT() and shows examples of how it works.
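A quick sketch of both forms, using made-up yearly growth factors. This assumes SQL Server 2025 and the PRODUCT() syntax as documented for the preview, and it is not taken from Ed's examples. The EXP(SUM(LOG())) line is the long-standing workaround for older versions, which only works for strictly positive inputs and returns a float.

```sql
-- Hypothetical sample data: one growth factor per year.
WITH Growth AS
(
    SELECT Yr, Factor
    FROM (VALUES (2021, 1.05), (2022, 1.10), (2023, 0.98)) AS g(Yr, Factor)
)
SELECT
    PRODUCT(Factor)       AS TotalGrowth,            -- aggregate form (assumes SQL Server 2025)
    EXP(SUM(LOG(Factor))) AS TotalGrowthPre2025      -- older workaround, positive values only
FROM Growth;

-- Window form: running compounded growth through each year.
WITH Growth AS
(
    SELECT Yr, Factor
    FROM (VALUES (2021, 1.05), (2022, 1.10), (2023, 0.98)) AS g(Yr, Factor)
)
SELECT Yr,
       Factor,
       PRODUCT(Factor) OVER (ORDER BY Yr) AS CumulativeGrowth
FROM Growth;
```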

Grouping Sets in T-SQL

Erik Darling has a new video.

Erik mentions that he doesn’t often see GROUPING SETS in the wild. I’ve used them several times. And the use of the term “several times” probably gives you exactly the feeling that I intended. I really like grouping sets for very specific analytical system purposes (at least for moderate-sized datasets), so I’m glad that syntax is there. But outside of reporting queries, it’s a really uncommon bit of syntax.
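If you have never run across the syntax, here is a small self-contained example of the reporting-style use I have in mind. The data is invented, and this is not from Erik's video. One query returns detail rows, per-region subtotals, and a grand total.

```sql
-- Hypothetical sales data for illustration.
SELECT
    s.Region,
    s.Product,
    SUM(s.Amount) AS Total,
    GROUPING(s.Region)  AS IsRegionRolledUp,   -- 1 on the grand total row
    GROUPING(s.Product) AS IsProductRolledUp   -- 1 on subtotal and grand total rows
FROM (VALUES
        ('East', 'Widget', 100),
        ('East', 'Gadget',  50),
        ('West', 'Widget',  70)
     ) AS s(Region, Product, Amount)
GROUP BY GROUPING SETS
(
    (s.Region, s.Product),  -- detail level
    (s.Region),             -- subtotal per region
    ()                      -- grand total
);
```

The GROUPING() columns are there to tell a rolled-up NULL apart from a NULL that is actually in the data.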

Percentage Splits with Window Functions

Andy Brownsword breaks things up:

Sometimes you want to segment records. It may be splitting a customer base for marketing purposes, or segmenting a user base for a new feature. Good segmentation makes clean divisions in the data.

In this post we’ll see a way to achieve that with a great deal of help from Window Functions.

Click through for Andy’s motivation, which is an approach that absolutely will not work the way you want it to.
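As a generic illustration of a percentage split with window functions (a sketch under my own assumptions, not necessarily the approach Andy lands on), you can number the rows, compare each row's position to the total count, and assign a segment by cumulative percentage. The customer IDs and the 20/30/50 split are made up.

```sql
-- Hypothetical customer list: 10 rows, split 20% / 30% / 50%.
WITH Numbered AS
(
    SELECT c.CustomerID,
           ROW_NUMBER() OVER (ORDER BY c.CustomerID) AS rn,     -- position in the list
           COUNT(*)     OVER ()                      AS total   -- total row count on every row
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS c(CustomerID)
)
SELECT CustomerID,
       CASE
           WHEN rn * 100.0 / total <= 20 THEN 'A'   -- first 20% of rows
           WHEN rn * 100.0 / total <= 50 THEN 'B'   -- next 30%
           ELSE 'C'                                 -- remaining 50%
       END AS Segment
FROM Numbered
ORDER BY CustomerID;
```

Ordering by CustomerID keeps the demo deterministic. For a random assignment you would order by something like NEWID() instead and persist the result, since it changes on every run.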

Ways to Debug T-SQL Scripts

Simon Frazer shares some tips:

At some point, every SQL developer or DBA will need to debug T-SQL scripts, either to verify that they behave as expected or to track down the root cause of a problem. Whether you’re building something new or investigating a production issue, debugging is an essential part of the process.

There are several techniques available for troubleshooting, and it’s important to approach this differently depending on whether you’re working in a production or non-production environment. Each environment has its own risks and constraints.

Click through for Simon’s process. I also echo Simon’s sentiments at the end regarding the SSMS debugger—I know people who are passionate about it and mourn its passing, but I was never one of those people. It was far too easy to get in trouble with it, especially in shared environments.
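For a flavor of what debugging without the debugger can look like, here is a generic non-production pattern. It is my own sketch and not necessarily part of Simon's process: run the change inside a transaction, emit progress messages immediately, inspect the results, and roll everything back. The temp table and its contents are hypothetical.

```sql
SET NOCOUNT ON;
SET XACT_ABORT ON;

-- Hypothetical working table for the demo.
CREATE TABLE #Work (ID int PRIMARY KEY, Status varchar(10) NOT NULL);
INSERT INTO #Work (ID, Status) VALUES (1, 'new'), (2, 'new'), (3, 'new');

DECLARE @rows int;

BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE #Work SET Status = 'done' WHERE ID <= 2;
    SET @rows = @@ROWCOUNT;

    -- Severity 0 with NOWAIT sends the message right away instead of
    -- waiting for the output buffer to fill, as PRINT can.
    RAISERROR(N'Rows updated: %d', 0, 1, @rows) WITH NOWAIT;

    -- Inspect the intermediate state before deciding anything.
    SELECT ID, Status FROM #Work;

    -- Debug run: undo the change. Swap for COMMIT once the logic is verified.
    ROLLBACK TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;  -- re-raise the original error with its line number
END CATCH;

DROP TABLE #Work;
```

Holding a transaction open like this takes locks, which is exactly why it belongs in a non-production environment.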

Regular Expressions in SQL Server 2025

Ed Pollack digs into some new functionality:

String-searching in SQL Server has always been a mighty hassle. Balancing performance and horribly-complex queries is a compromise that no one enjoys. 

Generally speaking, a relational database is not an ideal place to search large amounts of text. Even when leveraging features such as Full-Text Indexing, the ability for an application to leverage speedy text-searching decreases as data becomes larger. If a service optimized for text-search can be used, such as Elasticsearch or Azure AI Search, then it will be far easier to deliver accurate results quickly. 

Ed focuses on the mechanisms available rather than on performance, and performance is the current sticking point. We’ll see whether regular expression queries get faster in subsequent CTPs or in SQL Server 2025 RTM.
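To show the kind of mechanism in question, here is a small sketch of a regular expression filter next to a plain LIKE. It assumes SQL Server 2025, a database at compatibility level 170, and REGEXP_LIKE() as documented for the preview. The sample values and the deliberately loose email pattern are my own and not from Ed's post.

```sql
-- Hypothetical sample data.
WITH Contacts AS
(
    SELECT Email
    FROM (VALUES
            (N'ann@example.com'),
            (N'bob@example'),
            (N'not an email')
         ) AS c(Email)
)
SELECT Email,
       CASE WHEN Email LIKE N'%@%.%' THEN 1 ELSE 0 END AS PassesLike   -- crude pre-2025 check
FROM Contacts
-- Assumption: REGEXP_LIKE is usable as a predicate at compatibility level 170.
WHERE REGEXP_LIKE(Email, N'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');
```

A pattern like this has to be evaluated against every candidate row, so it is worth checking the execution plan and timings on your own data before relying on it for large tables.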
