T-SQL – Page 14 – Curated SQL

One of the new features in SQL Server 2025 is that you can now use regular expressions directly in your T-SQL queries. Now, regular expressions (or RegEx) have never been a syntax that’s easy to read. There are a lot of brackets, dashes and other symbols that make no sense when you first see them. Before delving into how these can be used in SQL Server, a few basics are provided to get you started, along with a link to a website for further learning.

Read on for a quick primer and a bit of pain when it comes to performance.
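
As a rough illustration of the kind of validation the new functions allow, here is a minimal sketch using REGEXP_LIKE. It assumes SQL Server 2025 with the database at compatibility level 170; the sample values and the (deliberately simplistic) e-mail pattern are made up for the example and do not come from the linked post.

```sql
-- Assumption: SQL Server 2025, database compatibility level 170.
-- The rows and the pattern below are illustrative only.
SELECT v.email
FROM (VALUES
	('alice@example.com'),
	('not-an-email'),
	('bob@example.org')
) AS v(email)
-- Keep only rows whose whole value looks like local-part@domain.tld
WHERE REGEXP_LIKE(v.email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');
```

A predicate like this generally cannot seek on an index, which is one plausible source of the performance pain mentioned above.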

Substring Search with Regular Expressions in SQL Server

Published 2025-09-11 by Kevin Feasel

Louis Davidson continues a series on regular expressions:

The REGEXP_SUBSTR function extracts parts of a string based on a regular expression pattern. It has some similarities with the SUBSTRING function, but with some important (and interesting) differences. This function returns the Nth occurrence of a substring that matches the regex pattern.

Read on to see how it compares to the traditional SUBSTRING() function.
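
To make the contrast concrete, here is a small sketch under the assumption of SQL Server 2025. The variable and its contents are invented for the example. SUBSTRING needs to know where the text lives, while REGEXP_SUBSTR finds the Nth match of a pattern.

```sql
-- Assumption: SQL Server 2025. The sample string is hypothetical.
DECLARE @s varchar(100) = 'Order 1234 shipped with order 5678';

SELECT
	-- Position-based: works only if we already know the offset and length
	SUBSTRING(@s, 7, 4) AS by_position,
	-- Pattern-based: start at character 1, return the 1st run of digits
	REGEXP_SUBSTR(@s, '[0-9]+', 1, 1) AS first_number,
	-- Same pattern, but ask for the 2nd occurrence
	REGEXP_SUBSTR(@s, '[0-9]+', 1, 2) AS second_number;
```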

K-Means Clustering in SQL Server

Published 2025-09-11 by Kevin Feasel

Sebastiao Pereira implements k-means clustering in T-SQL:

K-means clustering is an unsupervised machine learning algorithm used to group data into k distinct clusters based on their similarity, allowing for customer segmentation, anomaly detection, trend analysis, etc. The most common machine learning tutorials focus on Python or R. Normally, data is stored in SQL Server, and it is necessary to move data out of the database to apply clustering algorithms and then, if necessary, to update the original data with the cluster numbers. Is it possible to do it directly in SQL Server?

Given the work you have to do to implement this, I can’t imagine that it would be particularly fast. But it is neat to see that it’s possible.
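
For a sense of what the work involves, here is a minimal sketch of the two k-means steps (assign each point to its nearest centroid, then move each centroid to the mean of its points) in plain T-SQL. This is not Sebastiao's implementation. The temp tables, the six sample points, the starting centroids, and the 20-iteration cap are all assumptions made for the example.

```sql
-- Hypothetical sample data: six 2-D points.
CREATE TABLE #points (point_id int PRIMARY KEY, x float, y float, cluster_id int NULL);
INSERT INTO #points (point_id, x, y) VALUES
	(1, 1.0, 1.0), (2, 1.5, 2.0), (3, 1.0, 0.5),
	(4, 8.0, 8.0), (5, 9.0, 9.5), (6, 8.5, 9.0);

-- k = 2 starting centroids, chosen arbitrarily for the demo.
CREATE TABLE #centroids (cluster_id int PRIMARY KEY, x float, y float);
INSERT INTO #centroids (cluster_id, x, y) VALUES (1, 1.0, 1.0), (2, 8.0, 8.0);

DECLARE @iteration int = 0, @changed int = 1;

-- Stop when no point changes cluster, or after 20 passes (arbitrary cap).
WHILE @changed > 0 AND @iteration < 20
BEGIN
	-- Assignment step: nearest centroid by squared Euclidean distance.
	UPDATE p
	SET cluster_id = n.cluster_id
	FROM #points AS p
		INNER JOIN
		(
			SELECT
				pt.point_id,
				c.cluster_id,
				ROW_NUMBER() OVER (
					PARTITION BY pt.point_id
					ORDER BY SQUARE(pt.x - c.x) + SQUARE(pt.y - c.y), c.cluster_id
				) AS rn
			FROM #points AS pt
				CROSS JOIN #centroids AS c
		) AS n
			ON n.point_id = p.point_id
			AND n.rn = 1
	WHERE p.cluster_id IS NULL OR p.cluster_id <> n.cluster_id;

	SET @changed = @@ROWCOUNT;

	-- Update step: move each centroid to the mean of its assigned points.
	UPDATE c
	SET x = a.avg_x, y = a.avg_y
	FROM #centroids AS c
		INNER JOIN
		(
			SELECT cluster_id, AVG(x) AS avg_x, AVG(y) AS avg_y
			FROM #points
			GROUP BY cluster_id
		) AS a
			ON a.cluster_id = c.cluster_id;

	SET @iteration += 1;
END;

SELECT point_id, x, y, cluster_id FROM #points ORDER BY point_id;

DROP TABLE #points;
DROP TABLE #centroids;
```

The cross join between points and centroids on every pass is the part that would likely hurt at scale.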

Sparse Columns and Space Utilization

Published 2025-09-11 by Kevin Feasel

Steve Jones gins up a demo:

I saw this as a question submitted at SQL Server Central, and wasn’t sure it was correct, but when I checked, I was surprised. If you choose to designate columns as sparse, but you have a lot of data, you can use more space.

This post looks at how things are stored and the impact if much of your data isn’t null.

I consider sparse columns a relic of the mid-aughts era, when storage was a lot more expensive and compression was an Enterprise Edition-only feature. Given that you can use page compression in any edition of SQL Server nowadays, I don’t think there’s a viable reason ever to have a sparse column.

Also, definitely check out the comments, where Jeff Moden has a great one.
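
A quick way to see the effect for yourself is to load the same non-null data into a sparse and a non-sparse column and compare the reported sizes. This is a sketch, not Steve's demo: the table names and the 100,000-row count are assumptions, and it uses sys.all_columns only as a convenient row source.

```sql
-- Hypothetical demo tables: identical except for the SPARSE designation.
CREATE TABLE dbo.SparseDemo (id int IDENTITY(1,1) PRIMARY KEY, val int SPARSE NULL);
CREATE TABLE dbo.RegularDemo (id int IDENTITY(1,1) PRIMARY KEY, val int NULL);

-- Fill both with non-null values, the worst case for sparse storage.
INSERT INTO dbo.SparseDemo (val)
SELECT TOP (100000) 1
FROM sys.all_columns AS a
	CROSS JOIN sys.all_columns AS b;

INSERT INTO dbo.RegularDemo (val)
SELECT TOP (100000) 1
FROM sys.all_columns AS a
	CROSS JOIN sys.all_columns AS b;

-- Compare the "data" column of the two result sets.
EXEC sys.sp_spaceused @objname = N'dbo.SparseDemo';
EXEC sys.sp_spaceused @objname = N'dbo.RegularDemo';

DROP TABLE dbo.SparseDemo;
DROP TABLE dbo.RegularDemo;
```

Per the product documentation, a non-null sparse int takes 8 bytes rather than 4, so the sparse table should come out larger when every row has a value.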

Window Functions in SQL Server

Published 2025-09-10 by Kevin Feasel

I have a new video:

In this video (part 1 of a new series), I explain what a window function is, as well as the components of window functions.

It’s taken a couple of months for me to get back on the video production wagon. This video serves mostly as the classroom primer for what will be primarily a demo-heavy series.
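
For readers who want the pieces in front of them before watching, here is a minimal sketch of a window function with the usual components of the OVER clause spelled out: a partition, an ordering, and a frame. The sample rows and column names are invented for the example and are not taken from the video.

```sql
-- Hypothetical sales rows, built inline so the query runs on its own.
SELECT
	s.sale_date,
	s.region,
	s.amount,
	SUM(s.amount) OVER (
		PARTITION BY s.region        -- restart the calculation for each region
		ORDER BY s.sale_date         -- order rows within each region
		ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW  -- frame: everything up to this row
	) AS running_total
FROM (VALUES
	(CAST('2025-01-01' AS date), 'East', 100),
	(CAST('2025-01-02' AS date), 'East', 150),
	(CAST('2025-01-01' AS date), 'West', 200),
	(CAST('2025-01-03' AS date), 'West', 50)
) AS s(sale_date, region, amount)
ORDER BY s.region, s.sale_date;
```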

What Read Committed Guarantees

Published 2025-09-10 by Kevin Feasel

Erik Darling continues a series on transaction isolation levels. If you haven’t been watching this series, I’d recommend checking out his channel and catching up, as he’s spent a good amount of time on transaction isolation levels and what, exactly, that means in a relational database management system like SQL Server.

Batching Large Data Operations via Key Ranges

Published 2025-09-08 by Kevin Feasel

Andy Brownsword updates or deletes a batch of rows:

Effective batching in general helps us:

Reduce transaction length and minimise blocking

Avoid unnecessary checking of the same rows repeatedly

Introduce graceful pacing to reduce impact on busy environments or data replication

I’m not the biggest fan of the OFFSET/FETCH combination there, at least if your key column is fairly well packed—like, say, 99+% of the rows are contiguous and you occasionally have a jump of a few thousand rows. Also, that batch size of 100K might be a little high, although that will certainly depend on what the operation is. Batch updating a column based on some fairly straightforward calculation? You can probably get away with 100K, though I’d still prefer 10K. But as you add more complexities (deleting rows, very high server throughput, triggers, limited hardware, etc.), that number should edge downward.
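
When the key is densely packed, one alternative is to walk fixed-width key ranges instead of using OFFSET/FETCH to find each boundary. The following is a sketch of that idea, not Andy's code. The temp table, the processed flag, the 10K batch size, and the delay are all assumptions for the example.

```sql
-- Hypothetical work table with a densely packed integer key.
CREATE TABLE #batch_demo
(
	id int IDENTITY(1,1) PRIMARY KEY,
	processed bit NOT NULL DEFAULT (0)
);

INSERT INTO #batch_demo (processed)
SELECT TOP (50000) 0
FROM sys.all_columns AS a
	CROSS JOIN sys.all_columns AS b;

DECLARE @batch_size int = 10000;
DECLARE @min_id int, @max_id int;

SELECT @min_id = MIN(id), @max_id = MAX(id)
FROM #batch_demo;

DECLARE @start int = @min_id;

-- Each pass touches one key range of @batch_size ids, whether or not
-- every id in the range exists. Gaps just make a batch smaller.
WHILE @start <= @max_id
BEGIN
	UPDATE #batch_demo
	SET processed = 1
	WHERE id >= @start
		AND id < @start + @batch_size
		AND processed = 0;

	SET @start += @batch_size;

	-- Brief pause between batches to ease pressure on a busy server.
	WAITFOR DELAY '00:00:00.100';
END;

DROP TABLE #batch_demo;
```

The trade-off is that large gaps in the key produce small or empty batches, which is why this suits well-packed keys better than sparse ones.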

Splitting to a Table via Regular Expression

Published 2025-09-04 by Kevin Feasel

Louis Davidson creates a table:

Continuing on with the REGEXP_ functions series, the next one I want to cover is the table valued function REGEXP_SPLIT_TO_TABLE. This function is definitely one of the ones you probably ought to know, especially if you are ever tasked to pull some data out of a data structure.

This function is a lot like the STRING_SPLIT function, and unlike things like the REGEXP_LIKE function, you can basically use the same main parameters as you used in STRING_SPLIT for simple cases, but from there the possibilities are far broader because you can define almost any delimiters you want. It isn’t perfect, because of a few things, but we will discuss that more later on.

Read on to see how it works, including one major caveat.
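
As a small illustration of the "almost any delimiters" point, this sketch splits on a comma, semicolon, or pipe with optional surrounding whitespace, something STRING_SPLIT cannot do in one call. It assumes SQL Server 2025 at compatibility level 170, and the input string is made up for the example.

```sql
-- Assumption: SQL Server 2025, database compatibility level 170.
-- The delimiter pattern matches any of , ; | plus surrounding whitespace.
SELECT s.value, s.ordinal
FROM REGEXP_SPLIT_TO_TABLE('apple, banana;cherry |  date', '\s*[,;|]\s*') AS s
ORDER BY s.ordinal;
```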

Date Intervals in PostgreSQL Window Functions

Published 2025-08-29 by Kevin Feasel

Hubert Lubaczewski solves a problem:

Since I can’t copy paste the text, I’ll try to write what I remember:

Given table sessions, with columns: user_id, login_time, and country_id, list all cases where a single account logged in to the system from more than one country within a 2-hour time frame.

The idea behind it is that it would be a tool to find hacked accounts, based on the idea that you generally can’t change country within 2 hours. Which is somewhat true.

The solution in the blog post suggested joining the sessions table with itself, using some inequality condition. I think we can do better…

Click through for a solution that works for PostgreSQL but not SQL Server because the latter doesn’t offer date and time intervals on window function frames.

To do this in SQL Server, I’d probably use LAG() and get the prior value of country ID and the prior login time. Something like the following query, though I didn’t run detailed performance checks.

WITH records AS
(
	SELECT
		s.user_id,
		s.login_time,
		s.country_id,
		LAG(s.login_time) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_login_time,
		LAG(s.country_id) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_country_id
	FROM sessions s
)
SELECT *
FROM records r
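WHERE
	r.prior_country_id <> r.country_id
	-- DATEDIFF(HOUR, ...) counts hour boundaries crossed, not elapsed time,
	-- so compare against an exact two-hour cutoff instead.
	AND r.login_time <= DATEADD(HOUR, 2, r.prior_login_time);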

Replacing Text in SQL Server 2025 via Regular Expression

Published 2025-08-28 by Kevin Feasel

Louis Davidson continues a series on regular expressions in SQL Server 2025:

Okay, we have gone through as much of the RegEx filtering as I think is a part of the SQL Server 2025 implementation. Now it is time to focus on the functions that are not REGEXP_LIKE. We have already talked about REGEXP_MATCHES, which will come in handy for the rest of the series.

I will start with REGEXP_REPLACE, which is like the typical SQL REPLACE function. But instead of replacing based on a static delimiter, it can be used to replace multiple values (or a specific one) that match the regular expression. All of my examples for this entry will simply use a variable with a value we are working on, so no need to create or load any objects.

Read on to see how it works, including plenty of examples.
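
In the same spirit as the variable-only examples Louis describes, here is a brief sketch contrasting REPLACE with REGEXP_REPLACE. It assumes SQL Server 2025; the variables and their values are invented for the example and are not from his post.

```sql
-- Assumption: SQL Server 2025. Sample values are hypothetical.
DECLARE @phone varchar(50) = '(555) 123-4567';
DECLARE @iso_date varchar(10) = '2025-08-28';

SELECT
	-- REPLACE handles one literal string at a time
	REPLACE(@phone, '-', '') AS static_replace,
	-- REGEXP_REPLACE removes every character that is not a digit
	REGEXP_REPLACE(@phone, '[^0-9]', '') AS digits_only,
	-- Capture groups can be reordered in the replacement with \1, \2, \3
	REGEXP_REPLACE(@iso_date, '(\d{4})-(\d{2})-(\d{2})', '\3/\2/\1') AS day_first;
```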
