Press "Enter" to skip to content

Category: T-SQL

Named Groups in T-SQL Regular Expressions

Andy Brownsword digs into a nice capability around using regular expressions:

Now that we’re on the cusp of adoption within SQL Server, it’ll be a valuable tool there too. However, after trying it out last week, one omission stood out – one of my favourite features for string parsing: Named Groups.

Where a usual expression can be used to match a string, Named Groups can also be used to extract details from the string. Using an example from Steve’s invitation where a PO number 20260720321433 begins with a year/month and then a number, this could be split with named groups:

Andy mentions the community displeasure for CLR and that displeasure annoys me to no end. I think 90% of the hysteria around CLR in SQL Server was a misunderstanding in terms and unwillingness to learn other programming languages. If you ever catch me in person, I’ll rant about it at length.

Comments closed

HTML Parsing in T-SQL

Louis Davidson goes looking for list items:

From the title of “Favorite String Parsing”, I will say 100% it is using SQL Server 2025’s addition of Regular Expressions. Previously, parsing text in SQL Server was one of my least favorite things to do. Regular expressions will make it just a bit nicer, because it has a lot more power than SUBSTRINGLEFTRIGHT, and CHARINDEX/PATINDEX. All generally “good enough” functions for a lot of the things you need to do, but often woefully inadequate for parsing large amounts of text.

Yeah, T-SQL RegEx is definitely a nicer approach, though HTML doesn’t have to follow the consistency rules of XML due to browsers being very forgiving in their interpretation of the language, so it’s easy to get tangled up trying to parse websites.

Comments closed

String Parsing in T-SQL

Rob Farley shares some thoughts:

But let’s talk about non-regex methods for parsing strings and the patterns that I use. I find that the biggest issue with most string parsing is complexity. Even something as simple as finding the value between the 2nd and 3rd hyphens can be done in different ways with different levels of complexity, and even if it works, maintaining that code can become really hard.

For example, finding the position of the first hyphen might be as simple as using the CHARINDEX function. Finding the second might involve two CHARINDEX functions, and calling SUBSTRING with parameters that have increasingly nested CHARINDEX calls… well, you can see how the complexity quickly builds

Rob digs into one of my favorite use cases for the APPLY operator: simplifying calculations, or in this case simplifying expression chains. Granted, I have also grown to appreciate the DuckDB solution of allowing for function chaining. The demo examples in that documentation are limited but you can do things like goose_name.lower().replace('goose', 'duck').replace(' ', '') and it will work fine.

Comments closed

Concatenating String Values in SQL Server 2025

Louis Davidson plays around with the new string concatenation operator:

I was looking around for stuff to post about the other day, so I read through the latest What’s new in SQL Server 2025 and I found this:

|| (String concatenation) Concatenate expressions with expression || expression.

My interest was piqued to say the least. Using + to concatenate string data has always had its issues, and as much as I love the CONCAT function, it is kind of clunky to use versus a proper operator.

This is part of the ANSI SQL standard, and that standard does specify most of its behavioral patterns. Read on as Louis tries it out in a variety of circumstances.

Comments closed

Tuning Window Functions in SQL Server

I have a new video:

In this video, I show you various techniques you can use to make window functions faster, including proper indexing and usage of batch mode. I also demonstrate the performance difference between RANGE and ROWS.

This wraps up my series on window functions, and although I pack a lot of content into the video, I highly recommend checking out the links for deeper dives into performance.

Comments closed

Generating Gaussian Numbers in SQL Server

Sebastiao Pereira generates random numbers:

Generating random numbers from a normal distribution is essential for accuracy and realistic modeling, simulation, inference, and algorithm design. This is useful for scientific, engineering, statistical, and AI domains. Let’s see how we can create Box Muller transform functionality in SQL Server without the use of external tools.

I’ve used this technique before in SQL Server and it works really well. Performance is fine and the results approximate a normal distribution like you’d want.

Comments closed

Table Switching in SQL Server

Rebecca Lewis swaps in a bag of sand while snatching the golden idol:

Use ALTER TABLE … SWITCH to move very large tables instantly.  Yes.  I said instantly.  ALTER TABLE .. SWITCH doesn’t copy the data or physically move it.  It just reassigns the page ownership.  This means that only the metadata with the data pointer changes, and that’s why it completes in milliseconds and barely touches the transaction log.

There are some strict conditions around when you can use this technique, but it’s as powerful as Rebecca mentions.

Comments closed

Static and Dynamic Bulk Insert into SQL Server

Rick Dobson inserts some data:

There are numerous use cases for multi-file imports of CSV files into a SQL Server table:

  • Dynamic SQL Server bulk insert loads are especially appropriate for tasks that extract content from multiple files to a SQL Server table where the source file names change between successive import jobs.
  • Static bulk insert loads target scenarios where the source file names do not change between successive import jobs.

Read on for examples of how to implement each. Admittedly, bulk insert has rarely worked all that well in my experience, whether due to permissions mishaps, poor data integrity, or sudden changes in data types between file runs. But it does tend to work a lot better if you have a specified data interchange format and a standardized process to prepare the data and make it available on disk for insertion.

Comments closed