Erik Darling answers an office hours question in detail. The question comes down to why a filter on ROW_NUMBER()
where the row number is equal to 1 could differ from the same query where row number is less than or equal to 1. Knowing that ROW_NUMBER()
starts at 1 and can never be anything other than a natural number, you’d think SQL Server would treat these exactly the same. But Erik shows an example where the two can differ, and the explanation is a good one. I will admit that my pre-video guess was wrong, but once he showed the execution plans, things clicked. And, as Erik mentions, this is why it’s so important to dig into the execution plan: the answers are typically in there somewhere.
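For reference, the two filter shapes in question look something like this (a minimal sketch; the table and column names are my own illustration):

WITH NumberedOrders AS
(
    SELECT
        o.CustomerID,
        o.OrderDate,
        rn = ROW_NUMBER() OVER (PARTITION BY o.CustomerID ORDER BY o.OrderDate DESC)
    FROM dbo.Orders AS o
)
SELECT CustomerID, OrderDate
FROM NumberedOrders
WHERE rn = 1;  -- logically the same as WHERE rn <= 1, but the optimizer may build different plans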
Gilbert Quevauvilliers grabs a query:
Load testing is essential when working with Microsoft Fabric capacity. With limited resources, deploying a Power BI report without testing can lead to performance issues, downtime, and frustrated users. In this series, I’ll show you how to automate load testing using Fabric Notebooks, making the process faster, easier, and repeatable.
Inspired by Phil Seamark’s approach, this method eliminates manual complexity and allows you to capture real user queries for accurate testing.
Read on for the first part, in which Gilbert uses the Performance Analyzer to capture query details.
Courtney Woolum splits a string:
If you’ve escaped string parsing thus far, count thyself lucky. I’ve seen some numbing scripts from before SQL Server 2016, when STRING_SPLIT was introduced. I think the fact that this month’s T-SQL Tuesday is entirely on string parsing says enough about how many ways there are to paint this particular cat.
In the post, Courtney mentions learning early on to avoid using the APPLY operator. I want to have harsh words with whoever taught her that. Purposefully avoiding the APPLY operator artificially hamstrings your ability to write effective T-SQL code.
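For what it’s worth, APPLY and STRING_SPLIT pair naturally. Here’s a minimal sketch, with a hypothetical Posts table holding a comma-delimited Tags column:

-- One output row per tag: CROSS APPLY runs STRING_SPLIT once per row of Posts.
SELECT p.PostID, t.value AS Tag
FROM dbo.Posts AS p
CROSS APPLY STRING_SPLIT(p.Tags, ',') AS t;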
Hugo Kornelis prefers surrogate keys:
I am currently most known for my performance tuning and execution plan work. But when I started working with databases, I actually came from a background of data modelling, database design, and normalization. And that has never fully left me. In fact, I have in the past two years created a whole series of YouTube videos about database design and normalization. And a much longer time ago, I recorded a Pluralsight course on this topic that is still available for viewing.
One of the very basics of schema design for a relational database is to store atomic values in every column. One column, one value, no more, no less. That automatically rules out all repeating groups. So a single column to list all my email addresses? Sorry. You are doing it wrong. And you will pay the price when you try to protect the integrity of your data. Or even just report on it.
I wanted to copy this second paragraph because CJ Date, in his book Database Design and Relational Theory (2nd edition), issues a mea culpa around repeating groups, stating that it’s best to ignore his prior arguments on the topic. Though in Date’s case, he specifies a repeating group as something like { Name, Email1, Email2, Email3, … } rather than a delimited list.
But even composite items can be in 1st normal form. For example, a US telephone number has a country code (+1), an area code, an exchange, and a four-digit number, followed potentially by an extension. The name “Bob” is an array of characters, and each array of characters is a composite of bits forming 1-4 bytes depending on collation and other details. At the end of the day, first normal form is about the shape of the tuple (a heading exists with a known set of names and data types; all tuples follow the same header; no duplicate tuples are allowed; attribute and tuple order does not matter for operations; and all attributes are regular inasmuch as they have names, data types, are not hidden, etc.).
At the end of the day, what Hugo is saying is good practice: if you have a business need to identify segments of an attribute separately, then it makes sense for each segment to be an attribute on its own. But because there is no solid mathematical property that explains exactly what an attribute is, and because database normalization is ultimately a series of mathematical formulations, we cannot use normalization as the reason to keep or separate the contents of an attribute. Thankfully, there is more to database design than normalization alone (and thankfully, database normalization itself is such a robust field that provides good advice that people should follow).
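To make Hugo’s email example concrete, the normalized alternative to a delimited list (or Email1/Email2/Email3 columns) is a child table with one row per address. A minimal sketch, with hypothetical names:

-- The repeating group becomes rows; the primary key enforces uniqueness per person.
CREATE TABLE dbo.PersonEmail
(
    PersonID int NOT NULL
        CONSTRAINT FK_PersonEmail_Person FOREIGN KEY REFERENCES dbo.Person (PersonID),
    EmailAddress nvarchar(320) NOT NULL,
    CONSTRAINT PK_PersonEmail PRIMARY KEY (PersonID, EmailAddress)
);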
Andy Brownsword digs into a nice capability around using regular expressions:
Now that we’re on the cusp of adoption within SQL Server, it’ll be a valuable tool there too. However, after trying it out last week, one omission stood out – one of my favourite features for string parsing: Named Groups.
Where a regular expression can be used to match a string, Named Groups can also be used to extract details from the string. Using an example from Steve’s invitation, where a PO number 20260720321433 begins with a year/month and then a number, this could be split with named groups:
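Andy shows the actual pattern in the post; as an illustrative sketch (the group names are my own), a .NET-style regular expression with named groups for that PO number could look like this:

(?<Year>\d{4})(?<Month>\d{2})(?<Number>\d+)

Applied to 20260720321433, that captures Year = 2026, Month = 07, and Number = 20321433, with each piece retrievable by name rather than by position.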
Andy mentions the community’s displeasure with CLR, and that displeasure annoys me to no end. I think 90% of the hysteria around CLR in SQL Server was a misunderstanding of terms and an unwillingness to learn other programming languages. If you ever catch me in person, I’ll rant about it at length.
Louis Davidson goes looking for list items:
From the title of “Favorite String Parsing”, I will say 100% it is using SQL Server 2025’s addition of Regular Expressions. Previously, parsing text in SQL Server was one of my least favorite things to do. Regular expressions will make it just a bit nicer, because it has a lot more power than SUBSTRING, LEFT, RIGHT, and CHARINDEX/PATINDEX. All generally “good enough” functions for a lot of the things you need to do, but often woefully inadequate for parsing large amounts of text.
Yeah, T-SQL RegEx is definitely a nicer approach, though HTML doesn’t have to follow the consistency rules of XML due to browsers being very forgiving in their interpretation of the language, so it’s easy to get tangled up trying to parse websites.
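To illustrate the gap Louis describes, here’s a side-by-side sketch (assuming SQL Server 2025’s REGEXP_SUBSTR; the sample string is my own):

DECLARE @s nvarchar(100) = N'Order: ABC-12345 shipped';

-- Classic approach: PATINDEX finds where the digits start, but the length is
-- hard-coded; computing it generically takes more gymnastics.
SELECT SUBSTRING(@s, PATINDEX(N'%[0-9]%', @s), 5);

-- Regex approach: describe the shape of what you want and let the engine find it.
SELECT REGEXP_SUBSTR(@s, N'[0-9]+');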
Elena Drakulevska builds a switch:
We learned in the last post that while dark UI feels sleek, it’s not automatically accessible and it shouldn’t be your default strategy (hello, contrast + glare). In most cases, light mode is the more accessible baseline (just imagine trying to work on a sunny beach or on your balcony with dark mode… nightmare).
But UX is also about choice. Some users love light, some swear by dark. So let’s give them control.
Read on to see how, without sacrificing much accessibility.
Svetlana Golovko wants a report of failing reports:
We would like to get notifications if any SSRS report subscriptions fail. This feature was working great until it wasn’t. The scheduled SSRS subscription in the SQL Server Agent job didn’t fail and we were not aware of any failures. In some cases, a partial data-driven subscription failed.
Moreover, after some testing we realized that the SQL Server Agent job won’t fail for the Standard subscriptions (that are available in SQL Server Standard Edition) either.
Read on to see what happens when a report subscription delivery fails and how you can get details on the failure.
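If you want to check on this yourself, the ReportServer catalog tracks each subscription’s most recent outcome. A rough sketch (run against the ReportServer database; column names can vary by SSRS version):

SELECT
    c.Name AS ReportName,
    s.LastStatus,
    s.LastRunTime
FROM dbo.Subscriptions AS s
    INNER JOIN dbo.[Catalog] AS c
        ON c.ItemID = s.Report_OID
WHERE s.LastStatus LIKE N'%fail%'
    OR s.LastStatus LIKE N'%error%';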
Ye Xu makes use of the Copy job:
Copy job is the go-to solution in Microsoft Fabric Data Factory for simplified data movement, whether you’re moving data across clouds, from on-premises systems, or between services. With native support for multiple delivery styles, including bulk copy, incremental copy, and change data capture (CDC) replication, Copy job offers the flexibility to handle a wide range of data movement scenarios—all through an intuitive, easy-to-use experience. Learn more in What is Copy job in Data Factory – Microsoft Fabric | Microsoft Learn.
With Copy job, you can also perform cross-tenant data movement between Fabric and other clouds, such as Azure. It also enables cross-tenant data sharing within OneLake, allowing you to copy data across Fabric Lakehouse, Warehouse, and SQL DB in Fabric between tenants with SPN support. This blog provides step-by-step guidance on using Copy job to copy data across different tenants.
Click through for a demonstration, as well as the security permissions that are necessary for this to work.
Rob Farley shares some thoughts:
But let’s talk about non-regex methods for parsing strings and the patterns that I use. I find that the biggest issue with most string parsing is complexity. Even something as simple as finding the value between the 2nd and 3rd hyphens can be done in different ways with different levels of complexity, and even if it works, maintaining that code can become really hard.
For example, finding the position of the first hyphen might be as simple as using the CHARINDEX function. Finding the second might involve two CHARINDEX functions, and calling SUBSTRING with parameters that have increasingly nested CHARINDEX calls… well, you can see how the complexity quickly builds.
Rob digs into one of my favorite use cases for the APPLY operator: simplifying calculations, or in this case, simplifying expression chains. Granted, I have also grown to appreciate the DuckDB solution of allowing for function chaining. The demo examples in that documentation are limited, but you can do things like goose_name.lower().replace('goose', 'duck').replace(' ', '') and it will work fine.
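Here’s a minimal sketch of the APPLY pattern Rob describes, staging each CHARINDEX result so later steps can reference earlier ones by name instead of nesting calls ever deeper (the sample string is my own):

DECLARE @s varchar(50) = 'AB-CD-EF-GH';

-- Each APPLY names an intermediate position, so the next step builds on it.
SELECT SUBSTRING(@s, h2.pos + 1, h3.pos - h2.pos - 1) AS BetweenSecondAndThird
FROM (VALUES (CHARINDEX('-', @s))) AS h1 (pos)
CROSS APPLY (VALUES (CHARINDEX('-', @s, h1.pos + 1))) AS h2 (pos)
CROSS APPLY (VALUES (CHARINDEX('-', @s, h2.pos + 1))) AS h3 (pos);

For 'AB-CD-EF-GH', that returns 'EF', and adding a fourth or fifth segment is one more APPLY rather than another layer of nesting.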