Press "Enter" to skip to content

Category: T-SQL

The Pain of NULL

Louis Davidson explains the unknown:

There is no simpler topic in relational comparisons than three valued logic. I am being mostly facetious about this, but in reality, it seems so simple that people don’t think about how a NULL works, and make mistakes all of the time. I was reading a post about this the other day on LinkedIn (which by no means could one ever find again!) where one of the comments chastised the author of the post for not understanding “the fundamentals” of relational theory. The original poster wasn’t completely right (and my post may not be completely perfect either, though I will back most of what I write with code.)

In this post I want to point out a few of the key basics that one really should understand.

Click through for a primer on what NULL means and doesn’t mean. And by the time you’re done, I’d like to interest you in the power of 6th normal form, where you can effectively banish NULL into the abyss (at least until you join the bits back together).

Leave a Comment

Working with PIVOT and UNPIVOT in SQL Server

Ed Pollack explains a pair of operators:

There are few operators in T-SQL that cause developers to scramble for documentation more than PIVOT and UNPIVOT. Beyond documentation, transforming columns into rows (and vice-versa) can often be confusing and frustrating for those of us tasked with reformatting data for use by an application.

This article walks through PIVOT and UNPIVOT, providing examples of simple use cases for both – as well as some more complex scenarios we can run into in real-world data. These can be extraordinarily useful ways to reformat data efficiently and quickly with less code than the alternatives. So, there is no need to fear them again!

Click through for Ed’s article. I definitely don’t fear either PIVOT or UNPIVOT and they can be quite useful. But if you locked me in a room and I couldn’t leave until I came up with the proper syntax for both from memory, well, I’d be in that room for a while.

Leave a Comment

Fuzzy Matching in SQL Server 2025

John Deardurff takes a look at a new capability in SQL Server 2025:

Data rarely arrives in perfect condition. Typos, regional spelling differences, and inconsistent formats make exact matching unreliable in real-world scenarios. That’s where fuzzy matching comes in; and SQL Server 2025 introduces powerful built-in functions to handle it directly in T-SQL.

None of the functions are particularly novel, but it is nice to have them directly available in SQL Server, especially because Integration Services (where some of this functionality lived) has been on life support for a decade.

Leave a Comment

A Primer on Partitioned Views

Erik Darling talks about an old-style way of partitioning in SQL Server:

Erik Darling here with Darling Data. And we’re going to finish off this Friday by talking about partitioned views. And look, there are a lot of things I could say about partitioned views that are great and grand and that have come in handy for me over the years in ways that I’m like, wow, thank you partitioned views. Thank you for not being normal table partitioning. Thank you for existing. 

Read on to see how they work, how you can write into them, things that might prevent you from writing into partitioned views directly, and why you probably don’t want writable partitioned views anyhow.

Leave a Comment

Scoring the Quality of Binary Classification with SQL Server

Sebastiao Pereira quantifies a result:

Machine Learning (ML) is a way of teaching computers to learn from data instead of being explicitly programmed. Performance metrics are essential tools for understanding how well a model actually works. They tell you not just how accurate the model is, but how reliablefair, and useful it will be in real-world applications. In other words, without them, machine learning would be a trial-and-error guesswork.

Binary classification is when each sample is labeled as one of two mutually exclusive classes, referenced to a categorization, like positive or negative.

How do you implement the binary classification performance metric in SQL Server without using external tools?

Click through for a series of metrics to determine how well a binary classification process performed. This post doesn’t include details on how to perform the classification, just what to do once you have the results.

Leave a Comment

Tracking Data Changes

Louis Davidson shows a technique:

A few months ago, I wrote a post about comparing sets of data in SQL, This focussed on a type of challenge that is often a one off challenge. The techniques lists (along with a tool like Redgate’s SQL Data Compare aren’t the point of this post, but they are related because when you do change detection from a source, it is super important to check your results occasionally. Having a copy of the complete source to compare to your destination (even if it is just checksums of the data,) is important.

The techniques that Meagan is asking about this month are more ETL related, where you check a stream of data and sync them as changes are made. Typically, you don’t want to compare all the rows in a set, but just the ones that have changed. At some frequency, give me the changes to one set of data and keep it up to date with another.

I’m still partial to using HASHBYTES() on the fields I care about because I don’t trust modified dates unless I know the only way to access that table is via stored procedure, and all of the stored procedures handle updating the modified date correctly.

For row comparisons, I’d also look at EXCEPT for overall row comparison. You do need to do it in both directions, so the pattern is more like A EXCEPT B UNION ALL B EXCEPT A but it works great and natively handles any NULLs along the way.

Leave a Comment

The Power of COALESCE()

Lukas Vileikis shows off a bit of ANSI SQL syntax:

When the need to deal with NULL values arises, multiple queries come onto the scene. The SQL COALESCE function is one of them. In simple terms, the SQL COALESCE function is a ‘fallback’ mechanism for missing data. Its only task is to return the first non-NULL value from a list of values.

I used to be a big believer in COALESCE() all of the time, but it turns out that ISNULL() is faster if you only have two things to compare. Granted, it’s not a huge difference in speed, as I recall, but the difference is there.

Leave a Comment

T-SQL Tricks from Recent Versions

Rebecca Lewis has three tricks for us:

Three T-SQL features that have shipped over the last few releases and quietly retired patterns many of us are still using out of habit. Each replaces a stale workaround with one line of code, and in two of three cases it runs much faster, too. Take a look, try them out.

Click through for Rebecca’s list. And if you want a full talk’s worth of these sorts of things, I happen to have one.

Leave a Comment

Shredding JSON into Rows and Columns via T-SQL

Jared Westover shreds a bit of JSON:

Most databases I see nowadays have at least one column that stores JSON objects as NVARCHAR(MAX). If you look hard enough, I bet you have one. How do you convert JSON objects with arrays into a structured format of columns and rows? Not long ago, a developer asked me that exact question. It’s an important question given how rampant JSON is as a data exchange format, especially for web APIs.

This is a primer on SQL Server’s JSON functionality, at least when it comes to turning JSON into standard tabular data. I think, on the whole, SQL Server does a pretty good job of that, at least as long as your JSON data ultimately fits a tabular format.

Leave a Comment

Preventing SQL Injection in Stored Procedures

Vlad Drumea fixes a procedure:

In the past few years, I’ve seen quite a few stored procedures that rely on dynamic T-SQL without properly guarding for SQL injection.

Some cases were reporting stored procedures, while others were maintenance type stored procedures (e.g. stats updates) that could be kicked off from the app, or even stored procedures that handled app upgrades/patching.

In all these cases, certain portions of the dynamic T-SQL relied on input provided by users via input parameters.

Read on for an example. The solution is still the classic combination of QUOTENAME() and sp_execute_sql whenever you have user input.

Comments closed