Press "Enter" to skip to content

Category: T-SQL

Testing Implicit Conversion and Performance in SQL Server

Louis Davidson runs some tests:

If you have ever done any performance tuning of queries in SQL Server, no doubt one of the first thing you have heard is that your search argument data types need to match the columns that you are querying. Not one thing in this blog is going to dispute that. Again, the BEST case is that if your column is an nvarchar, your search string matches that column datatype. But why is this? I will do my best to make this pretty clear, especially why it doesn’t always matter.

Read on as Louis lays out the explanation.

Leave a Comment

NTILE and Uneven Row Distribution

Jared Westover clarifies:

One of the simplest yet least-popular ranking functions in T-SQL is NTILE. It’s useful for dividing data into buckets or tiles. However, when your data isn’t evenly distributed across buckets, the results are confusing. Also, NTILE sometimes returns rows in a seemingly random order. What’s happening here?

There’s absolutely a pattern to how NTILE() works, as Jared describes.

Leave a Comment

Implementing Shamir’s Secret Sharing in SQL Server

Sebastiao Pereira implements an algorithm:

Shamir’s Secret Sharing is a cryptographic algorithm that allows a secret to be split into multiple components and shared among a group in such a way that the secret can only be revealed if a minimum number of components are combined. Is it possible to have this algorithm implemented in SQL Server without using external tools?

Click through for a T-SQL implementation, as well as one using CLR.

Leave a Comment

Combining UNION and UNION ALL

Greg Low crosses the streams:

Until the other day though, I’d never stopped to think about what happens when you mix the two operations. I certainly wouldn’t write code like that myself but for example, without running the code (or reading further ahead yet), what would you expect the output of the following command to be? (Note: The real code read rows from a table but I’ve mocked it up with a VALUES clause to make it easier to see the outcome).

Read on to see what happens.

Leave a Comment

Creating Data from Literals in SQL Server

Louis Davidson has values. Many, many values:

Row Creators were introduced in SQL Server 2008, and allow you to create multiple rows in a single INSERT statement by using the VALUES clause. In this blog, I will demonstrate a few ways that we have created data in tables, and then show how you can do this with row constructors.

It’s not the only neat trick with VALUES(), either: you can also use CROSS APPLY and VALUES() to perform an efficient unpivot, turning a long virtual table into a wide virtual table.

Leave a Comment

Running Totals over Arbitrary Date Ranges

Louis Davidson solves an interval problem:

Say you want to find the most recent 30-day period during which a person purchased some amount of products from your company. How you market to a customer might change if they have been active over a time period recently, or even in the past. But this also means that for each day going back in history, you need to sum historic data over and over, and the previous 29 days of activity. This is generally known as a rolling total. Doing this sort of calculation has been an interesting problem for many years.

When window functions came around, they became quite useful for such tasks, but they have one kind of complicated problem: gaps in source data patterns.

Funnily enough, there is a solution using window functions: range intervals. The ANSI SQL definition for RANGE (versus ROWS) for window functions does allow for the specification of a date range, like RANGE BETWEEN INTERVAL '30' DAY PRECEDING AND CURRENT ROW. Very impressive.

Unfortunately, SQL Server doesn’t support these. PostgreSQL does, but it’s an area I’ve agitated about for a few years and I do hope that someday, the SQL Server product team will support this functionality. In the meantime, Louis has a solution that works well for the task.

Comments closed

SELECT * in EXISTS Redux

Louis Davidson follows up from a prior post:

For example, it is often said that SELECT * makes your queries slower. In a nuanced way, this is often true, but only if changes occur in the database where columns are added. So many readers (myself included) see something that is demonstrably not 100% being treated as such, and they tune out.

There are plenty of other reasons you shouldn’t use that construct, no matter what.

In this post, I want to admit to having my mind changed, and I will go back and change the previous post.

One thing I really appreciate about Louis is his willingness to listen to new information, update his priors, and outright say “Hey, here’s what I thought before and now I believe this instead.” That’s a commendable trait.

Comments closed

Diving into DISTINCT

Louis Davidson is one of a kind:

If there is one SQL keyword that causes more fear than any other, it’s DISTINCT. When I see it in a query, I immediately start to worry about just how much work I am in for to ensure the correctness of that query. I start scanning for comments to describe why it is there, and if none are found, I know the query is probably going to be wrong.

I have seen DISTINCT used to hide bad joins, missing grouping, and even missing WHERE clauses. I have seen developers use it as a “fix-all” for data problems.

In this blog, I will look at the proper use and distinctly dangerous uses of DISTINCT and also show how you might test your query that uses DISTINCT to see what it is actually covering up.

Louis also includes one of my “favorite” coding errors: the accidental self-join. Done that one too many times to be proud of.

Comments closed

Choosing DENSE_RANK() vs RANK()

Jared Westover explains the difference:

Recently, a developer asked for my input on solving a complex SQL query problem. As we went through each step, I suggested using a ranking function at one point, specifically RANK. However, they were having trouble deciding when to use RANK versus DENSE_RANK in a query. It raises a good question: When should you use DENSE_RANK instead of RANK?

Jared’s analogies for when to use which are very similar to the ones I use: a race versus levels.

Comments closed

Concatenation via Double Pipe Operator in SQL Server 2025

Rajendra Gupta shows off a new operator:

SQL Server 2025 adds the double pipe (||) operator for string concatenation. What is the double pipe (||) operator, and how different is it from the existing plus (+) and CONCAT function for concatenation? Let’s check it out in this article.

I still prefer CONCAT() and CONCAT_WS() for display, and would be indifferent between += and ||= for appending strings. But for companies that need to write ANSI-compliant code, it’s a positive.

Comments closed