Press "Enter" to skip to content

Category: T-SQL

A Thought on Query Granularity

Chris Johnson shares some thoughts:

This query is pretty simple, we’re wanting to return a set of data about the orders placed including the total cost of each order. However the orders in the database are split into an Orders table that holds the order level information, and an OrderLines table that holds information on each product ordered in each order, and we need that OrderLines table to get the cost.

So we write our query, and we join to the OrderLines table, and we sum the cost. But then, because we have an aggregate, we need to group by every other column we’re returning. And to me that just doesn’t look right, and it doesn’t convey the intention of the query properly.

In Chris’s simple example, I’m not sure I’d push it very much, but Chris does have a good point in terms of explaining query intent. Also, depending on how many order lines there are relative to orders (the next step in the chain for that query), aggregation in a common table expression could be faster than waiting until after the join to aggregate on all of the columns. In reality, that’s the most likely reason I’d make this change, assuming that it made a big enough performance difference. But if you take a much more complicated query of this sort, then I’d be more amenable to the argument.

Comments closed

Manual Halloween Problem Protection

Jared Poche takes us through Halloween problem protection and builds out his own method, with blackjack and hookers:

Well, if SQL Server is trying to separate the read from the write, why don’t I just do that myself? I had the idea to read the data I needed in an INSERT…SELECT statement, writing into a memory-optimized table variable (motv). I could make sure the read included all the columns I need to calculate the new value, including the CASE statement I had in the SET clause of the UPDATE.

Click through for Jared’s explanation of how to implement it, as well as the circumstances in which it might be faster than what you get by letting a single T-SQL statement handle the job.

Comments closed

Finding the Max (or Min) Value across Columns

Greg Dodd tries out some new syntax:

Have you ever had 2 columns in a table that you need to select the maximum value from? Not the maximum value from each column, but the biggest value after comparing the columns?

Usually where I have this problem, I’d use a CASE statement, but there’s another way. 

I do like GREATEST() and LEAST() (or argmax/argmin if you’re used to those terms), though Greg does include how you can get the same functionality in versions prior to SQL Server 2022.

Comments closed

Building a Wordle Solver in Powershell and T-SQL

Shane O’Neill takes this T-SQL Tuesday to heart:

I got swept up in that wave for a while in the same way I got swept up in the other trends of my time, like Pokemon, Sodoku, and Angry Birds.

Eventually, I stopped when I found a PowerShell script by Kieran Walsh ( github | twitter ) where you could narrow down to the correct answer by putting in the results of your guess each round.

This hack led to me realising how much time I was spending on Wordle and that I should stop, much like I did with Pokemon, Sodoku, and Angry Birds.

So, what better thing to do than to try and recreate that PowerShell script in T-SQL

And given Shane’s time restriction, I’d say the end result is a good one.

Comments closed

Stuffing Characters with STUFF

John McCormack tries out some stuff:

I don’t often go down code based rabbit holes, its just not what I do day to day but a while back, someone asked on twitter’s #sqlhelp hashtag if there was a character length limit to the STUFF function. The documentation didn’t have an answer. For that reason only, I decided to take a look. 

Click through to see what John learned along the way.

Comments closed

T-SQL and Fun Puzzles

Rob Farley puzzles it out:

Back in my uni days I remember a Prolog assignment to solve “each letter represents a number” puzzles, and my solution being slow. Years later I tried it again and it worked out just fine, but by then the due date was in the past and they weren’t prepared to change my grade.

While these kinds of things can be fun (more so when there aren’t uni grades dependent on the solution), there are also times that it can be fun to rewrite some code in a way that is more intuitive, or that feels clever in a profoundly simple way.

Rob shares links to a few examples along those lines.

Comments closed

Optimizing for Mediocre

Erik Darling is always optimal:

Using the OPTIMIZE FOR UNKNOWN hint, or declaring variables inside of a code block to be used in a where clause have the same issue, though: they make SQL Server’s query optimizer make bad guesses, which often lead to bad execution plans.

You can read great detail about that here.

Read on for a bit of a deserved rant and an example to show why OPTIMIZE FOR UNKNOWN often doesn’t solve the problem.

Comments closed

Pattern Searches on Numbers

Erik Darling does an odd search:

Most of the time, it’s some query that looks like this:

SELECT

c = COUNT_BIG(*)

FROM dbo.Posts AS p

WHERE p.OwnerUserId LIKE ‘2265%’

The cruddy thing here is that… SQL Server doesn’t handle this well.

At first, my thought was, “Why not just use an inequality operator like p.OwnerUserID >= 22650000 AND p.OwnerUserID < 22660000 but then I remembered that we’re probably dealing with multiple orders of magnitude here and the whole thing made my head hurt a little.

Comments closed

Ranking Functions in SQL Server

Chad Callihan gets things sorted:

Have you ever needed a Top 10 list? It may sound like a simple task but not all lists are the same. Consider records where there are ties. Does that mean you need more than 10 records? If two records tie for first does that mean the third record is considered to be ranked second? No matter what type of list you need, SQL Server can help you find your way.

Click through for an overview of three of the four ranking functions in SQL Server. Poor NTILE().

Comments closed

Building Sets with GENERATE_SERIES

Aaron Bertrand is a fan of this new function:

I have come across a lot of use cases for manufacturing rows on the fly, aside from the common goal of populating a large data set such as a numbers or calendar table. A few favorites include building sample data, pivoting an unknown number of columns, data extrapolation, and filling gaps in date or time ranges.

If you are on SQL Server 2022 or Azure SQL Database, or have been reading up on new features, you’ve likely heard about one of the better T-SQL enhancements: a new built-in function called GENERATE_SERIES. The syntax is straightforward – it accepts arguments for start and stop, and an optional argument to indicate step (in case you want to iterate by more than 1, or backwards):

Click through to see how performance for this compares to two methods we’ve used in the past to generate similar results.

Comments closed