Category: T-SQL

Finding the Max Value Across Multiple Columns

Erik Darling shows a couple techniques for finding the maximum value across several columns, whether they’re in one table or in more than one:

It’s sorta kinda pretty crazy when every major database platform has something implemented, and SQL Server doesn’t.

Geez, even MySQL.

But a fairly common need in databases is to find the max value from two columns.

Maybe even across two tables.

Read on to see how you can do this.
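As a minimal sketch of one common pattern for this (not necessarily the approach Erik takes), you can unpivot the columns with a VALUES constructor and take MAX over the result. The table variable and column names below are made up for illustration. Newer versions (SQL Server 2022 and Azure SQL Database) also ship a built-in GREATEST function.

```sql
-- Hypothetical sample data: three columns we want the row-wise max of.
DECLARE @Scores TABLE (Id int PRIMARY KEY, A int, B int, C int);

INSERT @Scores (Id, A, B, C)
VALUES (1, 5, 9, 2),
       (2, 7, 3, 8);

-- Turn the three columns into three rows per Id, then aggregate them.
SELECT
    s.Id,
    (
        SELECT MAX(v.Val)
        FROM (VALUES (s.A), (s.B), (s.C)) AS v (Val)
    ) AS MaxVal
FROM @Scores AS s
ORDER BY s.Id;
-- Id 1 -> 9, Id 2 -> 8
```

The same shape works across tables: join them first, then feed columns from each side into the VALUES list.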

Fun with the TOP Operator

Jared Poche takes a look at the TOP operator and learns a bit along the way:

Sort is a blocking operator. Don’t feel bad if you haven’t heard of the term; I’ve been working with SQL Server for 15 years, and I’m sure I never heard the term until the incomparable Grant Fritchey mentioned it while he was lecturing at my place of employment.

So sorts and several other types of operators (eager spools, remote query\scan\etc, hash match joins, and more) will block the normal flow and gather all their results before passing any rows on. The hash match join only blocks while building its hash table from the first input, before probing the second.

Read the whole thing. Jared is just getting started with blogging, too, so go pay his blog a visit.
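To see the idea for yourself, here is a rough sketch using a system catalog view so it runs anywhere. This is my own illustration rather than Jared's example, and the exact plan shape depends on your version and on what indexes exist, so check the actual execution plans.

```sql
-- No ORDER BY: TOP can stop pulling rows as soon as it has five,
-- because nothing upstream needs to see the whole input first.
SELECT TOP (5) o.name, o.create_date
FROM sys.all_objects AS o;

-- ORDER BY on a column with no supporting index: the plan will typically
-- include a Sort (often shown as a Top N Sort), which has to consume
-- its entire input before it can hand the first row to TOP.
SELECT TOP (5) o.name, o.create_date
FROM sys.all_objects AS o
ORDER BY o.create_date DESC;
```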

Handling Forbidden XML Characters with SQL Server

Slava Murygin shows how we can use Unicode characters to make XML appear to display special characters:

It is a well-known issue that SQL Server’s XML does not accept the characters “&”, “<” and “>”.
There are two more forbidden XML characters, ' and " (single and double quotes), but SQL Server mostly accepts them.

The common solution is to replace these characters with their codes.
Say we have a silly sentence: “Anne & Robin collect > "berries" than Jane & Kevin, but < than Ivan & Lucy.”

Slava’s post is specifically geared toward wanting to view the characters as-is, not store them for later display. I’m not sure how often that comes up, but it’s a valid use case.
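For contrast, here is a small sketch of the "common solution" the excerpt mentions, standard entity encoding, rather than Slava's look-alike Unicode technique. The variable name is mine. FOR XML PATH entitizes the text on the way in, and the xml value() method turns it back into the original string on the way out.

```sql
DECLARE @Sentence nvarchar(200) =
    N'Anne & Robin collect > "berries" than Jane & Kevin, but < than Ivan & Lucy.';

-- Without TYPE the result is a string, so the entities are visible:
-- & becomes &amp;, > becomes &gt;, < becomes &lt;.
SELECT (SELECT @Sentence AS [text()] FOR XML PATH('')) AS Entitized;

-- With TYPE the result is xml; value() decodes the entities again,
-- giving back the sentence exactly as it was typed.
SELECT (SELECT @Sentence AS [text()] FOR XML PATH(''), TYPE)
           .value('.', 'nvarchar(200)') AS RoundTripped;
```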

Fastest Way to Delete Lots of Rows in SQL Server

Aaron Bertrand tries out a few methods of deleting data and looks at how SQL Server configuration settings change the calculus:

That took far longer than I’m comfortable admitting. Part of that was because I had originally included a 0.1% test for rowperloop which, in some cases, took several hours. So I removed those from the table a few days in, and can easily say: if you are removing 1,000,000 rows, deleting 1,000 rows at a time is highly unlikely to be an optimal choice, regardless of any other variables.

I think Aaron lays out the caveats pretty well, but I’d reiterate that the main benefit behind chunking delete operations is not so much to make things faster as to reduce the amount of time you spend blocking more important work, like user queries, and to reduce the risk of blowing out the transaction log file (and maybe running out of disk space too).
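For reference, the basic shape of a chunked delete looks something like the sketch below. The temp table, the cutoff date, and the batch size are all made up for illustration and this is not Aaron's test harness; the right batch size is exactly the kind of thing his post says you need to measure.

```sql
-- Hypothetical table with roughly ten years of dates to purge from.
CREATE TABLE #BigTable
(
    Id int IDENTITY(1, 1) PRIMARY KEY,
    CreatedDate date NOT NULL
);

INSERT #BigTable (CreatedDate)
SELECT TOP (100000)
    DATEADD(DAY,
            -CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 3650 AS int),
            CAST('20240101' AS date))
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

DECLARE @BatchSize int = 10000,  -- assumption: tune this for your system
        @Rows int = 1;

-- Each DELETE commits on its own, so locks are held only for one batch
-- at a time and the log can clear between batches (given log backups or
-- checkpoints, depending on the recovery model).
WHILE @Rows > 0
BEGIN
    DELETE TOP (@BatchSize)
    FROM #BigTable
    WHERE CreatedDate < '20200101';

    SET @Rows = @@ROWCOUNT;
END;

SELECT COUNT(*) AS RemainingOldRows
FROM #BigTable
WHERE CreatedDate < '20200101';  -- 0 once the loop finishes

DROP TABLE #BigTable;
```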

Partition Switching to Make Table Changes

Daniel Hutmacher shows a couple things you can change with near-zero downtime using partition switching:

Look, I’m not saying that you’re the type that would make a change in production while users are working.

But suppose that you would want to add an identity column to dbo.Demo, and change the clustered index to include that identity column, and make the index unique? Because it’s the table’s clustered index, you’re effectively talking about rebuilding the table (remember, the clustered index is the table), which involves reorganizing all of the rows into a new b-tree structure. While SQL Server is busy doing that, nobody will be able to read the contents of the table.

Daniel mentions a read-only table, though you could also do this with a read-write table as long as you have triggers to keep the two tables in sync until go time. That adds to the complexity, but it is an option if you need it.

Character Generation with T-SQL

Bill Fellows shows off what you can do with character generation in T-SQL:

The mod or modulus function will return the remainder after division. Modding a value is a handy way to constrain a value between 0 and an upper threshold. In this case, if I modded any number by 26 (because there are 26 characters in the English alphabet), I’ll get 0 to 25 as my result.

Knowing that the modulus function will give me 0 to 25 and knowing that my target character range starts at 65, I could use the previous expression to print any number’s ascii value like SELECT CHAR((2147483625 % 26) + 65) AS StillB;. Break that apart, we do the modulus, %, which gives us the value of 1 which we then add to the starting offset (65).

He also provides a DB Fiddle for it.
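If you just want to see the wraparound in action, here is a tiny sketch of my own (not Bill's code) that maps a handful of numbers onto A through Z:

```sql
-- Number % 26 is always 0..25 for non-negative input; adding 65
-- shifts that into the ASCII range for 'A' (65) through 'Z' (90).
SELECT
    n.Number,
    n.Number % 26 AS Remainder,
    CHAR((n.Number % 26) + 65) AS Letter
FROM (VALUES (0), (1), (25), (26), (27), (2147483625)) AS n (Number)
ORDER BY n.Number;
-- 0 -> A, 1 -> B, 25 -> Z, 26 -> A, 27 -> B, 2147483625 -> B
```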

Fun with CHAR(0)

Kenneth Fisher learns a bit about the 0 byte:

Ok, now things are getting interesting. An ASCII value of 0? I’ve never heard of that. I honestly didn’t know it was possible. As it happens, yes, it’s a real value and in SSMS it does a few strange things.

In the comments, Denis Gobo is right: the 0 byte is the null terminator, which C-style strings place at the end to indicate that there’s nothing more to read.
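A quick way to convince yourself that the byte really is there, even when a client tool won't show it, is to compare what the engine stores with what you see on screen. This is a small sketch of my own; how the last column displays depends on the client and the output mode, so I make no promises about it.

```sql
DECLARE @s varchar(10) = 'abc' + CHAR(0) + 'def';

SELECT
    ASCII(CHAR(0)) AS AsciiValue,  -- 0
    DATALENGTH(@s) AS Bytes,       -- 7: the 0 byte is stored like any other
    @s AS Displayed;               -- client-dependent; may appear cut off
```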

View Filters and Short-Circuiting

Reitse Eskens takes us through a fun oddity with short-circuiting and views:

The question from my coworker was simple. Why is this happening? Because he’s selecting from the view, his instinct is that the returned result set should be filtered within the view first and that the resultset can be narrowed down further with the regular query.

It’s interesting that there’s deterministic behavior both ways. My recollection is that ANSI SQL does not honor short-circuiting, as all filters are considered to happen at the same time, and thus any ordering is valid. But in practice, there are places where different code bases end up with stable short-circuiting as an implementation detail.
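The practical upshot is to write predicates that are safe in whatever order they run. Below is a minimal sketch with made-up data, using a derived table to stand in for the view (both get inlined into the outer query). It is not Reitse's repro.

```sql
DECLARE @Raw TABLE (Val varchar(10) NOT NULL);

INSERT @Raw (Val)
VALUES ('1'), ('20'), ('abc');

-- The inner filter keeps only all-digit strings, but nothing guarantees
-- it is evaluated before the outer predicate. With CAST(v.Val AS int)
-- the query could fail on 'abc'; TRY_CONVERT returns NULL for it
-- instead, so the result is the same whichever filter runs first.
SELECT v.Val
FROM
(
    SELECT r.Val
    FROM @Raw AS r
    WHERE r.Val NOT LIKE '%[^0-9]%'
) AS v
WHERE TRY_CONVERT(int, v.Val) > 10;
-- returns '20'
```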

Capturing Inserts and Updates in MERGE Statements

The Purple Frog folks show us how to collect the counts of insert and update operations when using MERGE statements:

This post shows how you can capture and store the number of records inserted, updated or deleted from a T-SQL Merge statement.

This is in response to a question on an earlier post about using Merge to load SCDs in a Data Warehouse.

You can achieve this by using the OUTPUT clause of a merge statement, including the $Action column that OUTPUT returns.

Read on for the answer. If only MERGE weren’t so riddled with problems.
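The general shape, as a self-contained sketch with made-up table variables (the linked post has its own version), is to send $action into a table and count by it afterward:

```sql
DECLARE @Target  TABLE (Id int PRIMARY KEY, Name varchar(20) NOT NULL);
DECLARE @Source  TABLE (Id int PRIMARY KEY, Name varchar(20) NOT NULL);
DECLARE @Actions TABLE (ActionTaken nvarchar(10) NOT NULL);

INSERT @Target (Id, Name) VALUES (1, 'Old'), (2, 'Stale');
INSERT @Source (Id, Name) VALUES (1, 'New'), (3, 'Added');

MERGE @Target AS t
USING @Source AS s
    ON t.Id = s.Id
WHEN MATCHED THEN
    UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, Name) VALUES (s.Id, s.Name)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- $action is 'INSERT', 'UPDATE', or 'DELETE' for each affected row.
OUTPUT $action INTO @Actions (ActionTaken);

SELECT a.ActionTaken, COUNT(*) AS RowsAffected
FROM @Actions AS a
GROUP BY a.ActionTaken;
-- one row each here: Id 1 updated, Id 3 inserted, Id 2 deleted
```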

Understanding PERCENTILE_CONT

Kathi Kellenberger takes us through the PERCENTILE_CONT window function:

I was recently playing with the analytical group of windowing functions, and I wanted to understand how they worked “under the covers.” I ran into a little logic puzzle with PERCENTILE_CONT by trying to write a query that returned the same results using pre-2012 functionality.

Given a list of ranked values, you can use the PERCENTILE_CONT function to find the value at a specific percentile. For example, if you have the grades of 100 students, you can use PERCENTILE_CONT to locate the score in the middle of the list, the median, or at some other percent such as the grade at 90%. This doesn’t mean that the score was 90%; it means that the position of the score was at the 90th percentile. If there is not a value at the exact location, PERCENTILE_CONT interpolates the answer.

I’m a bit disappointed with how poorly PERCENTILE_CONT performs against large data sets, especially if you need multiple percentiles. It’s bad enough that going into ML Services and getting percentiles with R is usually faster for me. But for datasets of fewer than 100K or so rows, it’s the easiest non-CLR method to get the median (with the easiest CLR method being SQL#).
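Here is a tiny sketch of the interpolation with four made-up scores. The position works out to 1 + P * (N - 1), so with four rows the median sits at position 2.5, halfway between the second and third values, and the 90th percentile sits at position 3.7.

```sql
DECLARE @Scores TABLE (Score int NOT NULL);

INSERT @Scores (Score)
VALUES (10), (20), (30), (40);

-- PERCENTILE_CONT is a window function, so it returns a value on every
-- row; DISTINCT collapses that to a single row here.
SELECT DISTINCT
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY s.Score) OVER () AS Median,
    PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY s.Score) OVER () AS P90
FROM @Scores AS s;
-- Median = 25 (between 20 and 30), P90 = 37 (30 + 0.7 * 10)
```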
