T-SQL – Page 81 – Curated SQL

Handling Forbidden XML Characters with SQL Server

It is a well-known issue that SQL Server’s XML does not accept the characters “&”, “<” and “>”.
There are two more forbidden XML characters, “'” and “"” (single and double quotes), but SQL Server mostly accepts them.
The common solution is to replace these characters with their codes.
Let’s say we have a silly sentence: “Anne & Robin collect > “berries” than Jane & Kevin, but < than Ivan & Lucy.”

Slava’s post is specifically geared toward wanting to view the characters as-is, not store them for later display. I’m not sure how often that comes up, but it’s a valid use case.
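
As a rough sketch of the replace-by-codes approach the excerpt mentions (the variable names are mine, and this is not necessarily the technique Slava ends up with), escaping & first and then < and > lets the sample sentence convert to XML, and value() hands the original characters back:

```sql
-- Hypothetical variable names; the sentence comes from the excerpt above.
DECLARE @s nvarchar(200) =
    N'Anne & Robin collect > "berries" than Jane & Kevin, but < than Ivan & Lucy.';

-- Replace & first, so the ampersands in &lt; and &gt; are not escaped a second time.
DECLARE @escaped nvarchar(400) =
    REPLACE(REPLACE(REPLACE(@s, N'&', N'&amp;'), N'<', N'&lt;'), N'>', N'&gt;');

-- The escaped string now converts cleanly to the xml type.
DECLARE @x xml = CAST(@escaped AS xml);

-- value() decodes the entities and returns the characters as originally written.
SELECT @escaped AS EscapedText,
       @x.value('.', 'nvarchar(200)') AS OriginalText;
```

The order of the REPLACE calls matters: handling & last would turn &lt; into &amp;lt;.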

Fastest Way to Delete Lots of Rows in SQL Server

Published 2019-12-05 by Kevin Feasel

Aaron Bertrand tries out a few methods to delete data and looks at how SQL Server configuration settings affect the calculus:

That took far longer than I’m comfortable admitting. Part of that was because I had originally included a 0.1% test for rowperloop which, in some cases, took several hours. So I removed those from the table a few days in, and can easily say: if you are removing 1,000,000 rows, deleting 1,000 rows at a time is highly unlikely to be an optimal choice, regardless of any other variables

I think Aaron lays out the caveats pretty well, but I’d reiterate that the main benefit behind chunking delete operations is not so much to make things faster as to reduce the amount of time you spend blocking more important work, like user queries, and to reduce the risk of blowing out the transaction log file (and maybe running out of disk space too).
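
For reference, a chunked delete usually looks something like the loop below. This is a minimal sketch rather than Aaron's test harness: dbo.SalesHistory, the date filter, and the batch size are all made up, and the right batch size is something to measure on your own system.

```sql
-- Hypothetical table, filter, and batch size; tune the batch size by testing.
DECLARE @BatchSize int = 100000;
DECLARE @RowsDeleted int = 1;

WHILE @RowsDeleted > 0
BEGIN
    -- Each iteration is its own short autocommit transaction,
    -- so locks are released between batches.
    DELETE TOP (@BatchSize)
    FROM dbo.SalesHistory
    WHERE OrderDate < '20150101';

    SET @RowsDeleted = @@ROWCOUNT;

    -- Under the SIMPLE recovery model, a CHECKPOINT here can let log space be reused;
    -- under FULL recovery, periodic log backups play that role instead.
    -- CHECKPOINT;
END;
```

The loop stops as soon as a batch deletes zero rows.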

Partition Switching to Make Table Changes

Published 2019-11-25 by Kevin Feasel

Daniel Hutmacher shows a couple things you can change with near-zero downtime using partition switching:

Look, I’m not saying that you’re the type that would make a change in production while users are working.
But suppose that you would want to add an identity column to dbo.Demo, and change the clustered index to include that identity column, and make the index unique? Because it’s the table’s clustered index, you’re effectively talking about rebuilding the table (remember, the clustered index is the table), which involves reorganizing all of the rows into a new b-tree structure. While SQL Server is busy doing that, nobody will be able to read the contents of the table.

Daniel mentions a read-only table, though you could also do this with a read-write table as long as you have triggers to keep the two tables in sync until go time. That adds to the complexity, but it is an option if you need it.
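
To give a feel for the mechanics, here is a small sketch of switching a table into a copy whose key column carries the IDENTITY property. The table names are hypothetical and these are not Daniel's scripts; the sketch assumes both tables live on the same filegroup and relies on SWITCH comparing columns, types, nullability, and indexes without comparing the IDENTITY property.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE dbo.SwitchSource
(
    Id int NOT NULL,
    Payload varchar(50) NOT NULL,
    CONSTRAINT PK_SwitchSource PRIMARY KEY CLUSTERED (Id)
);

-- Same columns, types, nullability, and clustered index, plus the IDENTITY property.
CREATE TABLE dbo.SwitchTarget
(
    Id int IDENTITY(1, 1) NOT NULL,
    Payload varchar(50) NOT NULL,
    CONSTRAINT PK_SwitchTarget PRIMARY KEY CLUSTERED (Id)
);

INSERT INTO dbo.SwitchSource (Id, Payload)
VALUES (1, 'first'), (2, 'second'), (3, 'third');

-- Metadata-only operation: the rows change owner without being rewritten.
-- The target must be empty and on the same filegroup as the source.
ALTER TABLE dbo.SwitchSource SWITCH TO dbo.SwitchTarget;

-- Bring the identity counter in line with the rows that just arrived.
DBCC CHECKIDENT ('dbo.SwitchTarget', RESEED);

SELECT Id, Payload FROM dbo.SwitchTarget;
```

After the switch, dbo.SwitchSource is empty and can be dropped or renamed out of the way.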

Character Generation with T-SQL

Published 2019-11-20 by Kevin Feasel

Bill Fellows shows off what you can do with character generation in T-SQL:

The mod or modulus function will return the remainder after division. Modding a value is a handy way to constrain a value between 0 and an upper threshold. In this case, if I modded any number by 26 (because there are 26 characters in the English alphabet), I’ll get 0 to 25 as my result.
Knowing that the modulus function will give me 0 to 25 and knowing that my target character range starts at 65, I could use the previous expression to print any number’s ascii value like SELECT CHAR((2147483625 % 26) + 65) AS StillB;. Break that apart, we do the modulus, %, which gives us the value of 1 which we then add to the starting offset (65).

He also provides a DB Fiddle for it.
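
The arithmetic from the excerpt is easy to try against a handful of values. The VALUES list below is my own choice of inputs, not Bill's; the last one is the number from his example.

```sql
-- Map arbitrary non-negative integers onto the letters A through Z.
SELECT v.n,
       v.n % 26 AS Remainder,               -- always between 0 and 25
       CHAR((v.n % 26) + 65) AS Letter      -- 65 is the ASCII code for 'A'
FROM (VALUES (0), (1), (25), (26), (27), (2147483625)) AS v(n)
ORDER BY v.n;
-- Letters, in order of n: A, B, Z, A, B, B
```

Note that 26 wraps back around to A, which is the point of the modulus.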

Fun with CHAR(0)

Published 2019-11-18 by Kevin Feasel

Kenneth Fisher learns a bit about the 0 byte:

Ok, now things are getting interesting. An ASCII value of 0? I’ve never heard of that. I honestly didn’t know it was possible. As it happens, yes, it’s a real value and in SSMS it does a few strange things.

In the comments, Denis Gobo is right: the 0 byte is the null terminator, which in C-style strings appears at the end of a variable-length string to indicate that there’s nothing more to read there.
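
A quick way to see that the 0 byte really is stored, whatever a client tool chooses to display, is to look at the byte count and the raw bytes. This is a small sketch with a made-up variable, not Kenneth's demo:

```sql
-- Hypothetical test string with a 0 byte in the middle.
DECLARE @s varchar(10) = 'abc' + CHAR(0) + 'def';

SELECT ASCII(CHAR(0)) AS ZeroByteCode,              -- 0
       DATALENGTH(@s) AS ByteCount,                 -- 7: all seven bytes are stored
       CAST(@s AS varbinary(10)) AS RawBytes;       -- 0x61626300646566
```

How the string itself renders in a results grid or text output depends on the client, which is where the strange behavior Kenneth describes comes in.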

View Filters and Short-Circuiting

Published 2019-11-18 by Kevin Feasel

Reitse Eskens takes us through a fun oddity with short-circuiting and views:

The question from my coworker was simple. Why is this happening? Because he’s selecting from the view, his instinct is that the returned result set should be filtered within the view first and that the resultset can be narrowed down further with the regular query.

It’s interesting that there’s deterministic behavior both ways. My recollection is that ANSI SQL does not honor short-circuiting, as all filters are considered to happen at the same time, and thus any ordering is valid. But in practice, there are places where different code bases end up with stable short-circuiting as an implementation detail.
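
Here is a generic illustration of why relying on filter order is risky. It is my own example rather than Reitse's scenario, and all object names are hypothetical. The view filters out non-numeric strings, but nothing guarantees that its filter runs before the outer query's CAST.

```sql
-- Hypothetical table and view for illustration only.
CREATE TABLE dbo.MixedValues (Val varchar(10) NOT NULL);
INSERT INTO dbo.MixedValues (Val) VALUES ('1'), ('2'), ('abc');
GO
CREATE VIEW dbo.NumericValues
AS
SELECT Val
FROM dbo.MixedValues
WHERE Val NOT LIKE '%[^0-9]%';   -- keep strings made up of digits only
GO
-- Risky: the optimizer is free to evaluate the CAST before the view's filter,
-- in which case 'abc' raises a conversion error. Whether it does depends on the plan.
SELECT Val
FROM dbo.NumericValues
WHERE CAST(Val AS int) > 1;
GO
-- Safer: TRY_CAST returns NULL instead of failing, so evaluation order no longer matters.
SELECT Val
FROM dbo.NumericValues
WHERE TRY_CAST(Val AS int) > 1;
```

TRY_CAST requires SQL Server 2012 or later.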

Capturing Inserts and Updates in MERGE Statements

Published 2019-11-11 by Kevin Feasel

The Purple Frog folks show us how to collect the counts of insert and update operations when using MERGE statements:

This post shows how you can capture and store the number of records inserted, updated or deleted from a T-SQL Merge statement.
This is in response to a question on an earlier post about using Merge to load SCDs in a Data Warehouse.
You can achieve this by using the OUTPUT clause of a merge statement, including the $Action column that OUTPUT returns.

Read on for the answer. If only MERGE weren’t so riddled with problems.
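
The general shape of the technique is to send $action into a table variable and then count by action. The sketch below uses made-up temp tables and data, not the Purple Frog example:

```sql
-- Hypothetical temp tables and sample data.
CREATE TABLE #Target (Id int PRIMARY KEY, Name varchar(20) NOT NULL);
CREATE TABLE #Source (Id int PRIMARY KEY, Name varchar(20) NOT NULL);

INSERT INTO #Target (Id, Name) VALUES (1, 'old'), (2, 'same'), (4, 'gone');
INSERT INTO #Source (Id, Name) VALUES (1, 'new'), (2, 'same'), (3, 'added');

-- $action is nvarchar(10) and holds 'INSERT', 'UPDATE', or 'DELETE' per affected row.
DECLARE @Actions TABLE (MergeAction nvarchar(10) NOT NULL);

MERGE #Target AS t
USING #Source AS s
    ON t.Id = s.Id
WHEN MATCHED AND t.Name <> s.Name THEN
    UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, Name) VALUES (s.Id, s.Name)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
OUTPUT $action INTO @Actions (MergeAction);

-- One row per action type; with this data, each of the three actions has a count of 1.
SELECT MergeAction, COUNT(*) AS RowsAffected
FROM @Actions
GROUP BY MergeAction;
```

Row 2 matches but is unchanged, so it produces no output row at all.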

Understanding PERCENTILE_CONT

Published 2019-10-21 by Kevin Feasel

Kathi Kellenberger takes us through the PERCENTILE_CONT window function:

I was recently playing with the analytical group of windowing functions, and I wanted to understand how they worked “under the covers.” I ran into a little logic puzzle with PERCENTILE_CONT by trying to write a query that returned the same results using pre-2012 functionality.
Given a list of ranked values, you can use the PERCENTILE_CONT function to find the value at a specific percentile. For example, if you have the grades of 100 students, you can use PERCENTILE_CONT to locate the score in the middle of the list, the median, or at some other percent such as the grade at 90%. This doesn’t mean that the score was 90%; it means that the position of the score was at the 90th percentile. If there is not a value at the exact location, PERCENTILE_CONT interpolates the answer.

I’m a bit disappointed with how poorly PERCENTILE_CONT performs against large data sets, especially if you need multiple percentiles. It’s bad enough that going into ML Services and getting percentiles with R is usually faster for me. But for datasets of less than 100K or so rows, it’s the easiest non-CLR method to get the median (with the easiest CLR method being SQL#).
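
For a tiny worked example of the interpolation, take four made-up scores. With N rows, the target position for percentile P is 1 + P * (N - 1), and the result is interpolated between the two neighboring values.

```sql
-- Hypothetical scores: 10, 20, 30, 40.
SELECT DISTINCT
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY v.Score) OVER () AS Median,
       PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY v.Score) OVER () AS P90
FROM (VALUES (10), (20), (30), (40)) AS v(Score);
-- Median: position 1 + 0.5 * 3 = 2.5, halfway between 20 and 30, so 25
-- P90:    position 1 + 0.9 * 3 = 3.7, so 30 + 0.7 * (40 - 30) = 37
```

The DISTINCT is there because PERCENTILE_CONT is a window function in SQL Server and would otherwise return the same values once per input row.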

Forcing Indexed View Usage

Published 2019-10-11 by Kevin Feasel

Randolph West explains how you can force SQL Server to use an indexed view rather than its underlying tables:

The problem was that the plan was doing a table scan on the underlying table, and not using the indexes I had carefully crafted.
I had even created a second non-clustered index to try and make sure it was a proper covering index for the query.

Read on to see how Randolph solved the problem.
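
For context, one documented way to make a query read from an indexed view directly is the NOEXPAND table hint. The sketch below uses hypothetical objects and is not necessarily how Randolph's post resolves things; it assumes the session SET options that indexed views require, which are the defaults in SSMS.

```sql
-- Hypothetical base table.
CREATE TABLE dbo.Sales
(
    SaleId int NOT NULL PRIMARY KEY,
    CustomerId int NOT NULL,
    Amount money NOT NULL
);
GO
-- An indexed view needs SCHEMABINDING, two-part names, and COUNT_BIG(*) when it uses GROUP BY.
CREATE VIEW dbo.vSalesByCustomer
WITH SCHEMABINDING
AS
SELECT CustomerId,
       SUM(Amount) AS TotalAmount,
       COUNT_BIG(*) AS SaleCount
FROM dbo.Sales
GROUP BY CustomerId;
GO
-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX CX_vSalesByCustomer
    ON dbo.vSalesByCustomer (CustomerId);
GO
-- NOEXPAND tells the optimizer to treat the view like a table
-- instead of expanding it into a query against dbo.Sales.
SELECT CustomerId, TotalAmount
FROM dbo.vSalesByCustomer WITH (NOEXPAND)
WHERE CustomerId = 42;
```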

Overlooked T-SQL Functions

Published 2019-10-11 by Kevin Feasel

Itzik Ben-Gan covers some underutilized functions and function overloads in T-SQL:

TRIM is more than LTRIM(RTRIM())
SQL Server 2017 introduced support for the function TRIM. Many people, myself included, initially just assume that it’s no more than a simple shortcut to LTRIM(RTRIM(input)). However, if you check the documentation, you realize that it’s actually more powerful than that.

This article is an excellent argument in favor of reading the documentation, as all of it is in there but it’s easy to miss.
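
As a small illustration of the point about TRIM (my own example string, not one from Itzik's article), the optional character list lets it strip more than spaces. This needs SQL Server 2017 or later.

```sql
SELECT LTRIM(RTRIM('  ..Hello, world!!  ')) AS SpacesOnly,     -- '..Hello, world!!'
       TRIM('  ..Hello, world!!  ') AS DefaultTrim,            -- '..Hello, world!!'
       TRIM('.! ' FROM '  ..Hello, world!!  ') AS CustomSet;   -- 'Hello, world'
```

The characters in the list are treated as a set, so any mix of periods, exclamation marks, and spaces is removed from both ends.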
