Category: T-SQL

Short Substrings and Computed Columns

Published 2019-12-31 by Kevin Feasel

Erik Darling gives us a story about computed columns that turns out not to be about computed columns at all (having thereby subverted our expectations):

The problem is that when I tried to index it:
CREATE INDEX dummy ON dbo.Users(DisplayNameComputed);
I got this error:
Msg 537, Level 16, State 3, Line 21 Invalid length parameter passed to the LEFT or SUBSTRING function.
And when I tried to select data from the table, the same error.

Click through to find the real query killer.

Comments closed

Deferred Compilation and Compatibility Level 140 Query Hints

Published 2019-12-23 by Kevin Feasel

Milos Radivojevic shows that table-valued parameters do not care about your QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140 settings:

The plan is created under CL 140, but the estimation number of rows for the table variable is not 1 but the actual one. Even if you would specify the FORCE_LEGACY_CARDINALITY_ESTIMATION hint, the query will be deferred compiled and behavior of table variable would be the same.
Table variable deferred compilation respects settings at the database scope and ignores all hints except one.

Click through to learn what that one hint is.

Comments closed

Compression and Decompression with SQL Server

Published 2019-12-19 by Kevin Feasel

Randolph West asks a pair of questions:

On Twitter recently, I asked:
Does anyone I know use the COMPRESS and DECOMPRESS features in T-SQL?
To those who replied in the affirmative, I asked:
What made you decide on this as opposed to ROW or PAGE compression?

Read on to learn the responses. In my case, I’ve seen COMPRESS and DECOMPRESS used in two places. First, compressing large product descriptions (large enough to go to LOB). Second, I use it to compress binary models created via SQL Server Machine Learning Services. Some of those models compress quite nicely.

Comments closed

More with TOP and Blocking Operators

Published 2019-12-16 by Kevin Feasel

Jared Poche continues an investigation into the TOP operator:

I’ve explained what a blocking operator is and provided a few examples, but maybe this doesn’t seem important. It’s affecting the TOP operator, sure, but don’t people just use this to look at the TOP 1000 rows of their tables in SSMS?
The TOP operator is useful for many operations, especially in a large environment. Large operation can timeout or fail for a variety of reasons, consuming resources without providing the results you need. A small, batch-sized operation is more likely to succeed and tends to perform more consistently. Many maintenance operations make sense to run with a TOP operator, so we should make sure those operations aren’t stymied by blocking operators.

Read on for several examples.

Comments closed

Fun with NULL

Published 2019-12-13 by Kevin Feasel

Itzik Ben-Gan takes us through some of the complexities of NULL:

NULL handling is one of the trickier aspects of data modeling and data manipulation with SQL. Let’s start with the fact that an attempt to explain exactly what a NULL is is not trivial in and of itself. Even among people who do have a good grasp of relational theory and SQL, you will hear very strong opinions both in favor and against using NULLs in your database. Like them or not, as a database practitioner you often have to deal with them, and given that NULLs do add complexity to your SQL code writing, it’s a good idea to make it a priority to understand them well. This way you can avoid unnecessary bugs and pitfalls.
This article is the first in a series about NULL complexities. I start with coverage of what NULLs are and how they behave in comparisons. I then cover NULL treatment inconsistencies in different language elements. Finally, I cover missing standard features related to NULL handling in T-SQL and suggest alternatives that are available in T-SQL.

This is definitely worth the read.

Comments closed

Finding the Max Value Across Multiple Columns

Published 2019-12-12 by Kevin Feasel

Erik Darling shows a couple techniques for finding the maximum value across several columns, whether they’re in one table or in more than one:

It’s sorta kinda pretty crazy when every major database platform has something implemented, and SQL Server doesn’t.
Geez, even MySQL.
But a fairly common need in databases is to find the max value from two columns.
Maybe even across two tables.

Read on to see how you can do this.

Comments closed

Fun with the TOP Operator

Published 2019-12-11 by Kevin Feasel

Jared Poche takes a look at the TOP operator and learns a bit along the way:

Sort is a blocking operator. Don’t feel bad if you haven’t heard of the term; I’ve been working with SQL Server for 15 years, and I’m sure I never heard the term until the incomparable Grant Fritchley mentioned it while he was lecturing at my place of employment.
So sorts and several other types of operators (eager spools, remote query\scan\etc, hash match joins, and more) will block the normal flow and gather all their results before passing any rows on. The hash match join only blocks while building its hash table from the first input, before probing the second.

Read the whole thing. Jared is just getting started with blogging, too, so go pay his blog a visit.

Comments closed

Handling Forbidden XML Characters with SQL Server

Published 2019-12-06 by Kevin Feasel

Slava Murygin shows how we can use Unicode characters to make XML appear to display special characters:

That is very known issue that SQL Server’s XML does not accept characters “&”, “<” and “>”.
There are two more forbidden XML characters ” ‘ ” and ” ” ” (single and double quotes), but SQL Server mostly accept them.
The common solution is to replace these characters by their codes.
Would say we have a silly sentence: “Anne & Robin collect > “berries” than Jane & Kevin, but < than Ivan & Lucy.”

Slava’s post is specifically geared toward wanting to view the characters as-is, not store them for later display. I’m not sure how often that comes up, but it’s a valid use case.

Comments closed

Fastest Way to Delete Lots of Rows in SQL Server

Published 2019-12-05 by Kevin Feasel

Bertrand tries out a few methods to delete data and what SQL Server configuration settings do to this calculus:

That took far longer than I’m comfortable admitting. Part of that was because I had originally included a 0.1% test for rowperloop which, in some cases, took several hours. So I removed those from the table a few days in, and can easily say: if you are removing 1,000,000 rows, deleting 1,000 rows at a time is highly unlikely to be an optimal choice, regardless of any other variables

I think Aaron lays out the caveats pretty well, but I’d reiterate that the main benefit behind chunking delete operations is not so much to make things faster, but to reduce the amount of time you spend blocking more important work, like user queries. And reducing the risk of blowing out the transaction log file (and maybe running out of disk space too).

1 Comment

Partition Switching to Make Table Changes

Published 2019-11-25 by Kevin Feasel

Daniel Hutmacher shows a couple things you can change with near-zero downtime using partition switching:

Look, I’m not saying that you’re the type that would make a change in production while users are working.
But suppose that you would want to add an identity column to dbo.Demo, and change the clustered index to include that identity column, and make the index unique? Because it’s the table’s clustered index, you’re effectively talking about rebuilding the table (remember, the clustered index is the table), which involves reorganizing all of the rows into a new b-tree structure. While SQL Server is busy doing that, nobody will be able to read the contents of the table.

Daniel mentions a read-only table, though you could also do this with a read-write table as long as you have triggers to keep the two tables in sync until go time. That adds to the complexity, but it is an option if you need it.

Comments closed