Category: T-SQL

Building Queues in the Database

Published 2020-05-14 by Kevin Feasel

Erik Darling has created a series on queue tables in the database. Part one builds out a table:

A little explanation of the table: this is good for a reusable, ordered queue, that’ll give us some feedback on how long things took. It could also be used is there were a pending element, but we’d probably wanna change the indexing so that we could find either the last start time, or the last end time efficiently.
The thing that probably needs the most explanation here is the indexing and constraints. Because I know you, and you’re staring at the way I have my primary key, and you’re getting an itch and a twitch. That’s okay, it’s you’ve been conditioned for years to put the most selective column first. Unfortunately, that won’t work here.

Part two takes us through querying the queue:

For the table, our main priorities were indexing to make sure we can find work easily, and not allowing duplicate items.
For our worker proc, the main goals are going to be
– Looping until we run out of work
– Finding and reserving work with minimal locking
– Making that process atomic
Thankfully, this is a lot easier than it sounds. The table design does a lot of the work for us.

Erik’s design is not one I typically reach for, though my constraints are a bit different from his—typically, I’m using queue tables to run on periodic schedules and grab batches of records which finish processing before the next timed batch begins, and processes are idempotent, so if a queued item re-runs on occasion, it’s okay. But this is a really good technique if you need a more robust solution.

Comments closed

Never Forget a WHERE Clause Again

Published 2020-05-14 by Kevin Feasel

If you frequently forget to include the WHERE clause when you run queries, Rob Farley has an answer for you:

You meant to only update a few rows, but you weren’t paying attention to what you’d selected before hitting the Execute button. The WHERE clause got missed from what you executed, and you’ve updated every row. Oops.
Now, I totally hear you that you could’ve put this in a transaction, checked your results and then only done a COMMIT when you were sure. You could even have put that COMMIT inside an IF block that checked @@ROWCOUNT… yes, yes, yes. You could have used an SSMS add-in to let you execute the whole query that you’re currently on. You could’ve used a different tool entirely, such as Azure Data Studio. There are plenty of ways to avoid this. But I’m going to show you something that I find works for me, especially when I’m giving the script to someone else to run.

It’s more effort and makes the code harder to read, but if you have a real concern about somebody (and that somebody can include you) goofing this up, it does the job.

Comments closed

The Value of Negative Identity Values

Published 2020-05-07 by Kevin Feasel

Randolph West explains why you might start at MIN(INT) for an identity integer column:

A quick(er) post this week, in response to Greg Low’s blog post from a few weeks ago titled “Don’t start identity columns or sequences with large negative values.”
Greg writes that you shouldn’t use large negative values in a table, because… it’s hard to read them, I guess? And also they don’t compress well.
I disagree … to a degree. Dang, words are hard. Anyway, when I design a table I create what’s called a surrogate key as my primary key, which is a value that is intended for the table to uniquely identify a row so that it participates in relational activities like joins and foreign keys in an efficient way. In other words the identity column is not for me, it’s for the database engine. I don’t need to worry about what row a value has. I choose the data type for that identity column based on the estimated number of rows, not whether I can memorize that a [StatusID] of 5 means something. Magic numbers are bad, mmmkay?

I don’t mind using negative values, especially for things like queue tables where the rows are ephemeral. The identity values may be harder to read, but as Randolph points out, in those types of cases, you aren’t really reading the values anyhow.

Comments closed

XML and JSON Creation in SQL Server vs Elsewhere

Published 2020-05-07 by Kevin Feasel

Lukas Eder walks us through converting result sets to XML and JSON using different platforms:

SQL Server supports transforming flat tabular SQL result sets into hierarchical structures by convention using the convenient FOR XML or FOR JSON syntaxes. This is really convenient and less verbose than the standard SQL/XML or SQL/JSON APIs – although the standard ones are more powerful.
In this blog post, I’d like to show a few core features of the SQL Server syntax, and what they correspond to in standard SQL. jOOQ 3.14 will support both SQL Server’s syntax and the standard syntax, and will be able to translate from one to the other, such that you can use SQL Server syntax also on Db2, MariaDB, MySQL, Oracle, PostgreSQL. You can play around with the current state of development on our website here.

Click through for a series of examples.

Comments closed

WAITFOR TIME Instead of DELAY

Published 2020-05-07 by Kevin Feasel

Denis Gobo points out that there is a second option when using WAITFOR:

I was looking at some code I wrote the other day and noticed the WAITFOR command.. This got me thinking. How many times have I used WAITFOR in code, probably as much as I have used NTILE 🙂
I looked at the documentation for WAITFOR and notice there is TIME in addition to DELAY. Oh that is handy, I always rolled my own ghetto-style version by calculating how long it would be until a specific time and then I would use that in the WAITFOR DELAY command

Read on to see why you might use WAITFOR and how to use each option. I use it a bit more frequently than Denis, though I still need to run a test each time to get the syntax right…

Comments closed

Issues with Window Functions and Views / Derived Tables

Published 2020-05-06 by Kevin Feasel

Paul White takes us through some of the difficulties you are liable to see when using window functions in views and derived tables:

Our expectation is that the execution plan for this new query will be exactly the same as before we created the view. The query optimizer should be able to push the product filter specified in the WHERE clause down into the view, resulting in an index seek.
We need to stop and think a bit at this point, however. The query optimizer can only produce execution plans that are guaranteed to produce the same results as the logical query specification. Is it safe to push our WHERE clause into the view?

Read the whole thing.

Comments closed

Extracting Letters or Numbers from a String

Published 2020-05-05 by Kevin Feasel

Erik Darling has a T-SQL solution (using tally tables) for extracting letters or numbers from a string:

My solutions use a numbers table. You’re free to try replacing that aspect of them with an inlined version like Jeff Moden uses in his string splitter, but I found the numbers table approach faster. Granted, it’s also less portable, but that’s a trade-off I’m willing to make.
What I don’t like about either solution is that I have to re-assemble the string using XML PATH. If you’ve got another way to do that, I’m all ears. I know 2017 has STRING_AGG, but that didn’t turn out much better, and it wouldn’t be usable in other supported versions.

I agree that the best route here is just to suck it up and use CLR, but if you’re going to live in a T-SQL-only world, this is probably the best method available.

Comments closed

The Halloween Problem, Continued

Published 2020-05-04 by Kevin Feasel

Paul White continues a series on the Halloween Problem. Part 2 looks at insert and delete statements:

In the first part of this series, we saw how the Halloween Problem applies to UPDATE queries. To recap briefly, the problem was that an index used to locate records to update had its keys modified by the update operation itself (another good reason to use included columns in an index rather than extending the keys). The query optimizer introduced an Eager Table Spool operator to separate the reading and writing sides of the execution plan to avoid the problem. In this post, we will see how the same underlying issue can affect INSERT and DELETE statements.

Part 3 looks at the train wreck MERGE operator:

In the right circumstances, the SQL Server optimizer can recognize that the MERGE statement is hole-filling, which is just another way of saying that the statement only adds rows where there is an existing gap in the target table’s key.
For this optimization to be applied, the values used in the WHEN NOT MATCHED BY TARGET clause must exactly match the ON part of the USING clause. Also, the target table must have a unique key (a requirement satisfied by the PRIMARY KEY in the present case).
Where these requirements are met, the MERGE statement does not require protection from the Halloween Problem.

If only it weren’t busted in so many other ways!

Part 4 wraps up the series:

The SQL Server optimizer has specific features that allow it to reason about the level of Halloween Protection (HP) required at each point in the query plan, and the detailed effect each operator has. These extra features are incorporated into the same property framework the optimizer uses to keep track of hundreds of other important bits of information during its search activities.
Each operator has a required HP property and a delivered HP property. The required property indicates the level of HP needed at that point in the tree for correct results. The delivered property reflects the HP provided by the current operator and the cumulative HP effects provided by its subtree.

This last one goes into some nice detail.

Comments closed

Foreign Keys and Non-Changing Updates

Published 2020-04-30 by Kevin Feasel

Brent Ozar has a warning for us:

If you update a row without actually changing its contents, does it still hurt?
Paul White wrote in detail about the impact of non-updating updates, proving that SQL Server works hard to avoid doing extra work where it can. That’s a great post, and you should read it.
But foreign keys add another level of complexity. If you have foreign keys, and you tell SQL Server that you’re updating the contents of those related columns, SQL Server will check ’em even when the data isn’t changing.

Click through for the demonstration. I don’t think I agree with Brent’s dichotomy as laid out at the end of the post—the back-and-forth about removing keys would only make sense if you’re on the edge of the database equivalent of the production possibility frontier and expecting to move well beyond that point very soon. I’m not sure how well that describes the average company, but it’s a side quibble.

Comments closed

Performance and T-SQL’s CHOOSE

Published 2020-04-28 by Kevin Feasel

Grant Frichey answers a question:

Questions absolutely drive my blog content and I really liked this one: how does the T-SQL CHOOSE command affect performance.
On the face of it, I honestly don’t think it will affect performance at all, depending on where and how you use it. However, the answer is always best supplied by testing.

Grant’s post ended up being much more interesting than I had anticipated—my anticipated answer was “It’s a CASE statement so it behaves like a CASE statement.” But there is some nuance that I’ve left out.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31