Press "Enter" to skip to content

Category: Syntax

The Halloween Problem, Continued

Paul White continues a series on the Halloween Problem. Part 2 looks at insert and delete statements:

In the first part of this series, we saw how the Halloween Problem applies to UPDATE queries. To recap briefly, the problem was that an index used to locate records to update had its keys modified by the update operation itself (another good reason to use included columns in an index rather than extending the keys). The query optimizer introduced an Eager Table Spool operator to separate the reading and writing sides of the execution plan to avoid the problem. In this post, we will see how the same underlying issue can affect INSERT and DELETE statements.

Part 3 looks at the train wreck MERGE operator:

In the right circumstances, the SQL Server optimizer can recognize that the MERGE statement is hole-filling, which is just another way of saying that the statement only adds rows where there is an existing gap in the target table’s key.

For this optimization to be applied, the values used in the WHEN NOT MATCHED BY TARGET clause must exactly match the ON part of the USING clause. Also, the target table must have a unique key (a requirement satisfied by the PRIMARY KEY in the present case).

Where these requirements are met, the MERGE statement does not require protection from the Halloween Problem.

If only it weren’t busted in so many other ways!

Part 4 wraps up the series:

The SQL Server optimizer has specific features that allow it to reason about the level of Halloween Protection (HP) required at each point in the query plan, and the detailed effect each operator has. These extra features are incorporated into the same property framework the optimizer uses to keep track of hundreds of other important bits of information during its search activities.

Each operator has a required HP property and a delivered HP property. The required property indicates the level of HP needed at that point in the tree for correct results. The delivered property reflects the HP provided by the current operator and the cumulative HP effects provided by its subtree.

This last one goes into some nice detail.

Comments closed

Performance and T-SQL’s CHOOSE

Grant Frichey answers a question:

Questions absolutely drive my blog content and I really liked this one: how does the T-SQL CHOOSE command affect performance.

On the face of it, I honestly don’t think it will affect performance at all, depending on where and how you use it. However, the answer is always best supplied by testing.

Grant’s post ended up being much more interesting than I had anticipated—my anticipated answer was “It’s a CASE statement so it behaves like a CASE statement.” But there is some nuance that I’ve left out.

Comments closed

MERGE in Many Languages

Lukas Eder takes a look at the MERGE statement in SQL:

A few dialects support MERGE. Among the ones that jOOQ 3.13 supports, there are at least:

– Db2
– Derby
– Firebird
– H2
– HSQLDB
– Oracle
– SQL Server
– Sybase SQL Anywhere
– Teradata
– Vertica

For once, regrettably, this list does not include PostgreSQL. But even the dialects in this list do not all agree on what MERGE really is. The SQL standard specifies 3 features, each one optional:

– F312 MERGE statement
– F313 Enhanced MERGE statement
– F314 MERGE statement with DELETE branch

But instead of looking at the standards and what they require, let’s look at what the dialects offer, and how it can be emulated if something is not available.

This is a really cool overview of an area where several vendors can claim support, but that support can mean quite different things. The one caveat is, I don’t know if any of the other platforms’ MERGE operators are as busted as SQL Server’s in terms of bugs.

Comments closed

All About Table Expressions

Itzik Ben-Gan has started a series on table expressions:

Perhaps this will come as a surprise to some, but I actually do find the use of the term table in common table expression as very appropriate. In fact, I find the use of the term table expression as appropriate. To me, the best way to describe what a CTE is in T-SQL, it’s a named table expression. The same applies to what T-SQL calls derived tables (the specific language construct as opposed to the general idea), views and inline TVFs. They are all named table expressions.

If you can bear with me a bit, I’ll provide the reasoning for my view of things in this article. It occurred to me that both the naming confusion, and the confusion around whether there’s a persistency aspect to table expressions, can be cleared with a better understanding of the fundamentals of our field of relational database management systems. Those fundamentals being, relational theory, how SQL (the standard language) relates to it, and how the T-SQL dialect used in the SQL Server and Azure SQL Database implementations relates to both.

There’s a lot of depth in this post, so I recommend a careful reading.

Comments closed

Getting a Substring with DAX

Reza Rad shows us how to build out a substring using DAX:

Substring is one of the most common functions in many languages, However, there is no function named Substring DAX. There is a very simple way of doing it, which I am going to explain in this post. Substring means getting part of a string, for example from “Reza Rad”, if I want to get the start starting from index 2, for 4 characters, it should return “za R”. Considering that the first character is index 0. Let’s see how this is possible.

The answer’s not as pretty as a SUBSTRING() function would be, but it’s also not too far off.

Comments closed

Trimming Strings with T-SQL

Andy Levy saves us all several characters at a time:

Every now and then, we encounter data that needs to be cleaned up because it’s got leading and/or trailing spaces. Or maybe you’re storing short data in a CHAR(N) field, so when you query it, you’re getting trailing spaces. For time immemorial, we’ve had to wrap these fields in rtrim(ltrim(fieldname)) to do the deed.

Effective with SQL Server 2017, that’s no longer the case. 

The eight keystrokes add up over time. In all seriousness, I am happy that TRIM() is a thing in SQL Server 2017. And Andy gives us a little bonus to make it worth your refactoring while.

Comments closed

Supporting Multiple NULL Values with a Unique Constraint

Itzik Ben-Gan walks us through a workaround in T-SQL:

Suppose that you have a database in a platform that supports the standard unique constraint and you need to migrate that database to SQL Server. You may face issues with the enforcement of unique constraints in SQL Server if the unique columns support NULLs. Data that was considered valid in the source system may be considered invalid in SQL Server. In the following sections I’ll explore a number of possible workarounds in SQL Server.

I use a simplified version of this as an interview question, so it’s nice to see an entire article from Itzik on the topic, including a couple solutions way outside the box.

Comments closed

From IDENTITY to Sequences

Andy Levy recommends checking out sequences:

The SEQUENCE object eliminates all of this. It gives you a simple way to just grab a new number and increment for the next caller. It’s very fast as there’s no visible table I/O, and it’s unaffected by rollbacks.

And it’s so much easier to use! You just ask for the next number in the sequence!

I like sequences, though the interesting thing is that 2020 me has created them a lot less frequently than 2012 me was sure I would. I’m glad they’re in the product, however.

One thing I should point out is that sequences are like identity columns in that you can have gaps due to user behavior, such as rolling back transactions. If you absolutely need gap-free sets of numbers, you’re back to Andy’s Method One, except everything has to be serializable and wrapped in explicit transactions.

Comments closed