Press "Enter" to skip to content

Category: Syntax

Optimizing Multiple CTEs

Itzik Ben-Gan continues a series on table expressions:

Last month I explained and demonstrated that CTEs get unnested, whereas temporary tables and table variables actually persist data. I provided recommendations in terms of when it makes sense to use CTEs versus when it makes sense to use temporary objects from a query performance standpoint. But there’s another important aspect of CTE optimization, or physical processing, to consider beyond the solution’s performance—how multiple references to the CTE from an outer query are handled. It’s important to realize that if you have an outer query with multiple references to the same CTE, each gets unnested separately. If you have nondeterministic calculations in the CTE’s inner query, those calculations can have different results in the different references.

Say for instance that you invoke the SYSDATETIME function in a CTE’s inner query, creating a result column called dt. Generally, assuming no change in the inputs, a built-in function is evaluated once per query and reference, irrespective of the number of rows involved. If you refer to the CTE only once from an outer query, but interact with the dt column multiple times, all references are supposed to represent the same function evaluation and return the same values. However, if you refer to the CTE multiple times in the outer query, be it with multiple subqueries referring to the CTE or a join between multiple instances of the same CTE (say aliased as C1 and C2), the references to C1.dt and C2.dt represent different evaluations of the underlying expression and could result in different values.

Definitely worth the read.

Comments closed

Retrieving Text Between Delimiters

Erik Darling takes us through the seedy underbelly of T-SQL:

I have to do this a fair amount, and I always go look at this little cheat sheet that I have.

Then it occurred to me that it might be worth sharing the details here, in case anyone else runs into the same need.

The way I learned to do it is with SUBSTRING and CHARINDEX, which is a pretty common method.

That this is possible is great, but it’d be nice to have an easier approach. Thinking through that easier approach is outside the scope of this post…

Comments closed

More Fun with NULL

Chris Johnson troubleshoots an issue in code:

The poster had a CASE statement and was wondering why it didn’t work as expected. Essentially they were doing something like:

CASE WHEN @a = @b OR (@a IS NULL AND @b IS NULL) THEN 1 ELSE 0
CASE WHEN NOT(@a = @b OR (@a IS NULL AND @b IS NULL)) THEN 1 ELSE 0

And they wanted to know why both were returning 0 when @a or @b were set to NULL. The issue here is that any normal predicate involving NULL returns an unknown. They had tried to compensate with the OR, which got them the result they wanted in the first statement, but didn’t understand why it did that.

Click through for the explanation.

Comments closed

ODBC Scalar Functions

Shane O’Neill discovers ODBC scalar functions:

Can you imagine my shock when I came across a piece of code that not only was not for finding and replacing but even though I did not think it would compile, it did!

If you can imagine my shock, then you’re going to need to increase it more when I tell you that there are a whole family of the same functions!
Here is the code that threw me for a loop the first time I saw it.

SELECT {d '1970-01-01'};

I wasn’t familiar with this syntax either, but if you work heavily with multiple data sources, it can be quite useful—for example, Teradata and DB/2 support them, as well as Shane’s examples of SQL Server and Oracle.

Comments closed

The Logic of Common Tale Expressions

Itzik Ben-Gan dives into common table expressions:

The source of the term aside, common table expression, or CTE, is the commonly used term by T-SQL practitioners for the structure that is the focus of this article. So first, let’s address whether it is an appropriate term. We already concluded that the term table expression is appropriate for an expression that conceptually returns a table. Derived tables, CTEs, views and inline table valued functions are all types of named table expressions that T-SQL supports. So, the table expression part of common table expression certainly seems appropriate. As for the common part of the term, it probably has to do with one of the design advantages of CTEs over derived tables. Remember that you cannot reuse the derived table name (or more accurately the range variable name) more than once in the outer query. Conversely, the CTE name can be used multiple times in the outer query. In other words, the CTE name is common to the outer query. Of course, I’ll demonstrate this design aspect in this article.

CTEs give you similar benefits to derived tables, including enabling the development of modular solutions, reusing column aliases, indirectly interacting with window functions in clauses that don’t normally allow them, supporting modifications that indirectly rely on TOP or OFFSET FETCH with order specification, and others. But there are certain design advantages compared to derived tables, which I’ll cover in detail after I provide the syntax for the structure.

Click through for a lot of great detail. On the question of derived tables versus common table expressions, my mental taxonomy is basically APPLY > CTE > derived table, but that’s in a context-free discussion. In practice, all three are useful and the best question to answer is “Which thing helps future developers understand best my intent?”

Comments closed

Simulating DATESERIAL in SQL Server

Madhivanan has nearly a dozen methods to replicate the functionality of the old DateSerial function in VB:

DateSerial function accepts three paramter values year,month and day and return a valid date value with time set to midnight. SQL Server does not support this function until version 2012 is released.

There can be many ways to simulate this functionality. Here are 10 different ways

I’d be pretty upset if some of those were in my code base, but this shows that there are plenty of ways to get to the same result. Just, uh, don’t use all of them.

Comments closed

The Limits of LEN (or REPLICATE)

Pamela Mooney takes us through a quandry:

I was using LEN() to troubleshoot an issue I was having with a dynamically constructed string truncating while inserting into an NVARCHAR(MAX) column.  Since I know that NVARCHAR(MAX) has a 2 GB limit (goodness only knows how many characters that is!),  I couldn’t explain the truncation.  A colleague suggested doing a test with another dynamically constructed string.  Maybe then, I could find where the cutoff was occurring.

Great idea!

So, I came up with a plan.

Click through for the plan, but be sure to read Pamela’s comment at the bottom as there’s a bit more to the story.

Comments closed

Areas of Improvement for DROP TABLE

Michael J. Swart points out a few foibles about the DROP TABLE syntax:

I was looking at the docs for DROP TABLE and I noticed this in the syntax: [ ,...n ]. I never realized that you can drop more than one table in a statement.

I think that’s great. When dropping tables one at a time. You always had to be careful about order when foreign keys were involved. Alas, you still have to care about order.

That is a shame. Michael also includes a few other places where DROP TABLE could be made better, so check it out.

Comments closed

Iterating over JSON and XML Data in SQL Server

Steve Stedman explains how you can iterate through XML and JSON data using the APPLY operator:

The results are what we are looking for in this specific example, but where they break down is when there are more employees represented in the XML, for each employee we need to add another UNION to bring the results together. That is not very iterative and since the title of this post includes the word iterating, we need to focus on how to do that.

Now we introduce the CROSS APPLY functionality that can be used like a JOIN to take a value from one result set (table) and apply it to a function that gets called once for each row. You can reference my JOIN TYPES poster for using CROSS APPLY

Click through for the full set of examples.

Comments closed

Ambiguous Columns in Queries when Using One Table

Dave Bland shows how easy it is to get the “Ambiguous column name” error message when querying from a single table:

When I added the “*”, this is where I received unexpected results.  All I did was add the “*”.  Looking at the code below, you can see SQL Server is having issues with the Name column in the ORDER BY.

I do wish SQL had a symbol representing “everything else,” where the engine of choice would include all columns except those explicitly named. I know there’d be trickiness around things like “LTRIM(ColumnA) AS TrimmedColumnA” but that’d be for the language designers to figure out…

Comments closed