Press "Enter" to skip to content

Category: Syntax

The Logic of Common Tale Expressions

Itzik Ben-Gan dives into common table expressions:

The source of the term aside, common table expression, or CTE, is the commonly used term by T-SQL practitioners for the structure that is the focus of this article. So first, let’s address whether it is an appropriate term. We already concluded that the term table expression is appropriate for an expression that conceptually returns a table. Derived tables, CTEs, views and inline table valued functions are all types of named table expressions that T-SQL supports. So, the table expression part of common table expression certainly seems appropriate. As for the common part of the term, it probably has to do with one of the design advantages of CTEs over derived tables. Remember that you cannot reuse the derived table name (or more accurately the range variable name) more than once in the outer query. Conversely, the CTE name can be used multiple times in the outer query. In other words, the CTE name is common to the outer query. Of course, I’ll demonstrate this design aspect in this article.

CTEs give you similar benefits to derived tables, including enabling the development of modular solutions, reusing column aliases, indirectly interacting with window functions in clauses that don’t normally allow them, supporting modifications that indirectly rely on TOP or OFFSET FETCH with order specification, and others. But there are certain design advantages compared to derived tables, which I’ll cover in detail after I provide the syntax for the structure.

Click through for a lot of great detail. On the question of derived tables versus common table expressions, my mental taxonomy is basically APPLY > CTE > derived table, but that’s in a context-free discussion. In practice, all three are useful and the best question to answer is “Which thing helps future developers understand best my intent?”

Comments closed

Simulating DATESERIAL in SQL Server

Madhivanan has nearly a dozen methods to replicate the functionality of the old DateSerial function in VB:

DateSerial function accepts three paramter values year,month and day and return a valid date value with time set to midnight. SQL Server does not support this function until version 2012 is released.

There can be many ways to simulate this functionality. Here are 10 different ways

I’d be pretty upset if some of those were in my code base, but this shows that there are plenty of ways to get to the same result. Just, uh, don’t use all of them.

Comments closed

The Limits of LEN (or REPLICATE)

Pamela Mooney takes us through a quandry:

I was using LEN() to troubleshoot an issue I was having with a dynamically constructed string truncating while inserting into an NVARCHAR(MAX) column.  Since I know that NVARCHAR(MAX) has a 2 GB limit (goodness only knows how many characters that is!),  I couldn’t explain the truncation.  A colleague suggested doing a test with another dynamically constructed string.  Maybe then, I could find where the cutoff was occurring.

Great idea!

So, I came up with a plan.

Click through for the plan, but be sure to read Pamela’s comment at the bottom as there’s a bit more to the story.

Comments closed

Areas of Improvement for DROP TABLE

Michael J. Swart points out a few foibles about the DROP TABLE syntax:

I was looking at the docs for DROP TABLE and I noticed this in the syntax: [ ,...n ]. I never realized that you can drop more than one table in a statement.

I think that’s great. When dropping tables one at a time. You always had to be careful about order when foreign keys were involved. Alas, you still have to care about order.

That is a shame. Michael also includes a few other places where DROP TABLE could be made better, so check it out.

Comments closed

Iterating over JSON and XML Data in SQL Server

Steve Stedman explains how you can iterate through XML and JSON data using the APPLY operator:

The results are what we are looking for in this specific example, but where they break down is when there are more employees represented in the XML, for each employee we need to add another UNION to bring the results together. That is not very iterative and since the title of this post includes the word iterating, we need to focus on how to do that.

Now we introduce the CROSS APPLY functionality that can be used like a JOIN to take a value from one result set (table) and apply it to a function that gets called once for each row. You can reference my JOIN TYPES poster for using CROSS APPLY

Click through for the full set of examples.

Comments closed

Ambiguous Columns in Queries when Using One Table

Dave Bland shows how easy it is to get the “Ambiguous column name” error message when querying from a single table:

When I added the “*”, this is where I received unexpected results.  All I did was add the “*”.  Looking at the code below, you can see SQL Server is having issues with the Name column in the ORDER BY.

I do wish SQL had a symbol representing “everything else,” where the engine of choice would include all columns except those explicitly named. I know there’d be trickiness around things like “LTRIM(ColumnA) AS TrimmedColumnA” but that’d be for the language designers to figure out…

Comments closed

Using VALUES for Multi-Record Operations

Daniel Hutmacher explains some of what you can do with the VALUES clause:

Note the commas at the end of each line, denoting that a new row begins here. Because this runs as a single statement, the INSERT runs as an atomic operation, meaning that all rows are inserted, or none at all (like if there’s a syntax issue or a constraint violation).

I use this construct all the time to generate scripts to import data from various external sources, like Excel, or even a result set in Management Studio or Azure Data Studio.

Daniel also has a new app for us to try out.

Comments closed

An Overview of the T-SQL Script DOM

Dan Guzman provides a public service:

Scripts are parsed by invoking the Parse method of T-SQL script DOM library TSqlParser class. The parser understands the complex T-SQL abstract syntax tree and splits T-SQL source into atomic TSqlParserTokens of TSqlTokenTypes that represent keywords, identifiers, punctuation, literals, whitespace, etc. These low-level tokens are grouped into more meaningful TSqlFragment objects that represent language elements of the script DOM, such as batches, statements, clauses, etc. Fragments, rather than the low-level parser tokens, are most often used in practice, although the underlying tokens are available for specialized requirements

The Parse method returns a TSqlFragment object of type TSqlScript containing all fragments within the script. This top-level fragment of the DOM hierarchy provides programmatic access to all language element fragments in the script. Nearly 1,000 different fragment types exist today due to the many granular T-SQL language elements.

Dan provides several examples of how to use the script DOM, making this a must-read if you’re interested in writing code around SQL Server.

Comments closed

The Rollup and Cube Operators

Greg Dodd digs into the ROLLUP and CUBE operators in a two-parter. First, ROLLUP:

As you can see, we now have these null’s popping up, but with totals. Row 5 for example, tells us that in 2017 there were 1,427,461 people living in Hawaii. Row 11 tells us that there are 2,438,188 people living in Rhode Island and Hawaii in 2017. Row 22 tells us that there were 2,429,070 people living in Rhode Island and Hawaii in 2018, and finally row 23 tells us that in total there have been 4,867,268 people in 2017 and 2018. This last row is a bit useless for this data as the overlap of those people would be huge, but for something like sales data, this number could be useful.

Next, CUBE:

For those with a keen eye you’ll see that I’ve started at row 28 in that screenshot. When we run the GROUP BY without ROLLUP or CUBE we get just 16 rows. With ROLLUP that grows to 23, but with CUBE it explodes out to 57. Why?

I’ve used ROLLUP several times with proper hierarchical data (e.g., product category, product sub-category, product) and it does an excellent job of summarizing that sort of data. CUBE has always returned too many rows for my liking. But the operator I go to most frequently is GROUPING SETS, as then I get to control the levels.

Comments closed