Press "Enter" to skip to content

Category: T-SQL

Iterating over JSON and XML Data in SQL Server

Steve Stedman explains how you can iterate through XML and JSON data using the APPLY operator:

The results are what we are looking for in this specific example, but where they break down is when there are more employees represented in the XML, for each employee we need to add another UNION to bring the results together. That is not very iterative and since the title of this post includes the word iterating, we need to focus on how to do that.

Now we introduce the CROSS APPLY functionality that can be used like a JOIN to take a value from one result set (table) and apply it to a function that gets called once for each row. You can reference my JOIN TYPES poster for using CROSS APPLY

Click through for the full set of examples.

Comments closed

Derived Table Nesting and Performance

Itzik Ben-Gan digs into some of the performance considerations around nested derived tables:

Unnesting/substitution of table expressions is a process of taking a query that involves nesting of table expressions, and as if substituting it with a query where the nested logic is eliminated. I should stress that in practice, there’s no actual process in which SQL Server converts the original query string with the nested logic to a new query string without the nesting. What actually happens is that the query parsing process produces an initial tree of logical operators closely reflecting the original query. Then, SQL Server applies transformations to this query tree, eliminating some of the unnecessary steps, collapsing multiple steps into fewer steps, and moving operators around. In its transformations, as long as certain conditions are met, SQL Server can shift things around across what were originally table expression boundaries—sometimes effectively as if eliminating the nested units. All of this in attempt to find an optimal plan.

In this article I cover both cases where such unnesting takes place, as well as unnesting inhibitors. That is, when you use certain query elements it prevent SQL Server from being able to move logical operators in the query tree, forcing it to process the operators based on the boundaries of the table expressions used in the original query.

That’s on my list for a second reading.

Comments closed

Ambiguous Columns in Queries when Using One Table

Dave Bland shows how easy it is to get the “Ambiguous column name” error message when querying from a single table:

When I added the “*”, this is where I received unexpected results.  All I did was add the “*”.  Looking at the code below, you can see SQL Server is having issues with the Name column in the ORDER BY.

I do wish SQL had a symbol representing “everything else,” where the engine of choice would include all columns except those explicitly named. I know there’d be trickiness around things like “LTRIM(ColumnA) AS TrimmedColumnA” but that’d be for the language designers to figure out…

Comments closed

Using VALUES for Multi-Record Operations

Daniel Hutmacher explains some of what you can do with the VALUES clause:

Note the commas at the end of each line, denoting that a new row begins here. Because this runs as a single statement, the INSERT runs as an atomic operation, meaning that all rows are inserted, or none at all (like if there’s a syntax issue or a constraint violation).

I use this construct all the time to generate scripts to import data from various external sources, like Excel, or even a result set in Management Studio or Azure Data Studio.

Daniel also has a new app for us to try out.

Comments closed

CTEs Don’t Control Plan Shape

Erik Darling dispels a myth:

I’ve heard many times incorrectly over the years that CTEs somehow materialize data.

But a new one to me was that CTEs execute procedurally, and you could use that to influence plan shapes by always doing certain things first.

Unfortunately, that’s not true of them either, even when you use TOP.

Read the whole thing. Though I do chain common table expressions for readability’s sake, but that’s usually because I’m performing a series of repetitive calculations that I can’t simplify via APPLY.

Comments closed

I Remember Halloween

Jared Poche experiences Halloween problem protection:

Simple enough so far. The scan is against a memory optimized table variable, and the filter to the left our our seeks and scans check for a change to our value. Nothing left but to update the index and…

CURVE BALL

Wait, what’s all this? We have a Split operator after our Clustered Index Update. SQL Server does sometime turn an UPDATE statement into effectively a DELETE and INSERT if the row needs to move, but this seems a bit much. We have a total of 4 index update/delete operators now, and they aren’t cheap.

My very simple addition to the WHERE clause actually caused a small increase in duration, and a big jump in CPU. So what’s going on?

Read on to see the cause and what Jared was able to do about it.

Comments closed

NVARCHAR Everywhere

I get to put on my contrarian hat:

In the last episode of Shop Talk, I laid out an opinion which was…not well received. So I wanted to take some time and walk through my thinking a little more cogently than I was able to do during Shop Talk.

Here’s the short version. When you create a table and need a string column, you have a couple options available: VARCHAR and NVARCHAR. Let’s say that you’re a developer creating a table to store this string data. Do you choose VARCHAR or NVARCHAR? The classic answer is, “It depends.” And so I talk about why that is in video format right below these words.

I have a video which goes into detail, plus a bunch of words. Plus mice and banjos. 🐭🪕

Comments closed

Uses of ROLLUP and CUBE

Greg Dodd wraps up a series on ROLLUP and CUBE:

I don’t think ROLLUP has a great use in the real world (feel free to comment below if you have a real use for it). The best I can see is that you have SQL Server emailing the results of a query to someone and they want subtotals. Adding totals can be done by any reporting app, probably more efficiently than SQL will do it, and it’s probably easier to debug there as well.

I do, however, think CUBE has an amazing purpose.

The funny thing is, I disagree and find ROLLUP a lot more useful than CUBE because ROLLUP gives me aggregations along a hierarchy. If I set up my GROUP BY clause correctly, I can see the results of my Product Category –> Product Subcategory –> Product hierarchy quite easily. Yeah, I can also do this in reporting, but if I just need a quick printout or need to do something special with the subtotals, ROLLUP is great. I definitely use GROUPING SETS more than ROLLUP and CUBE combined, but I’ve had production code useROLLUP and, to the best of my recollection, never have I created a procedure or report which is based off of CUBE.

Comments closed