Category: T-SQL

Breaking Up Queries with UNION ALL

Bert Wagner takes us through a scenario where it can be faster to combine queries with UNION ALL rather than using IN:

Even though this query reads the whole clustered index to get the Benefactor rows, the total number of logical reads is still smaller than the seek/key lookup pattern seen in the combined query with IN(). This UNION ALL version gives SQL Server the ability to build a hybrid execution plan, combining two different techniques to generate a plan with fewer overall reads.

Click through for the example.
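To get a feel for the shape of the rewrite (the table and column names here are made up, not Bert's), the idea is roughly this:

-- Hypothetical example. The IN() version forces a single plan shape for both values:
SELECT DonationID, DonorType, Amount
FROM dbo.Donations
WHERE DonorType IN ('Benefactor', 'Member');

-- The UNION ALL version lets the optimizer choose a different access path per branch,
-- for example a scan for the common value and a seek plus key lookup for the selective one:
SELECT DonationID, DonorType, Amount
FROM dbo.Donations
WHERE DonorType = 'Benefactor'
UNION ALL
SELECT DonationID, DonorType, Amount
FROM dbo.Donations
WHERE DonorType = 'Member';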

The Importance of Aliasing in Subqueries

Gail Shaw explains an unexpected result when writing a statement with a subquery:

The column name in the temp table is missing an I, probably just a typo, but it has some rather pronounced effects.

The obvious next question is why the select with the subquery in it didn’t fail; after all, the query asks for ClientID from #TempClients, and there’s no such column. However, there is a ClientID column available in that query, and it’s in the Orders table. And that’s a valid column for the subquery, because column binding order, when we have subqueries, is first to tables within the subquery and then, if no match is found, to tables in the outer query.

I know that the first time I experienced this, I thought it was a bug as well. But no, it’s by design and Gail explains why.
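Here’s a contrived sketch of the behavior Gail describes; the dbo.Orders table is hypothetical, and the temp table column carries the same missing-letter typo:

CREATE TABLE #TempClients (ClentID INT);   -- note the missing I
INSERT INTO #TempClients (ClentID) VALUES (42);

-- This does not fail. ClientID in the subquery binds to Orders.ClientID in the outer
-- query, so the filter effectively becomes WHERE o.ClientID = o.ClientID and every
-- order with a non-NULL ClientID comes back (as long as #TempClients has any rows).
SELECT o.OrderID
FROM dbo.Orders AS o
WHERE o.ClientID IN (SELECT ClientID FROM #TempClients);

-- Qualify the column with the subquery's alias and the typo surfaces immediately.
SELECT o.OrderID
FROM dbo.Orders AS o
WHERE o.ClientID IN (SELECT tc.ClientID FROM #TempClients AS tc);   -- error: invalid column name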

Multi-Pattern Replacement with SQL Server

Hugo Kornelis has a pattern matching problem to solve:

The actual use case and the list of patterns that I had to remove are considered a confidential competitive advantage by my client, so I will just make up a list here to illustrate the problem. In this fake requirement, the following patterns must all be removed if they appear anywhere in the string: “-@@%”, “@@%”, “-@%”, “@%”, “No.X#X”, and “^^^”. Also, “@X” needs to be removed when it is at the end of the string. In these patterns, @ is a placeholder for a numeric symbol, X for an alphabetic symbol, and # for an alphabetic or numeric symbol. All other symbols are to be taken literally. (So “%” is the percent mark, not the LIKE pattern for any character!)

This is a problem for regular expressions, but without built-in regular expressions (and I’d guess no desire to use the CLR), Hugo gives us a workable solution.
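Just to illustrate the kind of machinery involved (this is not Hugo's solution, merely a one-pattern sketch), removing the “-@@%” pattern with PATINDEX might look like this:

-- "-@@%" means dash, two digits, then a literal percent sign, always 4 characters long.
-- In PATINDEX/LIKE syntax that is '-[0-9][0-9][%]', with [%] escaping the percent sign.
DECLARE @s VARCHAR(100) = 'Product A-12% and B-34% remain',
        @i INT;

SET @i = PATINDEX('%-[0-9][0-9][%]%', @s);
WHILE @i > 0
BEGIN
    SET @s = STUFF(@s, @i, 4, '');                -- remove the 4-character match
    SET @i = PATINDEX('%-[0-9][0-9][%]%', @s);    -- look for the next occurrence
END;

SELECT @s;   -- Product A and B remain

Multiply that by a list of patterns of varying lengths and you can see why Hugo's post is worth a read.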

Forensic Accounting: Cohort Analysis

I continue my series on forensic accounting techniques with cohort analysis:

In the last post, we focused on high-level aggregates to gain a basic understanding of our data. We saw some suspicious results but couldn’t say much more than “This looks weird” due to our level of aggregation. In this post, I want to dig into data at a lower level of detail. My working conception is the cohort, a broad-based comparison of data sliced by some business-relevant or analysis-relevant component.

Those familiar with Kimball-style data warehousing already understand where I’m going with this. In the basic analysis, we essentially look at fact data with a little bit of disaggregation, such as looking at data by year. In this analysis, we introduce dimensions (sort of) and slice our data by dimensions.

Click through for some fraud-finding fun.
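If you want a picture of what slicing by a cohort looks like in T-SQL, a rough sketch with made-up table and column names would be something like this:

-- Hypothetical sketch: compare expense totals by vendor and year instead of one
-- top-level aggregate, so an outlier vendor-year combination stands out.
SELECT
    v.VendorName,
    YEAR(e.ExpenseDate) AS ExpenseYear,
    COUNT(*)            AS NumberOfInvoices,
    SUM(e.Amount)       AS TotalAmount
FROM dbo.Expenses AS e
    INNER JOIN dbo.Vendors AS v
        ON v.VendorID = e.VendorID
GROUP BY
    v.VendorName,
    YEAR(e.ExpenseDate)
ORDER BY
    v.VendorName,
    ExpenseYear;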

Basic Forensic Accounting Techniques

I continue my series on forensic accounting techniques:

Growth analysis focuses on changes in ratios over time. For example, you may plot annual revenue, cost, and net margin by year. Doing this gives you an idea of how the company is doing: if costs are flat but revenue increases, you can assume economies of scale or economies of scope are in play and that’s a great thing. If revenue is going up but costs are increasing faster, that’s not good for the company’s long-term outlook.

For our data set, I’m going to use the following SQL query to retrieve bus counts on the first day of each year. To make the problem easier, I add and remove buses on that day, so we don’t need to look at every day or perform complicated analyses.

I get into quite a bit in this post, including a quick tour of multicollinearity, which is only my second-favorite of the three linear regression amigos (heteroskedasticity being my favorite and autocorrelation the hanger-on).
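The bus-count query itself isn't in the excerpt above; a hypothetical stand-in (the real table and column names differ) looks something like this:

-- Count the buses in service on January 1 of each year, using a calendar table.
-- dbo.Calendar and dbo.Bus are stand-ins here, not the actual objects from the post.
SELECT
    c.CalendarYear,
    COUNT(*) AS NumberOfBuses
FROM dbo.Calendar AS c
    INNER JOIN dbo.Bus AS b
        ON b.DateFirstInService <= c.[Date]
        AND ISNULL(b.DateRetired, '9999-12-31') > c.[Date]
WHERE c.[Date] = DATEFROMPARTS(c.CalendarYear, 1, 1)
GROUP BY c.CalendarYear
ORDER BY c.CalendarYear;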

Rollback’s Effect on Identity Columns

Adrian Buckman explains that rolling back an insert still burns the identity values it used:

As I say – this is just what I have seen people do, and it was only the other day that I saw a similar situation, but with an insert instead. The user believed that because the changes were made within a transaction, this would roll back EVERYTHING; however, they did not consider the impact on the identity column of the table they made the insert into.

Here is an example to demonstrate how a rollback on an insert will not roll back the identity seed on your table.

Click through for the demo. Sequences behave the same way in practice: once you pull that next sequence ticket, you can’t put it back into the machine just by rolling back the transaction. That’s why identity columns and sequences aren’t good for situations where you absolutely need contiguous values, such as invoice numbers or check numbers.
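Adrian's demo is at the link; a stripped-down version of the idea (the table is mine) looks like this:

CREATE TABLE dbo.IdentityTest
(
    Id  INT IDENTITY(1, 1) NOT NULL,
    Val VARCHAR(20) NOT NULL
);

INSERT INTO dbo.IdentityTest (Val) VALUES ('first');       -- Id = 1

BEGIN TRANSACTION;
    INSERT INTO dbo.IdentityTest (Val) VALUES ('second');  -- Id = 2, burned on rollback
ROLLBACK TRANSACTION;

INSERT INTO dbo.IdentityTest (Val) VALUES ('third');       -- Id = 3, not 2

SELECT Id, Val FROM dbo.IdentityTest;   -- rows 1 and 3; the gap stays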

Matrix Operations with JSON

Phil Factor takes a look at using JSON to perform memoization:

For the SQL Server developer, matrices are probably most valuable for solving more complex string-searching problems, using Dynamic Programming. Once you get into the mindset of this sort of technique, a number of seemingly intractable problems become easier. Here are fifty common data structure problems that can be solved using Dynamic Programming. Until SQL Server 2017, these were hard to do in SQL because of the lack of support for this style of programming. Memoization, one of the principles behind the technique, is easy to do in SQL, but it is very tricky to convert existing procedural algorithms to use table variables. It is usually easier and quicker to use strings as pseudo-variables, as I did with Edit Distance and the Levenshtein algorithm, the longest common subsequence, and the Longest Common Substring. The problem with doing this is that the code to fetch the array values can be very difficult to decipher or debug. JSON can do it very easily with path array references.

The results aren’t fantastic, but the code is at least easier to follow.
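Here's a minimal sketch of the path-based access Phil describes; his post applies it inside dynamic programming algorithms, but the core trick is just this:

-- Keep a matrix as a JSON array of arrays and address cells by path.
DECLARE @matrix NVARCHAR(MAX) = N'[[1, 2, 3], [4, 5, 6], [7, 8, 9]]';

-- Read the cell at row 1, column 2 (zero-based): returns 6.
SELECT JSON_VALUE(@matrix, '$[1][2]') AS CellValue;

-- Write a new value into the same cell.
SET @matrix = JSON_MODIFY(@matrix, '$[1][2]', 60);

SELECT @matrix AS UpdatedMatrix;   -- [[1, 2, 3], [4, 5, 60], [7, 8, 9]]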

Generating Sketchy Data

I have a post on building up a data set for my forensic accounting series:

This is where stuff gets crazy. First, I created a table named #ValuePerCategory, which has the mean price and the price standard deviation for each expense category. To get this information, I trawled through the catalog and picked reasonable-enough values for each of the categories. This is my level of commitment to getting things right(ish). The standard deviations, though, I just made up. I didn’t look at huge numbers of products and calculate these values myself. That’s the limit of my commitment to excellence and why I don’t have a giant banner on my stadium.

It’s also why John Madden never coached me.
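For a sense of the shape of that table, here is a sketch; the categories, means, and standard deviations below are invented for illustration, just as mine were:

CREATE TABLE #ValuePerCategory
(
    ExpenseCategory VARCHAR(50)    NOT NULL,
    MeanPrice       DECIMAL(10, 2) NOT NULL,
    PriceStdDev     DECIMAL(10, 2) NOT NULL
);

INSERT INTO #ValuePerCategory (ExpenseCategory, MeanPrice, PriceStdDev)
VALUES
    ('Office Supplies',      45.00,  15.00),
    ('Vehicle Maintenance', 350.00, 120.00),
    ('Catering',            220.00,  80.00);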

Finding High-Cardinality Columns

Constantine Kokkinos shows how you can find the cardinality of each column in a SQL Server table:

Today I was diving into some extremely wide tables, and I wanted to take a quick look at things like “How many unique values does this table have in every column?”.

This can be super useful if you have a spreadsheet of results or a schema without effective normalization and you want to determine which columns are the “most unique” – or have high cardinality.

The GitHub gist is embedded at the bottom of the page, but I will run you through the code in case you want an explanation of how it works.

Click through for the script.
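Constantine's script is at the link; as a rough sketch of the general approach (not his exact code), you can build one COUNT(DISTINCT) per column with dynamic SQL. This assumes SQL Server 2017 or later for STRING_AGG, and dbo.SomeWideTable is a placeholder:

DECLARE @TableName SYSNAME = N'dbo.SomeWideTable',   -- placeholder table name
        @sql       NVARCHAR(MAX);

-- Build one COUNT(DISTINCT <column>) per column of the target table.
SELECT @sql = N'SELECT '
            + STRING_AGG(
                  CAST(N'COUNT(DISTINCT ' + QUOTENAME(c.name) + N') AS ' + QUOTENAME(c.name) AS NVARCHAR(MAX)),
                  N', ')
            + N' FROM ' + @TableName + N';'
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(@TableName);

EXEC sys.sp_executesql @sql;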
