T-SQL – Page 56 – Curated SQL

Grouping Sets and Groupings

Published 2022-03-30 by Kevin Feasel

Kevin Wilkie has fun with grouping sets:

Let’s look at our dbo.Person1 table that we worked with earlier. Today, I want to find a count of all of the persons in each of the following categories: ZipCode, Gender, and Email Domains. And just for fun, let’s add in there where each of those categories cross – for example, Zipcode and Gender, ZipCode and Email Domain, etc…
Most people would think all kinds of awful thoughts at this point about all of the GROUP BY statements you’ll have to write. For anyone wondering – this is one way to do it. Notice all kinds of UNION statements and I’m sure someone is wondering if that’s truly all of the combinations. And we don’t want to go into the maintenance on this if things do happen to change…

And don’t forget about the GROUPING() function:

Let’s say our business partner asks us to determine which fields are aggregated together. Since we only have 2 fields and a grand total of 15 rows, we could determine this by eye. But, like all good developers, we want to do this programmatically.
Here’s where our friend – the GROUPING() function – comes into play.

GROUPING SETS is an extremely useful operator in the ANSI SQL standard. Definitely worth learning how to use.

Comments closed

Dealing with Shift Times

Published 2022-03-17 by Kevin Feasel

Kenneth Fisher knows what time it is:

One of the more interesting jobs I’ve had over the years was for a company that created emergency room software. It was pretty cool software and I learned a lot, both about writing queries in SQL Server and about how a software company can be run. One of the more interesting things in the various reports we created was the concept of shift calculations. In other words, what happened during a given shift.

I’ve had to do something similar (though it was for nurse scheduling rather than emergency rooms). Things get really tricky when you start dealing with 12-hour and 16-hour shifts, tracking overtime, and the like.

Comments closed

The Power of QUOTENAME

Published 2022-03-16 by Kevin Feasel

Kevin Wilkie unlocks the power of QUOTENAME():

When I first heard about QUOTENAME, I was like “This is rather useless. It just puts brackets around whatever. I can do it just as easily hard-coding the strings.”
Truly, I’m not completely wrong, but it’s a heck of a lot more fun to knock things out with the QUOTENAME function!

But there’s more that you can do with this function, as Kevin notes.

Comments closed

Matching Supply and Demand

Published 2022-03-09 by Kevin Feasel

Itzik Ben-Gan continues looking at interesting solutions to a tricky problem:

Last month I covered solutions based on a revised interval intersections approach compared to the classic one. The fastest of those solutions combined ideas from Kamil, Luca, and Daniel. It unified two queries with disjoint sargable predicates. It took the solution 1.34 seconds to complete against a 400K-row input. That’s not too shabby considering the solution based on the classic interval intersections approach took 931 seconds to complete against the same input. Also recall Joe came up with a brilliant solution that relies on the classic interval intersection approach but optimizes the matching logic by bucketizing intervals based on the largest interval length. With the same 400K-row input, it took Joe’s solution 0.9 seconds to complete. The tricky part about this solution is its performance degrades as the largest interval length increases.
This month I explore fascinating solutions that are faster than the Kamil/Luca/Daniel Revised Intersections solution and are neutral to interval length. The solutions in this article were created by Brian Walker, Peter Larsson, Paul White, and me.

These are some of the best solutions but the whole series has been quite interesting.

Comments closed

The Power of PIVOT and GROUPING SETS

Published 2022-03-08 by Kevin Feasel

Aaron Bertrand builds a report:

Without comprehensive reporting tools (or Excel), it can be cumbersome and frustrating to produce perfect report output from SQL Server SELECT statement or stored procedures. In modern versions, we have access to T-SQL functionality that far exceeds old-school ROLLUP and CUBE, like PIVOT, UNPIVOT, and GROUPING SETS. Let’s look at how to produce output we can easily plug into a simple front end and produce great-looking reports.

GROUPING SETS is one of my favorite under-utilized operators.

Comments closed

Building posexplode() in the Serverless SQL Pool

Published 2022-03-07 by Kevin Feasel

Jovan Popvic rides to the rescue with JSON:

The array cells are pivoted and returned as simple scalar columns. Now you can simply use WHERE or GROUP BY clauses to filter or summarize information by array element values. Another very useful piece of information might be the index of every element (generated as pos column).
Spark enables you to use the posexplode() function on every array cell. The posexplode() function will transform a single array element into a set of rows where each row represents one value in the array and the index of that array element. As a result, one row with the array containing three elements will be transformed into three rows containing scalar cells. This flattened/normalized representation is much easier for the analysis.
Once the array is flattened and normalized, you can easily analyze the data and find how much people knowing SQL or Java.

Read on to see how you can implement the equivalent of POSEXPLODE() using OPENJSON() in the Azure Synapse Analytics serverless SQL pool.

Comments closed

Processing Data: Aggregate and Join Operators

Published 2022-03-07 by Kevin Feasel

Erik Darling continues a series on understanding plains. First up is aggregate operators:

Aggregates can be useful for all sorts of things in a query plan, and can show up in many different forms.
It would be tough to cover all of them in a single post, but what I’d like to do is help all you nice folks out there understand some of their finer points.

Then we have join operators:

Anyone who tells you there are only three types of joins in SQL Server isn’t your friend.
Okay, maybe that’s harsh. Maybe they’re just getting you prepared for the bonne promenade through all the many different faces a join can wear in your query plans.
Maybe they have a great explanation for Grace Hash Joins in their back pocket that they’re waiting to throw in your face like a bunch of glitter.

Do read both.

Comments closed

DirectQuery and SQL Query Limitations

Published 2022-03-07 by Kevin Feasel

Chris Webb lays out the limits:

A few days ago I was tagged on a thread on Twitter: my old pal Mim was upset to find that he couldn’t use a SQL query with a Common Table Expression as the source of a table in DirectQuery mode. He worked out why without my help but I thought that it was worth me writing up an explanation for other people who might be struggling with the same problem.

That’s a fairly frustrating limitation, even if you have control of the database you’re querying.

Comments closed

Scoring Azure ML Models in Azure Synapse Analytics

Published 2022-03-03 by Kevin Feasel

Alex Aleksandrov shows off the PREDICT operator:

We can use Synapse for many activities. We can use it not only for ingesting, querying, storing and visualising data, but for developing machine learning models as well. Of course, one can say that doing data science is another functionality of this platform and this is definitely true. However, in this article, I would like to show you that instead of using Python, one can use T-SQL for doing predictions.

Click through to see how.

Comments closed

Top Value per Group: Window Function or APPLY

Published 2022-02-24 by Kevin Feasel

Erik Darling hits one of my favorite topics:

The first rule of rewrites is that they have to produce the same results, of course. Logical equivalency is tough.
In today and tomorrow’s posts I’m going to compare a couple different scenarios to get the top value.
There are additional ways to rewrite queries like this, of course, but I’m going to show you the most common anti-pattern I see, and the most common solution that tends to work better.

Click through to see when each works better.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Category: T-SQL