Press "Enter" to skip to content

Category: T-SQL

Building posexplode() in the Serverless SQL Pool

Jovan Popvic rides to the rescue with JSON:

The array cells are pivoted and returned as simple scalar columns. Now you can simply use WHERE or GROUP BY clauses to filter or summarize information by array element values. Another very useful piece of information might be the index of every element (generated as pos column).

Spark enables you to use the posexplode() function on every array cell. The posexplode() function will transform a single array element into a set of rows where each row represents one value in the array and the index of that array element. As a result, one row with the array containing three elements will be transformed into three rows containing scalar cells. This flattened/normalized representation is much easier for the analysis.

Once the array is flattened and normalized, you can easily analyze the data and find how much people knowing SQL or Java.

Read on to see how you can implement the equivalent of POSEXPLODE() using OPENJSON() in the Azure Synapse Analytics serverless SQL pool.

Comments closed

Processing Data: Aggregate and Join Operators

Erik Darling continues a series on understanding plains. First up is aggregate operators:

Aggregates can be useful for all sorts of things in a query plan, and can show up in many different forms.

It would be tough to cover all of them in a single post, but what I’d like to do is help all you nice folks out there understand some of their finer points.

Then we have join operators:

Anyone who tells you there are only three types of joins in SQL Server isn’t your friend.

Okay, maybe that’s harsh. Maybe they’re just getting you prepared for the bonne promenade through all the many different faces a join can wear in your query plans.

Maybe they have a great explanation for Grace Hash Joins in their back pocket that they’re waiting to throw in your face like a bunch of glitter.

Do read both.

Comments closed

DirectQuery and SQL Query Limitations

Chris Webb lays out the limits:

A few days ago I was tagged on a thread on Twitter: my old pal Mim was upset to find that he couldn’t use a SQL query with a Common Table Expression as the source of a table in DirectQuery mode. He worked out why without my help but I thought that it was worth me writing up an explanation for other people who might be struggling with the same problem.

That’s a fairly frustrating limitation, even if you have control of the database you’re querying.

Comments closed

Scoring Azure ML Models in Azure Synapse Analytics

Alex Aleksandrov shows off the PREDICT operator:

We can use Synapse for many activities. We can use it not only for ingesting, querying, storing and visualising data, but for developing machine learning models as well. Of course, one can say that doing data science is another functionality of this platform and this is definitely true. However, in this article, I would like to show you that instead of using Python, one can use T-SQL for doing predictions.

Click through to see how.

Comments closed

Top Value per Group: Window Function or APPLY

Erik Darling hits one of my favorite topics:

The first rule of rewrites is that they have to produce the same results, of course. Logical equivalency is tough.

In today and tomorrow’s posts I’m going to compare a couple different scenarios to get the top value.

There are additional ways to rewrite queries like this, of course, but I’m going to show you the most common anti-pattern I see, and the most common solution that tends to work better.

Click through to see when each works better.

Comments closed

TRANSLATE() and REPLACE() in SQL Server

Kevin Wilkie compares a couple of functions:

There is another function within SQL Server that many people think does the same thing, but with a slight nuance.

Sometimes, you just need to change out one character with another. For example, you need to make a “(” into a “[” to make everything consistent.

I’ve probably used REPLACE() two (or three?) orders of magnitude more often than TRANSLATE() but that’s mostly because I keep forgetting what the latter does.

Comments closed

Filtered Indexes and Functions

Eitan Blumin looks at filtered indexes:

In fact, absolutely no functions of any kind can be used within the WHERE clause of a filtered index. Not even schema-bound user-defined scalar functions.

Unfortunately, as stated in the Microsoft Docs page about Filtered Indexes, the WHERE clause of a filtered index can only support simple comparison operators.

Well, it’s not entirely true, as you CAN actually use some functions, but on two conditions:

Read the whole thing. Eitan lays out one limitation of filtered indexes and provides a couple of potential workarounds.

Comments closed

Solutions for Matching Supply with Demand

Itzik Ben-Gan continues reviewing solutions to a tricky problem:

Last month I covered a solution based on interval intersections, using a classic predicate-based interval intersection test. I’ll refer to that solution as classic intersections. The classic interval intersections approach results in a plan with quadratic scaling (N^2). I demonstrated its poor performance against sample inputs ranging from 100K to 400K rows. It took the solution 931 seconds to complete against the 400K-row input! This month I’ll start by briefly reminding you of last month’s solution and why it scales and performs so badly. I’ll then introduce an approach based on a revision to the interval intersection test. This approach was used by Luca, Kamil, and possibly also Daniel, and it enables a solution with much better performance and scaling. I’ll refer to that solution as revised intersections.

Read on for one class of solution which performed quite well.

Comments closed

Dynamic SQL No-Go

Kenneth Fisher can’t go in dynamic SQL and neither can you:

This is one of those things that when I look back on it seems really obvious. Note: If at the end of this it isn’t overly obvious to you that’s ok too. I do a lot of dynamic SQL and GO is one of my favorite commands.

Read on to understand why. I was going to “One minor clarification…” Kenneth about it being an SSMS command (implying that it’s not available elsewhere) but he successfully parried the attack en passant.

1 Comment

Handling Optional Parameters in SQL Server

Erik Darling embraces dynamic SQL:

You’ve got tables. So many tables And columns. So many columns.

Users — demanding as they are — might wanna see, search, and order by all sorts of things in those many columns in those many tables. Writing a query to do that is hard at first, especially if you’re afraid of dynamic SQL.

Read on for reasons why dynamic SQL is usually the right answer here and check out a video as well.

Comments closed