Press "Enter" to skip to content

Window Function Execution Plans with RANGE

Hugo Kornelis continues a series on explaining the execution plans for window functions:

This is part twenty-six of the plansplaining series, and already the fourth episode about window functions. The first of those posts covered basic window functions; the second focused on fast-track optimization for running aggregates; and the third explained how the optimizer works around the lack of execution plan support for UNBOUNDED FOLLOWING.

But all of those were about OVER specifications that use the ROWS keyword. Let’s now look at the alternative, the RANGE keyword.

Click through to see how the various options work with RANGE. By the way, I still want range intervals, like how Postgres implements them, where you can define an interval of X days/hours/minutes/whatever rather than a specific number of rows. Maybe one of these versions…
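To see what RANGE changes, consider ties in the ORDER BY column: ROWS builds each row's frame individually, while RANGE treats rows that tie on the ordering value as peers sharing one frame. A minimal sketch (the #Sales table and values are hypothetical):

CREATE TABLE #Sales (SaleDate date, Amount int);
INSERT INTO #Sales VALUES
    ('2023-12-01', 10), ('2023-12-01', 20), ('2023-12-02', 30);

SELECT SaleDate, Amount,
    -- ROWS: each row extends the running total individually
    RowsTotal  = SUM(Amount) OVER (ORDER BY SaleDate
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
    -- RANGE: peer rows (ties on SaleDate) share one running total
    RangeTotal = SUM(Amount) OVER (ORDER BY SaleDate
                     RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM #Sales;
-- RowsTotal:  10, 30, 60
-- RangeTotal: 30, 30, 60 (the two 2023-12-01 rows are peers)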

Fun with WAITFOR

Aaron Bertrand plays red light, green light:

WAITFOR is a very useful command to prevent further operations from occurring until a certain amount of time has passed (WAITFOR DELAY) or until the clock strikes a specific time (WAITFOR TIME). These commands are covered quite well (including pros and cons) in a previous tip, “SQL WAITFOR Command to Delay SQL Code Execution.”

WAITFOR DELAY is also commonly used to “let SQL Server breathe.” For example, when performing some operations in a loop, on a resource-constrained system, an artificial delay can potentially allow for more concurrency, even if it makes the intensive operation take longer.

But these commands can also be used to synchronize multiple query windows for local testing and debugging, and I thought I would share a recent example.

Click through for some of the ways you can use WAITFOR in your scripts.
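For reference, both forms take a straightforward time argument (the times below are arbitrary):

-- Pause this batch for five seconds, e.g. between iterations of a loop
WAITFOR DELAY '00:00:05';

-- Hold until a specific wall-clock time; run the same statement in two
-- query windows and both will resume together at 14:30
WAITFOR TIME '14:30:00';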

Microsoft Fabric SQL Endpoints and REST API

Tomaz Kastrun continues a series on Microsoft Fabric. Day 6 covers the SQL Analytics endpoint:

The SQL analytics endpoint in a lakehouse is a SQL-based experience for lakehouse delta tables. Using standard T-SQL, you can write queries to analyze data in delta tables, create functions, procedures, and views, and even apply security over the objects. Some standard T-SQL functionality is missing, but the experience is the same.

Besides the SQL experience, you can always use the corresponding items in the workspace view of Lakehouse explorer, use SQL in notebooks, or simply use the SQL analytics endpoint.
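As a small illustration of that experience (all object names here are hypothetical), the endpoint accepts ordinary T-SQL over delta tables, including views and object-level security:

-- Query a delta table and expose the result through a view
CREATE VIEW dbo.TopCustomers
AS
SELECT CustomerId, SUM(OrderAmount) AS TotalSpend
FROM dbo.Orders
GROUP BY CustomerId;

-- Apply security over the object as usual
GRANT SELECT ON dbo.TopCustomers TO [ReportingUser];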

Day 7 looks at what subset of T-SQL syntax you can use against SQL Analytics endpoints:

You get the gist, and there are some other limitations: computed columns, indexed views, indexes of any kind, partitioned tables, triggers, user-defined types, sparse columns, surrogate keys, temporary tables, and many more. Essentially, these are all the commands that are not supported in distributed processing mode.

The biggest annoyance (!) is case sensitivity! Ugh. This proves that the SQL endpoint operates like an API on top of delta tables rather than being translated directly into Spark, since Spark is not case-sensitive. So, given two statements that differ only in casing, the first will work and the second will be gracefully terminated.
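The kind of statement pair Tomaz has in mind looks something like this (the SalesOrders table is hypothetical; the endpoint's case-sensitive collation is what makes the second call fail):

SELECT TOP 10 * FROM dbo.SalesOrders;   -- works
SELECT TOP 10 * FROM dbo.salesorders;   -- fails: Invalid object name 'dbo.salesorders'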

Day 8 covers the Lakehouse REST API:

Now that we have explored the lakehouse through the interface and workspaces, let's look today at how we can use the REST API. The Microsoft Fabric REST API defines a unified endpoint for operations.

Continuing the Advent of Code

Kevin Wilkie has been busy. Here’s Day 1 Part 2:

Today, I want to review part 2 of Day 1 of the Advent of Code series. Hopefully, everyone was able to complete part 1 with no troubles, or at least understood what I did to get there.

For part 2, they added a slight wrinkle to the part 1 puzzle. They spell out the numbers into actual words! How do you find them as well as find the numbers? Well, my friend, let’s go through that process, shall we?

After that is Day 2 Part 1:

On day 2, we are asked to gather data from a series of games and to see which of those are possible given a specific number of dice for a few colors. Fun times!

And then there’s Day 2 Part 2:

Today, we’ll be working on the next in the series using the data and processes that we found yesterday in Day 2 Part 1 – found here.

Thankfully, we were smart when we began working through the data and we have the data for each of our dice in separate tables, so breaking the data apart has definitely paid off! Now we can do just a little bit of work with the data from yesterday and we’ll be ready to give the results!

Joining on Overlapping Date Ranges in T-SQL

Daniel Hutmacher crosses the streams:

You can get into a situation where you have two tables with values associated with date ranges. What's worse, those date ranges don't necessarily have to align, which can make joining them a seemingly complex task. But it is surprisingly simple when you learn how to think about overlapping date ranges, along with this relatively simple T-SQL join pattern.

This problem gets even more challenging if you have the possibility of multiple overlaps and you want to find the combination with the biggest overlap for each individual item.
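The heart of that pattern is the classic overlap test: two ranges overlap exactly when each one starts before (or when) the other ends. A minimal sketch, assuming hypothetical tables #A and #B that each carry StartDate and EndDate columns:

SELECT a.ValueA, b.ValueB,
    -- the overlap runs from the later start to the earlier end
    OverlapStart = CASE WHEN a.StartDate > b.StartDate THEN a.StartDate ELSE b.StartDate END,
    OverlapEnd   = CASE WHEN a.EndDate   < b.EndDate   THEN a.EndDate   ELSE b.EndDate   END
FROM #A AS a
JOIN #B AS b
    ON  a.StartDate <= b.EndDate    -- a starts before (or when) b ends
    AND b.StartDate <= a.EndDate;   -- b starts before (or when) a ends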

Grouping By Column Alias

Aaron Bertrand wants a feature:

GROUP BY queries can become overly convoluted if your grouping column is a complex expression. Because of the logical processing order of a query, you’re often forced to repeat such an expression since its alias can’t be used in the GROUP BY clause.

Oracle recently solved this in their 23c release by adding the ability to GROUP BY column_alias. This is such simple but powerful syntax, and I’m hoping we can get SQL Server to follow Oracle’s lead.

This would be a pretty nice feature. Admittedly, the workarounds aren’t that difficult, but this would be a nice quality of life update.
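To make the repetition concrete (table and column names are hypothetical), here is the problem today, along with the common CROSS APPLY workaround that names the expression once:

-- Today, the grouping expression must be repeated in GROUP BY
SELECT OrderMonth = DATEFROMPARTS(YEAR(OrderDate), MONTH(OrderDate), 1),
       Total      = SUM(Amount)
FROM dbo.Orders
GROUP BY DATEFROMPARTS(YEAR(OrderDate), MONTH(OrderDate), 1);

-- Workaround: alias the expression via CROSS APPLY, then group by the alias
SELECT x.OrderMonth,
       Total = SUM(Amount)
FROM dbo.Orders
CROSS APPLY (VALUES (DATEFROMPARTS(YEAR(OrderDate), MONTH(OrderDate), 1))) AS x(OrderMonth)
GROUP BY x.OrderMonth;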

Combining Window Functions and GROUP BY

Andy Brownsword aggregates some data:

We revisited window functions last week for T-SQL Tuesday. As we're in that area, there's another example I thought was worth exploring. Can we group data whilst applying window functions in the same query?

Andy comes up with a final query that works perfectly fine, but there’s actually an easier answer in terms of code readability: the DISTINCT operator.

SELECT DISTINCT
    FinancialQuarter,
    QuarterAvg = AVG(SalesValue) OVER (PARTITION BY FinancialQuarter),
    YearAvg = AVG(SalesValue) OVER (PARTITION BY FinancialYear)
FROM
    #MonthlySales;

The FinancialQuarter column is unique, so we can perform the window operation to average sales value over the financial quarter and then over the financial year. To remove the "duplicate" rows, we apply DISTINCT and get the same results.

That said, the execution plan for this is a little more complex, as we have to go through a lazy spool on two separate occasions rather than the single spool in Andy's solution. For sufficiently large datasets, that could make a difference, so as usual, choose the option that works better for your situation.
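Another pattern worth knowing (not necessarily Andy's approach) is that window functions evaluate after GROUP BY, so they can take grouped aggregates as input. Note that this averages the quarterly totals rather than the underlying monthly values:

SELECT FinancialQuarter,
       QuarterTotal = SUM(SalesValue),
       -- averages the quarterly totals within each year
       YearAvgOfQuarters = AVG(SUM(SalesValue)) OVER (PARTITION BY FinancialYear)
FROM #MonthlySales
GROUP BY FinancialYear, FinancialQuarter;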

T-SQL Tuesday 168 Round-Up

Steve Jones lagged a bit:

I didn’t get much of a chance to check out the posts as I was at the PASS Data Community Summit, but I came home and started to work through them.

This was the 8th one I've hosted, which makes sense, as I've taken over managing the party from Adam Machanic and there have been a few times I've had to fill in for missing hosts. In any case, here's the roundup. I'm going in order of the comments as I see them on the blog.

Click through for this month’s list of entrants.

Last Observation Carried Forward in SQL Server 2022

Barney Lawrence shows off a nice enhancement to T-SQL in SQL Server 2022:

With SQL Server 2022 came a much-requested feature from the SQL standard: IGNORE NULLS. You can probably guess what it does. Drop in IGNORE NULLS after your function and you can blur the non-null values over those gaps, carrying the last observed value forward.

Read on for the pre-2022 version of the query and what it does, versus the version with IGNORE NULLS specified. This small flag is extremely helpful in time series statistical analysis, and I'm glad it's in SQL Server now.
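A minimal sketch of the 2022 syntax, assuming a hypothetical #Readings table with gaps (NULLs) in RawValue:

SELECT ReadingDate,
       RawValue,
       -- carry the last non-NULL reading forward over the gaps
       FilledValue = LAST_VALUE(RawValue) IGNORE NULLS OVER (
                         ORDER BY ReadingDate
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM #Readings;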
