Press "Enter" to skip to content

Category: Syntax

Using the WINDOW Clause in SQL Server 2022

Hasan Savran peeks through the windows:

WINDOW Operations in SQL Server can be hard to understand.  I believe one of the main reasons for that is the long and repeated code it needs. SELECT…WINDOW Clause will help us to remove repeated code and hopefully it will make the WINDOW Operations more user-friendly.

I am really happy about this syntax change, as often times, we have to re-use the same window frame (PARTITION BY and ORDER BY) or a base frame with a minor change (add one more column to the end). This removes the kind of repetition which makes queries harder to read and introduces subtle bugs.

Comments closed

Creating Human-Readable Intervals from Lists in T-SQL

Daniel Hutmacher is making a list and checking it twice:

If you’ve worked with reporting, you’ve probably come across the following problem. You have a list of values, say “A, B, C, D, K, L, M, N, R, S, T, U, Z” that you want to display in a more user-friendly, condensed manner, “A-D, K-N, R-U, Z”.

Today, we’re going to look at how you can accomplish this in T-SQL, and what this has to do with window functions and gaps and islands.

Read on for a really good demonstration of how powerful window functions and STRING_AGG() can be.

Comments closed

The Siren Song of Platform-Portable SQL

Shane O’Neill gets on the soapbox:

Part of why I’m ranting about database-ubiquitousness SQL instead of ORMs is that I’ve also witnessed SQL code that has to work everywhere

I’m going to call this SQL “DUh SQL” from now on since I only have so many vowels on my keyboard and shouldn’t waste them re-writing the word “ubiquitousness”

Them: “Nope, we can’t use DATEFROMPARTS; it’s not available in other databases! Yeah, we use GETDATE() everywhere; what’s your point? WHAT!?!”

I strongly agree with this. Lowest common denominator SQL just means your code is likely to perform poorly in all environments. Even if you know you need to support multiple platforms, it’s better to get the most out of each and have whichever data access layer implementation you inject figure out the details of how to do it.

Comments closed

The Performance Pain of User-Defined Functions

Tom Zika continues a series on why user-defined scalar functions are such a bad idea:

I’ll cover several test scenarios and analyze the performance using different monitoring tools.
The results will be for the second executions of the queries, so we have compiled and cached plans and all pages in the buffer pool.

Even if you already hate seeing scalar user-defined functions in code, the occasional reminder of how poorly they perform helps focus the mind.

Comments closed

SET, SELECT, and Variable Assignment

Chad Callihan obliquely reminds us to create those unique constraints (by way of unique indexes):

Did you know there is more than one way to set a variable in SQL Server? You can actually set a variable without using “SET” at all. Let’s look at an example that shows how setting a variable with SELECT can cause a headache when dealing with identical values.

Click through to see the problem in action. One way around this if you do know you are dealing with duplicates and need a specific one is to SELECT TOP(1) with an appropriate ORDER BY clause, just as you would if variable assignment weren’t on the table.

Comments closed

Lack of Fun with Scalar Functions

Tom Zika takes away the scalars:

I’m still surprised many people don’t realise how lousy Scalar functions are. So because it’s my current focus in work and this Stack Overflow question, I’ll be revisiting this topic.

The focus of part one is parallelism. Unfortunately, parallelism often gets a bad rep because of the prominent wait stats. Also, if there is a skew, it can run slow. But for the most part, it’s advantageous.

Whether or not you want parallelism should be an informed choice. But Scalar functions will enforce the query to run serially, even if you are unaware. That’s why I want to shine a light on this.

Read on for a demo of how even a no-op scalar function can affect query performance. Given the mess we normally see in scalar functions, it’s all downhill from there.

Comments closed

Query Store Hints in SQL Server 2022

Erik Darling has thoughts:

When you’re dealing with untouchable vendor code full of mistakes, ORM queries that God has turned away from, and other queries that for some reason can’t be tinkered with, we used to not have a lot of options.

In SQL Server 2022, Query Store gains a new super power: you can add hints to queries without intercepting the code in some other manner.

There are a couple of useful hints which won’t be available but Erik seems mostly upbeat about what is there.

Comments closed

When SELECT * Doesn’t

Chad Callihan protects the reputation of SELECT *:

There are plenty of scenarios where using SELECT * can be an issue. Using SELECT * with EXISTS isn’t one of them.

When using EXISTS and SELECT * together, SQL Server is smart enough to realize what you’re doing and knows you don’t care about what’s in the SELECT.

Read on for an example. I’ve trained myself (been trained?) still to use SELECT 1. The reason is, I know SELECT * works exactly the same way but the benefit of using SELECT 1 is that doing this consistently allows you to do a search for SELECT * in your code base to find actual perpetrators (people writing queries expecting to return the entire result set and which may be susceptible to performance problems or future maintainability problems). Using SELECT 1 in the EXISTS clause means you get fewer false positives in that search as a result.

That said, Joe Celko chimes in to provide some of the history behind SELECT * as the convention for references in the EXISTS clause.

Comments closed

Using DATE_BUCKET() in SQL Server

Hasan Savran starts bucketing:

The Date_Bucket function is introduced in Azure SQL Edge which is mainly used by IoT devices. This useful function returns the date-time value corresponding to the start of each date-time bucket from the timestamp defined by the origin parameter, or the default origin value of 1900-01-01 00:00:00.000. In other words, it lets you arrange data into groups that represent fixed intervals of time. SQL Server 2022 includes this useful function in its database engine.

Results of Data_Bucket might be confusing, Let’s look at its syntax first.

Hasan is quite right here: the results of DATE_BUCKET() are not intuitive, though they do make some sense…eventually…

Comments closed