Press "Enter" to skip to content

Category: Syntax

Solving Groups of Groups Problems with Window Functions

James McGillivray shows off some of the power of window functions:

Windowing functions are an underused feature in SQL Server, with myriad uses. The most common problems we solve are running totals, seeing group totals on the same line as individual lines from the group (allowing calculations like subtotal %). I don’t know if there are better solutions, but before Windowing Functions, I used to solve these kind of problems with self joins, or nested queries. The performance of Windowing Functions is significantly better than this approach, and that alone has made my life considerably better.

The more I learn about Windowing Functions, the more often I see use cases where they are useful.

I love talking about, and teaching people to use, Windowing functions to make their lives better, so I’m quite excited to be able to use them as the topic for today’s post.

Specifically, James looks at groups of groups problems and aggregates of aggregates problems.

Comments closed

RESULT_SCAN() in Snowflake

Koen Verbeeck introduces us to the RESULT_SCAN() function in Snowflake DB:

I’m doing a little series on some of the nice features/capabilities in Snowflake (the cloud data warehouse). In each part, I’ll highlight something that I think it’s interesting enough to share. It might be some SQL function that I’d really like to be in SQL Server, it might be something else.

This post builds upon part 6 of the series, which dealt with query history. There it is explained how Snowflake caches the query results. You can find a query in the history and take a look at what was returned. Using the RESULT_SCAN table function, you can do this with SQL. Let’s take a look at an example.

This is an interesting function. Click through to see it in action.

Comments closed

Splitting Strings with T-SQL

Andy Levy recommends using STRING_SPLIT():

Last year, you finally retired the last of your SQL Server 2008R2 instances. Congratulations! But are you taking advantage of everything that your new instances have to offer? Unless you did a review of all of the T-SQL in your applications, I’m guessing not.

At one time or another, we all find ourselves having to do some string parsing, especially splitting strings on a delimiter. Nearly all of us have one (or two or a dozen) functions for doing this somewhere on every instance of SQL Server. But since SQL Server 2016, we’ve had an official way to do it – the STRING_SPLIT() function.

Andy’s example involves splitting strings, but there are plenty of functions which come into the T-SQL lexicon. It might be worth doing a quick review of the available system functions just to see if there’s something useful which slipped with a newer version of SQL Server.

Comments closed

Solving the Gaps and Islands Set of Problems

Ed Pollack continues a series on gap and island analysis:

Gaps and islands analysis supplies a mechanism to group data organically in ways that a standard GROUP BY cannot provide. Once we know how to perform an analysis and group data into islands, we can extend this into the realm of real data.

For all code examples in this article, we will use a set of baseball data that I’ve created and maintained over the years. This data is ideal for analytics as it is large and contains data quality that varies between very accurate and very sloppy. As a result, we are forced to consider data quality in our work, as well as scrutinize boundary conditions for correctness. This data will be used without much introduction as we will only reference two tables, and each is relatively straightforward.

The code in this article gets a bit complex, but Ed shows off some powerful techniques.

Comments closed

Fun with SET Options

Dan Guzman takes us through different SET options in T-SQL and where you can go wrong:

ANSI_PADDING OFF has also been deprecated for quite some time and the SQL Server documentation specifically calls out “ANSI_PADDING should always be set to on.” In summary, a column-level ANSI_PADDING OFF setting causes nullable fixed-length char(n) and binary(n) columns to behave like variable-length varchar(n) and varbinary(n) columns. Furthermore, SQL Server automatically trims trailing blank characters from character data and leading binary zeros from binary data and stores the values as variable length instead of storing the provided value as-is during inserts and updates. Varchar(n)/varbinary(n) columns with ANSI_PADDING OFF are similarly trimmed. Note that it is the persisted ANSI_NULLS column meta-data setting that determines the storage and trimming behavior, not the current session ANSI_PADDING setting. The session ANSI_PADDING must still be ON when using features that require proper settings.

Some of these will pop up in occasional errors, like if you’re using filtered indexes or indexed views.

Comments closed

Qutoed Data and OPENROWSET

Dave Mason wants to remove quoted identifiers from a flat file:

I haven’t shown all the columns, but you get the idea–every column in the result set has data enclosed in double quotes. That’s exactly how it appears in the source data file.

Dave has a method which works for plenty of versions of SQL Server. If you’re using 2017 or later, the FIELDQUOTE parameter was added to solve this problem, though to be fair, I haven’t actually tried it to see if it works as expected.

Comments closed

Compression and Decompression with SQL Server

Randolph West asks a pair of questions:

On Twitter recently, I asked:

Does anyone I know use the COMPRESS and DECOMPRESS features in T-SQL?

To those who replied in the affirmative, I asked:

What made you decide on this as opposed to ROW or PAGE compression?

Read on to learn the responses. In my case, I’ve seen COMPRESS and DECOMPRESS used in two places. First, compressing large product descriptions (large enough to go to LOB). Second, I use it to compress binary models created via SQL Server Machine Learning Services. Some of those models compress quite nicely.

Comments closed

More with TOP and Blocking Operators

Jared Poche continues an investigation into the TOP operator:

I’ve explained what a blocking operator is and provided a few examples, but maybe this doesn’t seem important. It’s affecting the TOP operator, sure, but don’t people just use this to look at the TOP 1000 rows of their tables in SSMS?

The TOP operator is useful for many operations, especially in a large environment. Large operation can timeout or fail for a variety of reasons, consuming resources without providing the results you need. A small, batch-sized operation is more likely to succeed and tends to perform more consistently. Many maintenance operations make sense to run with a TOP operator, so we should make sure those operations aren’t stymied by blocking operators.

Read on for several examples.

Comments closed

Fun with NULL

Itzik Ben-Gan takes us through some of the complexities of NULL:

NULL handling is one of the trickier aspects of data modeling and data manipulation with SQL. Let’s start with the fact that an attempt to explain exactly what a NULL is is not trivial in and of itself. Even among people who do have a good grasp of relational theory and SQL, you will hear very strong opinions both in favor and against using NULLs in your database. Like them or not, as a database practitioner you often have to deal with them, and given that NULLs do add complexity to your SQL code writing, it’s a good idea to make it a priority to understand them well. This way you can avoid unnecessary bugs and pitfalls.

This article is the first in a series about NULL complexities. I start with coverage of what NULLs are and how they behave in comparisons. I then cover NULL treatment inconsistencies in different language elements. Finally, I cover missing standard features related to NULL handling in T-SQL and suggest alternatives that are available in T-SQL.

This is definitely worth the read.

Comments closed

Character Generation with T-SQL

Bill Fellows shows off what you can do with character generation in T-SQL:

The mod or modulus function will return the remainder after division. Modding a value is a handy way to constrain a value between 0 and an upper threshold. In this case, if I modded any number by 26 (because there are 26 characters in the English alphabet), I’ll get 0 to 25 as my result.

Knowing that the modulus function will give me 0 to 25 and knowing that my target character range starts at 65, I could use the previous expression to print any number’s ascii value like SELECT CHAR((2147483625 % 26) + 65) AS StillB;. Break that apart, we do the modulus, %, which gives us the value of 1 which we then add to the starting offset (65).

He also provides a DB Fiddle for it.

Comments closed