Press "Enter" to skip to content

Category: T-SQL

Controlling Duplicates in T-SQL

Itzik Ben-Gan lays out some fundamentals:

When people start learning a new field, for example T-SQL, it’s tempting to spend very little time trying to understand the fundamentals of the field so that you can quickly get to the advanced parts. You might think that you already understand what certain ideas, concepts and terms mean, so you don’t necessarily see the value in dwelling on them. That’s often the case with newcomers to T-SQL, especially because soon after you start learning the language, you can already write queries that return results, giving you a false impression that it’s a simple or easy language. However, without a good understanding of the foundations and roots of the language, you’re bound to end up writing code that doesn’t mean what you think it means. To be able to write robust and correct T-SQL code, you really want to spend a lot of energy on making sure that you have an in-depth understanding of the fundamentals.

No matter your knowledge level, there’s a really good chance you’ll learn at least one new thing in this article.

Comments closed

SparkSQL CONCAT vs T-SQL CONCAT

Bill Fellows has a public service announcement:

The concat function is super handy in the database world but be aware that the SQL Server one is way better because it solves two problems. It combines everything into a string and it does not require NULL checking. In the before times, one had to down cast to a n/var/char type as well as check for NULL before appending strings via the plus sign.

The point of difference is so important that Bill busted out the marquee HTML tag. Which now leads me to wonder, was marquee or blink the bigger evil in the mid-to-late ’90s web?

Comments closed

Unit Testing Dynamic SQL

Jay Robinson lays out a pattern:

Dynamic SQL (aka Ad Hoc SQL) is SQL code that is generated at runtime. It’s quite common. Nearly every system I’ve supported in the past 30 years uses it to some degree, some more than others.

It can also be a particularly nasty pain point in a lot of systems. It can be a security vulnerability. It can be difficult to troubleshoot. It can be difficult to document. And it can produce some wickedly bad results.

Click through for Jay’s process as well as recommendations and an example. It’s certainly worth thinking about.

Comments closed

WITHIN GROUP in STRING_AGG()

Chad Callihan messes with groups:

When was the last time you wrote a SQL query and knew something was possible but just couldn’t remember how? I had one of those moments this week with STRING_AGG and ordering data, and although it was frustrating, I knew it would make a worthwhile blog post. Let’s look at some examples using STRING_AGG and WITHIN GROUP (aka the clause that slipped my mind).

There’s a perfectly good reason why WITHIN GROUP might slip your mind: STRING_AGG() is known as an ordered set function (versus a window function which uses an OVER() clause). It’s also the only ordered set function SQL Server supports, so you don’t get too many opportunities to use the key phrase.

Comments closed

Dueling Sequences for Positive and Negative Numbers

Jose Manuel Jurado Diaz hears banjo music:

SQL Server’s INT data type, by design, provides a range from -2,147,483,648 to 2,147,483,647. But often, developers only utilize the positive range for primary keys and other identifier fields, effectively wasting half of its potential. What if we could harness this full range to temporarily extend the capacity of an INT column? In this article, we explore this idea in-depth.

I’m of two minds here. On the one hand, surrogate keys don’t have a meaning (by definition!), so it doesn’t really matter if that number is positive or negative. Also, including negative numbers makes sense when you expect the steady state table size to be above 2 billion but below 4 billion rows (assuming that you’re using an INT datatype), or you’ve found out that the steady state size is that big after the fact.

On the other hand, I don’t like having a caller define whether they want positive or negative values, as that now imbues meaning to the surrogate key, where positive keys mean one thing and negative keys mean another.

Comments closed

Alternatives to GREATEST and LEAST in SQL Server 2022

Drupal Grupal gives us alternatives:

If you haven’t already heard, SQL 2022 has introduced a new built-in system function called GREATEST. Simply put, it is to a set of columns, variables, expressions etc. what the MAX function is to a set of values (i.e., rows) of a single column or expression. The opposite of GREATEST function is LEAST function which returns the smallest value from the supplied list of columns, variables, constants etc. GREATEST & LEAST can be considered a pair, just as MIN/MAX functions are.

In other situations, GREATEST() and LEAST() are known as ARGMAX() and ARGMIN(), respectively.

Drupal shows us two alternatives to the built-in function and includes a performance comparison.

Comments closed

TINYINT Casts in Spark SQL vs T-SQL

Bill Fellows runs into an interesting oddity:

Yet another thing that has bitten me working in SparkSQL in Databricks—this time it’s data types.

In SQL Server, a tinyint ranges from 0 to 255 but both of them allow for 256 total values. If you attempt to cast a value that doesn’t fit in that range, you’re going to raise an error.

SQL Server’s TINYINT data type is an unsigned one-byte number, whereas TINYINT in Spark SQL is a signed one-byte number. But that’s not the biggest difference Bill finds, so check out the post to learn more.

Comments closed

String Aggregation with STRING_AGG

Jeff Mlakar uses the STRING_AGG() function:

Often when generating dynamic sql we must add some separator values in the string(s) we build. A function delivered way back in 2017 called STRING_AGG is a simple way to do this. Let’s look at some examples.

There are a lot of people who have never heard of the function and don’t know that there’s a better alternative to the combination of STUFF() and FOR XML PATH to concatenate together strings.

Comments closed

Executing Transactions in PostgreSQL

Salman Ahmed rolls it back:

Transactions, like any other database, are a key component of PostgreSQL. A transaction is a sequence of one or more database operations that are executed as a single unit of work. These operations can be queries (e.g. SELECT, INSERT, UPDATE and DELETE) that modify the database’s state.

A transaction’s main purpose is to combine multiple statements into an atomic, all-or-nothing process. It ensures that either all operations within a transaction are fully completed, or none of them are executed at all. Concurrent transactions cannot see each other’s unfinished changes. Updates from ongoing transactions remain hidden until completion, at which point all changes become visible simultaneously.

This is very similar to SQL Server, except their savepoints actually work they way they’re supposed to.

Comments closed

Using IS DISTINCT FROM in SQL Server 2022

Chad Callihan is distinguished:

One feature introduced with SQL Server 2022 that I’ve recently been playing around with is IS [NOT] DISTINCT FROM. This new feature can help when it comes to dealing with NULL value comparisons.

Read on for examples. Do note that x IS NOT DISTINCT FROM y does not provide a performance benefit over its equivalent of x=y OR (x IS NULL AND y IS NULL).

Comments closed