Press "Enter" to skip to content

Category: Syntax

Calculating Weighted Averages in SQL

Lukas Eder shows how you can calculate weighted averages using SQL:

As can be seen, this schema is slightly denormalised as the number of lines per transaction are precalculated in the transactions.lines column. This will turn out to be quite useful for this calculation, but it isn’t strictly necessary.

Now, in the previously linked Stack Overflow question, a report was desired that would calculate:
– An aggregation of sums as provided by the line items
– An aggregation of averages as provided by the transactions

As Lukas points out, doing this in two queries is easy, but doing it in one is sublime.

Comments closed

Performance Testing Simple Scalar UDF Functions

Wayne Sheffield tries out a simple scalar UDF in SQL Server 2019 to see how it performs:

I recently published a post detailing the new Scalar UDF Inlining feature in SQL 2019 here. That post introduced the new feature in a way that I used to compare performance to the other function types, continuing the performance evaluation of functions that I had previously posted here and here. In the Scalar UDF Inlining post, I used a function to strip all non-numeral values from a string, and to return the result. This used the FOR XML output option.

In thinking about how scalar functions are commonly used, I’ve decided to revisit this feature with a simpler function. I will still compare it to all the other types of functions to see how Scalar UDF Inlining compares to the others.

Wayne’s results are music to the product team’s ears, I’m sure.

Comments closed

Comparing STRING_AGG Performance Against FOR XML PATH

Jeffry Schwartz compares the performance of STRING_AGG in SQL Server 2017 versus the tried and true method of FOR XML PATH for string concatenation:

Two different T-SQL methods were used for testing.  The FOR XML PATH solution incorporates the reverse and stuff functions along with FOR XML PATH.  The String_agg solution is much simpler, but is only available on SQL Server 2017 currently.

The FOR XML PATH method has been documented thoroughly over the years in many articles, so these explanations will not be repeated here.  However, two points are worth mentioning.  The first one is that the @strSeparator (in this case ‘ or  ’) string will reside at the end of the constructed string (in this case @strFilter).  Since the last occurrence of @strSeparator must be removed to prevent a syntax error, characters must be dropped from the end of the string.  Although it may seem obvious that four characters (the length of @strSeparator) should be removed, actually only three need be removed because the end of the constructed string is “r” not “ “.  Therefore, the length argument of the left function specification is the length of @strFilter minus three.

I like STRING_AGG a lot, primarily because I can remember it and I can explain it to people relatively new to SQL Server. Read on to see how they both compare in terms of performance.

Comments closed

Benefits of Inline Indexes

Phil Factor covers a topic fairly close to my heart:

Usually, the added features of the CREATE TABLE syntax in new releases of SQL Server are esoteric, and unless you are dealing with memory-optimized tables or other esoteric stuff, they aren’t of great interest. However, the Inline INDEX for both a table and column index has just crept in quietly with SQL Server 2014 (12.x). This was interesting because the SQL Server team back-fitted it to all tables rather than just in-memory OLTP tables for which it was, at the time, found necessary. The new syntax was introduced which allows you to create certain index types inline with the table definition. These could be at column level, concerning just that column, or at the table level, with indexes containing several columns.

Why interesting? This affects multi-statement table functions, user-defined table types, table-valued parameters as well as table variables. It was considered a game-change for table variables because, for a start, it allowed non-unique indexes or explicit clustered indexes to be declared on columns for the first time because you can create indexes on table variables as part of the table definition. Of more significance were the table-level indexes that allowed you to specify multi-column indexes. Previous releases had allowed multi-column primary or unique constraints, but not explicitly named indexes. You still cannot declare an index after the table is created, which is a shame as there are good reasons for being able to do so after a table is stocked with data. Any sort of large import of data into a table that is over-indexed or prematurely-indexed is doomed to crawl rather than to run. I’ll show this later on in this article.

Click through for an analysis of inline indexes themselves as well as how they fit on table variables—something I tend not to do much.

Comments closed

UNCOMPRESS Isn’t DECOMPRESS

Solomon Rutzky strives to solve the question, “What is the UNCOMPRESS function anyhow?”:

With no clear indications of what the UNCOMPRESS function does, we can at least pass in some simple values to see what comes back, and see if we can make sense of the output. For the following tests, please keep in mind that “8-bit” refers to the VARCHARCHAR, and TEXT (deprecated) datatypes. And, “16-bit” refers to the NVARCHARNCHARNTEXT (deprecated),and XML datatypes.

Read on as Solomon figures out what it does and how non-useful it is for anybody nowadays.

Comments closed

ISNUMERIC And Unexpected Results

Jen Stirrup explains why ISNUMERIC isn’t all that great:

I noted that one of the columns failed to convert VARCHAR to DECIMAL.

The error message is below, and it’s usually fairly easy to sort:
Error converting data type varchar to numeric

Normally, I’d use ISNUMERIC to identify the rows that fail to have a value in that column that could be converted to a number. Then, I could identify the value, and then I could replace or exclude it, as required.

However, on this occasion, using ISNUMERIC failed to identify any columns as being non-numeric. 

Click through to see why Jen got this result.

Comments closed

Reference Column Names

Jon Shaulis shows why you want to reference tables when including column names in queries:

If you don’t read the rest of this setup, I want you to take away one thing. 

Always reference your tables with your columns when more than one table is involved in the query!

This post is made primarily with SQL Server in mind, but this behavior is actually ANSI SQL and can be replicated in PostgreSQL, MySQL, and Oracle.

Jon’s example is a case where perfectly valid ANSI SQL logic (which is why you can replicate this across platforms and why it’s not a bug) leads to an unexpected result.

Comments closed

Finding The Last Non-Null Value With Snowflake

Koen Verbeeck shows how two words makes solving a problem with Snowflake a lot easier than with SQL Server:

Sometimes you need to find the previous value in a column. Easy enough, the LAG window function makes this a breeze (available since SQL Server 2012). But what if the previous value cannot be null? You can pass a default, but we actually need the previous value that was not null, even if it is a few rows back. This makes it a bit harder. T-SQL guru Itzik Ben-Gan has written about the solution to this problem: The Last non NULL Puzzle. It’s a bit of tricky solution. 

Click through for the magic words and if you’re on the SQL Server side, upvote this issue to get that functionality in SQL Server too.

Comments closed