Press "Enter" to skip to content

Category: Syntax

The Power Of Window Functions

Ben Richardson has an introduction to the concept of window functions:

This is a much more efficient result. In the first line of the script the id, name and gender columns are retrieved. These columns do not contain any aggregated results.

Next, for the columns that contain aggregated results, we simply specify the aggregated function, followed by the OVER clause and then within the parenthesis we specify the PARTITION BY clause followed by the name of the column that we want our results to be partitioned as shown below.

This post focuses on normal aggregates which accept windows.  Once you understand that, there’s a wide world beyond it, and you’ll quickly see how useful window functions can be.

Comments closed

Visualizing Types Of Joins

Daniel Hutmacher has a nice PDF cheat sheet showing different types of joins:

How it works: For each join example, there are two tables, the left and the right table, shown as two columns. For the sake of simplicity, these tables are called “a” and “b” respectively in the code.

You’ll notice that the sheet uses a kind of pseudo-code when it comes to table names and column names.

Click through for the PDF.  I appreciate that he did not use Venn diagrams, as those should be saved for Union/Intersect/Except.

Comments closed

The Joy Of OUTPUT

Mark Wilkinson shows off some fun stuff you can do with the OUTPUT clause:

A common command in the Linux world is the tee command. What tee allows you to do is pipe the output of a command to a file as well as the console. This same functionality can be implemented using multiple OUTPUT clauses in a T-SQL statement. In this example we are going to update a few hundred records. When the update statement is run, not only will it update the MyGuid table but it will update a log table and also return the result of the update. This is accomplished by using two OUTPUT clauses.

Check it out.  I don’t use OUTPUT that often, but it can be quite useful when in a pinch or if you want to prevent scanning a table twice.

Comments closed

Using JSON_MODIFY To Modify Existing JSON

Jovan Popovic shows off the JSON_MODIFY function in SQL Server:

Recently I found this question on stack overflow. The problem was in appending a new JSON object to the existing JSON array:

UPDATE TheTable
SET TheJSON = JSON_MODIFY(TheJSON, 'append $', N'{"id": 3, "name": "Three"}')
WHERE Condition = 1;

JSON_MODIFY function should take the array value from TheJSON column (the first argument), append the third argument into the first argument, and write the appended array back in TheJSON column.

However, the unexpected results in this case is the fact that JSON_MODIFY didn’t appended a JSON object {"id": 3, "name": "Three"}to the array. Instead, JSON_MODIFY appended a new JSON string literal  "{\"id\": 3, \"name\": \"Three\"}" to the end of the array.

This might be surprising result if you don’t know how JSON_MODIFY function works.

Read on to see how JSON_MODIFY works and why this doesn’t quite do what the poster thought.

Comments closed

Data Type Conversions In 4 Database Systems

Eleni Markou has samples for converting strings to dates, numerals, or currency in SQL Server, Postgres, Redshift, and BigQuery:

The TO_DATE function in PostgreSQL is used to converting strings into dates. Its syntax is TO_DATE(text, text) and the return type is a date.

In contrast with MS SQL Server which has strictly specified date formats, in Redshift, any format constructed using the patterns of the table found in the corresponding documentation can be correctly interpreted.

When using the TO_DATE() one has to pay attention as even if an invalid date is passed, it will convert it into a nominally valid date without raising any error.

There are a few other tricks in SQL Server for some of these (for example, on 2012 or newer, I’d use TRY_CONVERT rather than CONVERT).  That said, it’s a good overview of how to translate skills in one relational system to another.

Comments closed

Grouping Data With ROLLUP

Steve Jones shows how easy it is to use the ROLLUP function:

I was editing an article recently that talked about ROLLUP, and I wanted to play with this a bit more. I hadn’t used this much in my career, but it’s a handy operator that’s worth including in your toolkit.

ROLLUP is used to provide additional totals for your aggregates while using GROUP BY. Here’s a little example. I’ve got some data for American Football quaterbacks. In this case, I’ve extracted some stats for a few noteworthy players today.

I’ll probably end up using ROLLUP about once every three months and be very pleased when I remember that it exists.  I use GROUPING SETS more often and almost never use CUBE.

Comments closed

Persisting Computed Columns

Greg Low describes persisted computed columns:

Each time the value from that column is queried, the calculation is performed so the result can be returned. This makes sense when the value is changing regularly and the value is queried infrequently.

However, according to my completely subjective statistics, most computed columns are queried much more than they are ever changed. So why work the value out each and every time?

One really nice thing about persisted computed columns is that you can then build non-clustered indexes using these columns.  It’s a great way of pre-computing work that you need to do often but which would violate rules of database normalization.

Comments closed

Updating Data In Common Table Expressions

Kenneth Fisher shows that you can directly update a table referenced in a common table expression:

CTEs are cool things. You can essentially create one or more in-line view(s) within your query. One thing that isn’t overly well known is that you can actually update the data within the CTE. No, I don’t mean using using the UPDATE statement with a CTE but actually running the update through the CTE.

This is really powerful when combined with window functions, like only updating the first record given a particular partition.  You can also delete, which makes duplicate detection and deletion fairly straightforward.

Comments closed

Window Function Basics

Doug Kline has a new series on window functions.  First, he looks at differences between RANK, DENSE_RANK, and ROW_NUMBER:

— Quick! What’s the difference between RANK, DENSE_RANK, and ROW_NUMBER?

— in short, they are only different when there are ties…

— here’s a table that will help show the difference
— between the ranking functions

— note the [Score] column,
— it will be the basis of the ranking

Then, he starts looking at how to build a window function, starting with the OVER clause:

— here’s a simple SELECT statement from the Products table

SELECT ProductName,
UnitPrice
FROM Products
ORDER BY UnitPrice DESC

— this shows that the highest priced product is Cote de Blaye, productID 38

— but sometimes the *relative* price is more important than the actual price
— in other words, we want to know how products *rank*, based on price

Doug’s entire posts are T-SQL scripts along with embedded videos.

Comments closed