Press "Enter" to skip to content

Category: Syntax

Domains in ANSI SQL

Joe Celko describes a domain:

For example, if there though is that there is a domain called voltage which has a base unit called “volt” that’s otherwise meaningless. Yes, you can get a voltmeter you can watch the needle, you can be told what the IEEE specification for defining how much work a volt should do or shock you. I’ve discussed scales and types of measurements in a previous article, It’s worth mentioning that you should not confuse domain with the representation and symbols of the units being used. Some domains are limited, such as degrees that measure planar angles. An angle can be from 0 to 360°, or it can be between zero and 2π radians.

Joe has an explanation but doesn’t have any concrete examples in psql. Here’s one from the PostgreSQL documentation:

CREATE DOMAIN us_postal_code AS TEXT
CHECK(
   VALUE ~ '^\d{5}$'
OR VALUE ~ '^\d{5}-\d{4}$'
);

The idea of a domain here is that you define a valid slice of some data type. We can do something similar with check constraints on an attribute, but the difference is that we’d need to create the check constraint for each relevant attribute, whereas the domain would include this check automatically, making it quite useful if we have multiple instances of, say, us_postal_code in our database. Then, we wouldn’t need to worry about creating a check constraint on each instance and ensuring that the code remains the same across the board.

This also leads to a very common sentiment in functional programming: make invalid states unrepresentable. In other words, make it impossible for a person or piece of code to generate a result in an invalid state. By defining a domain with the scope of our valid state, we make it impossible for someone to create a US postal code value that does not pass our check, and so we can’t have dirty data of this sort in our database.

Leave a Comment

COPY and \COPY in PostgreSQL

Dave Stokes runs two commands:

PostgreSQL is equivalent to a Swiss Army Knife in the database world. There are things in PostgreSQL that are very simple to use, while in another database, they take many more steps to accomplish. But sometimes, the knife has too many blades, which can cause confusion. This is one of those cases.

Read on to understand what the difference is between these two commands.

Leave a Comment

Linting SQL with SQLFluff

Josephine Bush busts out a linter:

I thought I didn’t care about linting, and lately, I haven’t written a lot of SQL, but for the SQL I do write, I have SQLFluff to help me format it. A friend of mine is big into SQLFluff and finally talked me into installing and using it. For more information about SQLFluff itself, visit here.

Josephine shows off some of the configuration for PostgreSQL’s psql as well as SQL Server’s T-SQL.

Leave a Comment

Regular Expression Matches in PostgreSQL

Tobias McNulty now has two problems:

regexp_matches() and regexp_match() are two similar string functions that support regular expression matching directly in the PostgreSQL database. regexp_matches() was added in PostgreSQL 8.3, and regexp_match() was added in PostgreSQL 10 (keep reading to see how ChatGPT struggled to answer this question).

Read on for that as well as plenty more information on how the two work, and even a bonus snippet on another regular expression function.

Leave a Comment

RIGHT OUTER JOIN and MERGE

Lukas Eder merges lanes:

It’s not unreasonable to expect these two statements to produce the same execution plan on most RDBMS, given that they’re logically equivalent. Since we’ve grown used to reading things from left to right and top to bottom, I don’t think RIGHT JOIN will become more popular any time soon.

There is, however, one place in the SQL language where RIGHT JOIN is surprisingly ubiquitous!

Read on to see how the MERGE operator includes as part of its operations the equivalent of a RIGHT OUTER JOIN. MERGE in SQL Server had justifiably built a bad reputation in its early years, but it’s surprisingly okay nowadays, with most of the bugs being fixed over time.

Leave a Comment

Performance Comparison: Tally Table vs GENERATE_SERIES()

Steve Jones performs a pair of tests:

I had someone reach out about generate_series() recently, saying they hadn’t realized this was a new feature in SQL Server 2022. They were wondering if it was better than using a tally table.

I didn’t want to do an exhaustive test, but I thought I’d take a minute and try a couple simple things just to see.

Steve used the CTE-based tally table builder, building based on cross joining spt_values. This is one of the classic approaches. The performance differences aren’t enough on their own to justify large-scale changes if you’re using a classical tally table, though it is good to see that GENERATE_SERIES() does perform well. And if you’re not familiar with the power of a tally table, here is one great explanation of the concept.

Comments closed

Random Functions in PostgreSQL 17

Leo Hsu and Regina Obe look at updates to the random() function in PostgreSQL:

Have you ever wanted to get a random integer between 1 and 10 and been a little annoyed the slightly cryptic code you had to write in PostgreSQL? PostgreSQL 17 random functions make that simpler. Sometimes it’s the small changes that bring the most joy.

Click through to see what it took to get a random integer or floating point number prior to PostgreSQL 17 and how it’s a fair bit simpler today.

Comments closed

Removing Unnecessary Delete Operations

Monica Rathbun truncates some data:

SQL Server performance issues often stem from easy fix bottle necks that can be fixed with the right tuning strategies. This short blog will focus on the DELETE statement.

Click through for the full story. One minor correction that I’d offer is that the TRUNCATE TABLE command is logged, which is why you can roll it back in a transaction. The logging process is much less intensive than with DELETE because of the deferred drop logic that Paul talks about in the link.

That aside, Monica’s key point is absolutely correct: DELETE operations tend to be quite slow, especially as the number of records you need to delete increases. This is where techniques like batching delete operations can help reduce the pain level, but if you’re deleting every (or almost every) row from a table, there’s probably a better method. Unless replication is involved in the mix—in that case, there are no good methods for anything because replication hates us.

Comments closed

Sales KPIs in T-SQL

I have a new video:

In this video, we will take a look at several useful KPIs for measuring sales. We will also demonstrate the utility of the APPLY operator, the WINDOW clause, and the APPROX_PERCENTILE_CONT() function.

I enjoyed putting this one together, because I enjoy it any time I have a chance to talk about the APPLY operator.

Comments closed