Category: Constraints

Adding Foreign Keys and Deadlocks

Michael J. Swart explains a challenge in adding a foreign key to an existing table:

Schema modification locks (SCH-M) are taken by DDL (Data Definition Language) statements like CREATE/ALTER/DROP.
Schema stability locks (SCH-S) are taken by DML (Data Manipulation Language) statements like INSERT/UPDATE/DELETE.

Those two types of locks are incompatible. Meaning, I can’t get a SCH-S lock on some table if you’ve already got a SCH-M lock on it (and vice versa).
Paul Randal describes the SCH-M lock as a super-table-X lock. It makes sense to me, if I’m half way through querying a table, I don’t want its definition to change.

Such a pessimistic lock can be awkward for a busy system. The SCH-M can cause a lot of blocking. For example, creating (and dropping) foreign keys requires a SCH-M lock not only on the parent table, but also on the referenced table which leads to trouble.

Click through for a demonstration of the problem. Michael also has some guidance on how to minimize the issue. I’d note the degenerate form of this guidance: understand your data model up-front and apply foreign key constraints at table creation time. That’s not always possible, sure, so when you can’t do that, Michael has some good advice.

Enforcing Constraints across Postgres Partitions

Shaun Thomas explains a rule:

Postgres table partitioning is one of those features that feels like a superpower right up until it isn’t. Just define a partition key, carve up data into manageable chunks, and everything hums along beautifully. And what’s not to love? Partition pruning in query plans, smaller tables, faster maintenance, easy archiving of old data; it’s a smorgasbord of convenience.

Then you try to enforce a unique constraint without including the partition key, and Postgres behaves as if you just asked it to divide by zero. Well… about that.

Click through for an explanation, some workarounds that might work in specific circumstances, and a few closing remarks.
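
As a minimal PostgreSQL sketch of the rule Shaun describes, consider a hypothetical events table partitioned by date. The table, column, and constraint names here are assumptions made for illustration and do not come from his post.

```sql
-- Hypothetical table, range-partitioned on created_at.
CREATE TABLE events (
    event_id   bigint      NOT NULL,
    created_at timestamptz NOT NULL,
    payload    text
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2025 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- Rejected: the unique constraint omits the partition key column.
ALTER TABLE events
    ADD CONSTRAINT events_event_id_key UNIQUE (event_id);

-- Accepted: the partition key is part of the constraint.
ALTER TABLE events
    ADD CONSTRAINT events_event_id_created_at_key UNIQUE (event_id, created_at);
```

Note that the accepted constraint is weaker than the rejected one: it still allows the same event_id to appear with two different created_at values, which is why workarounds are needed at all.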

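As for SQL Server, a similar rule applies. If you want a unique index (which is what a unique key constraint uses under the covers) that is aligned with the partition scheme, its key must include the partitioning column. If you leave it out, SQL Server raises an error rather than adding the column for you; the silent addition of the partitioning column happens only for non-unique indexes. The alternative is to build the unique index non-aligned, at the cost of losing partition switching on that table.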

Word Order and Constraint Naming

Andy Levy is looking for a name:

Ten years (and a couple jobs) ago, I wrote about naming default constraints to avoid having SQL Server name them for you. I closed with the following statement:

SQL Server needs a name for the constraint regardless; it’s worth specifying it yourself.

We’re back with a new wrinkle in the story.

Read on for an interesting scenario where Andy very clearly named a constraint, yet the name didn’t take.

Constraints in PostgreSQL

Gulcin Yildirim Jelinek digs into how PostgreSQL handles database constraints:

Constraints give you fine-grained control over data integrity and if any inserted or default value violates them, PostgreSQL raises an error.

In short, constraints are rules enforced by the database to keep your data valid and consistent. When constraints are not enforced, data issues start to leak in and eventually turn into bugs. Spending time understanding constraints helps prevent subtle data bugs later on.

Read on for information on constraint types in Postgres (including exclusion constraints), as well as triggers and domains. Exclusion constraints are new to me, but apparently allow for things like preventing timeframe overlaps, so that’s pretty useful.
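
To make the timeframe-overlap case concrete, here is a small PostgreSQL sketch of an exclusion constraint. The room_bookings table and its columns are hypothetical, and the example assumes the btree_gist extension is available so that the GiST index can handle plain equality on room_id.

```sql
-- btree_gist lets a GiST index compare room_id with the = operator.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Hypothetical bookings table.
CREATE TABLE room_bookings (
    room_id integer   NOT NULL,
    during  tstzrange NOT NULL,
    -- No two rows may have the same room_id and overlapping ranges.
    EXCLUDE USING gist (room_id WITH =, during WITH &&)
);

INSERT INTO room_bookings (room_id, during)
VALUES (1, '[2025-01-01 09:00+00, 2025-01-01 10:00+00)');

-- Rejected: overlaps the existing booking for room 1.
INSERT INTO room_bookings (room_id, during)
VALUES (1, '[2025-01-01 09:30+00, 2025-01-01 11:00+00)');

-- Accepted: same time window, different room.
INSERT INTO room_bookings (room_id, during)
VALUES (2, '[2025-01-01 09:30+00, 2025-01-01 11:00+00)');
```

Because the ranges are half-open, a booking that starts exactly when the previous one ends does not count as an overlap.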

Default Constraints and User-Defined Functions

Erik Darling has a new video. Erik shows how SQL Server handles default constraints that use user-defined functions and how this behaves under a variety of circumstances. There’s also a dive into parallelism and constraints. We also learn about Erik’s ability to perform fractional math and how he actually differentiates “scalar” from “scaler,” proving once again, through his use of extraneous vowel sounds, that he is not Midwestern.

Building State Transitions as SQL Constraints

Joe Celko makes a change:

About two decades ago, I introduced the concept of transition constraints to show Data Validation in a database is a lot more complex than seeing if a string parameter really is an integer. In October of 2008, I did an article called Constraint Yourself! on how to use DDL constraints to assure data integrity. One of the topics in that piece was a look at state transition constraints via an auxiliary table.

Read on for an interesting dive into the topic.
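
For readers who have not seen the auxiliary-table approach before, the general pattern can be sketched as follows. This is an illustration of the idea under assumed names (order_state_changes, orders, and the state values are all hypothetical), not the code from Joe’s article.

```sql
-- Auxiliary table: each row is one permitted (previous, current) pair.
CREATE TABLE order_state_changes (
    previous_state varchar(10) NOT NULL,
    current_state  varchar(10) NOT NULL,
    PRIMARY KEY (previous_state, current_state)
);

INSERT INTO order_state_changes (previous_state, current_state) VALUES
    ('new',  'new'),        -- starting state for a fresh order
    ('new',  'paid'),
    ('new',  'cancelled'),
    ('paid', 'shipped'),
    ('paid', 'cancelled');

-- The composite foreign key only admits listed pairs.
CREATE TABLE orders (
    order_id       integer     PRIMARY KEY,
    previous_state varchar(10) NOT NULL DEFAULT 'new',
    current_state  varchar(10) NOT NULL DEFAULT 'new',
    FOREIGN KEY (previous_state, current_state)
        REFERENCES order_state_changes (previous_state, current_state)
);

INSERT INTO orders (order_id) VALUES (1);   -- stored as ('new', 'new')

-- Accepted: ('new', 'paid') is a listed pair.
UPDATE orders
SET previous_state = current_state, current_state = 'paid'
WHERE order_id = 1;

-- Rejected: ('paid', 'new') is not a listed pair.
UPDATE orders
SET previous_state = current_state, current_state = 'new'
WHERE order_id = 1;
```

The foreign key validates the pair but does not by itself force callers to copy the old current_state into previous_state, so in this sketch that discipline lives in the UPDATE statement.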

Domains in ANSI SQL

Joe Celko describes a domain:

For example, if there though is that there is a domain called voltage which has a base unit called “volt” that’s otherwise meaningless. Yes, you can get a voltmeter you can watch the needle, you can be told what the IEEE specification for defining how much work a volt should do or shock you. I’ve discussed scales and types of measurements in a previous article, It’s worth mentioning that you should not confuse domain with the representation and symbols of the units being used. Some domains are limited, such as degrees that measure planar angles. An angle can be from 0 to 360°, or it can be between zero and 2π radians.

Joe has an explanation but doesn’t have any concrete examples in PostgreSQL. Here’s one from the PostgreSQL documentation:

CREATE DOMAIN us_postal_code AS TEXT
CHECK(
   VALUE ~ '^\d{5}$'
OR VALUE ~ '^\d{5}-\d{4}$'
);

The idea of a domain here is that you define a valid slice of some data type. We can do something similar with check constraints on an attribute, but the difference is that we’d need to create the check constraint for each relevant attribute, whereas the domain would include this check automatically, making it quite useful if we have multiple instances of, say, us_postal_code in our database. Then, we wouldn’t need to worry about creating a check constraint on each instance and ensuring that the code remains the same across the board.
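
As a short sketch of that reuse, a hypothetical customer_addresses table can declare two columns with the domain type and get the check on both without restating it. The table and column names are assumptions for illustration, and the block assumes the us_postal_code domain defined above already exists.

```sql
-- Assumes the us_postal_code domain shown above has been created.
CREATE TABLE customer_addresses (
    address_id   integer        PRIMARY KEY,
    billing_zip  us_postal_code NOT NULL,
    shipping_zip us_postal_code            -- NULL is allowed here
);

-- Accepted: both values match one of the domain's two patterns.
INSERT INTO customer_addresses (address_id, billing_zip, shipping_zip)
VALUES (1, '90210', '90210-1234');

-- Rejected: '9021' has only four digits, so the domain check fails.
INSERT INTO customer_addresses (address_id, billing_zip, shipping_zip)
VALUES (2, '9021', NULL);
```

If the validation rule ever changes, it changes in one place, the domain, rather than in a check constraint on every column.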

This also leads to a very common sentiment in functional programming: make invalid states unrepresentable. In other words, make it impossible for a person or piece of code to generate a result in an invalid state. By defining a domain with the scope of our valid state, we make it impossible for someone to create a US postal code value that does not pass our check, and so we can’t have dirty data of this sort in our database.

Truncating All Tables while Preserving Foreign Keys in T-SQL

Ronald Kraijesteijn builds a script:

When testing a data warehouse, a common challenge is managing large datasets effectively. Often, you need to reset tables to a clean state, ensuring consistent testing environments. The most efficient way to clear a table is using the SQL command TRUNCATE TABLE. However, this command is not straightforward when foreign key constraints are present. In this article, we’ll explore a solution that temporarily disables constraints, allows truncation, and then restores the constraints—keeping your data model intact.

Click through for the script, which saves a record of all of the foreign key constraints, truncates each table, and then re-creates the foreign keys.

Thoughts on Primary and Foreign Key Constraints

Rob Farley lays out an argument:

I am NOT suggesting that data integrity is irrelevant. Not at all. But how often do we need an enforced primary key or foreign key?

Be warned – I’m not actually going to come to a conclusion in this post. I’m going to make cases for both sides, and let you choose where you stand. The T-SQL Tuesday topic this month is Integrity, so some other people might have written on a similar topic, and produce even more compelling arguments one way or another. I’m the host this time, so you’ll be able to find the round-up (once it’s there) here on the LobsterPot Solutions site too.

I will come to a conclusion: OLTP systems need primary and foreign key constraints to work properly. In the post, Rob asks when I last saw a key violation error in my application. The good(?) news is that I have plenty of them in the last application I built on SQL Server, because I rely on a source system that dumps data without checking whether those records already exist. That means I can’t simply perform an inner join from my table to the source table, because I could get multiple records back. Instead, I need to use a common table expression or the APPLY operator to retrieve the max values from the flotsam and jetsam, which makes my code harder to follow and slower as a result.
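
To show the kind of extra work I mean, here is a T-SQL sketch with hypothetical tables: source_customers stands in for the keyless source and invoices for my own table. It uses ROW_NUMBER() in a common table expression to keep the latest row per customer, which is one way of doing the "pick one row out of the duplicates" step rather than the exact code from my application.

```sql
-- Hypothetical source table with no key, so duplicates can pile up.
CREATE TABLE source_customers (
    customer_id   int          NOT NULL,
    customer_name varchar(100) NOT NULL,
    loaded_at     datetime2    NOT NULL
);

-- Hypothetical table on my side of the fence.
CREATE TABLE invoices (
    invoice_id  int NOT NULL PRIMARY KEY,
    customer_id int NOT NULL
);

-- Number the rows for each customer, newest first, and keep only row 1.
WITH latest_customer AS (
    SELECT customer_id,
           customer_name,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY loaded_at DESC) AS rn
    FROM source_customers
)
SELECT i.invoice_id, lc.customer_name
FROM invoices AS i
INNER JOIN latest_customer AS lc
    ON lc.customer_id = i.customer_id
   AND lc.rn = 1;
```

With an enforced key on source_customers (customer_id), the whole CTE would collapse into a plain inner join.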

Distributed warehousing systems don’t have enforceable keys because of the technical challenge of enforcing keys without having different nodes talk to each other. But these systems also assume that you’ve pre-validated all of the data (like in a Kimball model), that you don’t care about duplicate records or messiness, or that you’ll fix the problem somewhere downstream. In the case of Microsoft Fabric, that fix is typically necessary by the time you put the data into a semantic model, as semantic models really don’t like duplicate records, which tend to mess up relationships between tables.
