10.5 CHECK Constraints: Expressing Business Rules
Right, so you’ve got your columns defined and your primary key set. Good. But a column’s data type is a pretty blunt instrument. INTEGER can hold the value -999999 just as happily as 42. For your actual data, that’s probably nonsense. This is where CHECK constraints come in. They’re your first and best line of defense against data that, while technically the right type, is complete garbage from a business logic perspective. Think of them as bouncers for your rows; if the data doesn’t meet the criteria, it doesn’t get in. Period.
The beauty of a CHECK constraint is its simplicity: it’s just a Boolean expression. If it evaluates to TRUE (or NULL, more on that later), the insert or update is allowed. If it’s FALSE, the operation fails. You can slap these directly on a column to check only that column’s value, or define them at the table level to check conditions across multiple columns.
The Basics: Column-Level CHECK Constraints
The most straightforward use is to validate a single column. Let’s say you’re building a table for a user profile. An age column should probably be a positive number, and for a B2B SaaS product, let’s assume users must be at least 18.
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE,
age INTEGER NOT NULL CHECK (age >= 18)
);
That CHECK (age >= 18) right after the column definition is the constraint. It’s clean, it’s readable, and it’s brutally effective. Try to insert a 17-year-old and the database will stop you cold. This is infinitely better than handling this logic in your application code because it’s immutable; no buggy code or clever developer can accidentally work around it.
Table-Level CHECKs for Multi-Column Logic
Column-level checks are powerful, but their real power is unlocked when you need to validate relationships between columns. This is a table-level constraint, defined separately from any single column. The classic, slightly tired-but-instructive example is ensuring a project’s end_date is after its start_date.
CREATE TABLE projects (
project_id SERIAL PRIMARY KEY,
project_name VARCHAR(100) NOT NULL,
start_date DATE NOT NULL,
end_date DATE,
CHECK (end_date IS NULL OR end_date > start_date)
);
See the elegance there? The CHECK allows for a NULL end_date (meaning the project is ongoing) but if an end_date is provided, it must be after the start_date. This is a business rule that cannot be expressed with column constraints alone.
The Infuriating NULL Gotcha
Here’s the part where I get to complain about the designers. Remember I said a CHECK passes if the expression is TRUE or NULL? This is a consequence of SQL’s three-valued logic (TRUE, FALSE, NULL) and it is, with all due respect, a massive foot-gun.
Look at this constraint:
CHECK (age <> 0) -- "age must not be zero"
Seems solid, right? Now, what happens if you insert a row where age is NULL?
The expression NULL <> 0 evaluates not to FALSE, but to NULL. And since NULL isn’t FALSE, the constraint is considered satisfied. Your NOT NULL constraint will save you here, but if the column is nullable, you’ve got a problem.
The fix is always to be explicit about NULL handling. If a value must not be null, use NOT NULL. In your CHECK constraints, if you want to disallow a condition that might involve NULLs, use operators and functions that are strict, or explicitly handle the NULL case.
The previous projects example did this correctly: CHECK (end_date IS NULL OR end_date > start_date).
Let’s break the age example on purpose to see it fail:
CREATE TABLE bad_example (
id SERIAL PRIMARY KEY,
age INTEGER CHECK (age <> 0) -- This is a BAD constraint
);
INSERT INTO bad_example (age) VALUES (NULL); -- This will succeed. Oops.
The correct, paranoid way to write it would be:
CREATE TABLE good_example (
id SERIAL PRIMARY KEY,
age INTEGER NOT NULL CHECK (age <> 0) -- Use NOT NULL, or handle NULL in the CHECK
);
Naming Your Constraints (This is Non-Negotiable)
You just create a constraint with CHECK (age >= 18). The database will happily accept this and then assign it a glorious, automatically generated name like users_age_check. This is a terrible idea. When this constraint eventually fails (and it will), the error message will reference that awful name. Worse, if you need to drop it later with ALTER TABLE, you have to go dig that name out of the system catalogs.
Always, always name your constraints. It takes two seconds and saves you hours of frustration.
CREATE TABLE users (
user_id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE,
age INTEGER NOT NULL CONSTRAINT must_be_adult CHECK (age >= 18),
CONSTRAINT sensible_username_length CHECK (char_length(username) >= 4)
);
-- A clear, understandable error: violates constraint "must_be_adult"
Performance and Best Practices
Don’t be afraid of them. A well-written CHECK constraint is incredibly cheap. The expression is evaluated once for each row being inserted or updated. It’s a minuscule cost compared to the I/O of the write itself.
However, avoid putting slow functions in your CHECKs. Don’t do a CHECK (my_value IN (SELECT ... FROM giant_table)). That’s what foreign key constraints are for. Keep the logic simple and self-contained: use comparisons, basic math, and simple, fast functions like char_length().
Your mission is to encode every single immutable business rule you can into CHECK constraints. They make your database correct. They turn what would be a vague “the app should handle that” into a hard, unforgiving rule. This is your data’s integrity we’re talking about. Don’t be polite about it.