8.3 CREATE TYPE AS ENUM: Defining Ordered Label Sets
Right, so you’ve outgrown booleans. TRUE and FALSE are great until you need MAYBE, SOON, or WHY_IS_THE_DATABASE_LIKE_THIS. This is where enumerated types, or ENUMs, come in. Think of them as a way to create your own, custom-ordered set of labels that are far more meaningful and constrained than just dumping text into a VARCHAR field. They’re perfect for things like statuses (('pending', 'processing', 'completed', 'failed')), user roles, or any finite list of states where you want the database itself to enforce what’s allowed.
The syntax is beautifully simple. You’re not altering a table; you’re creating a whole new type that can be used across your database.
CREATE TYPE order_status AS ENUM (
'pending_payment',
'payment_received',
'shipped',
'delivered',
'lost_in_transit'
);
Boom. You’ve just created a new type called order_status. Now you can use it in a table definition just like any built-in type.
CREATE TABLE orders (
id serial PRIMARY KEY,
product_id integer NOT NULL,
-- Look here, using our new type
status order_status NOT NULL DEFAULT 'pending_payment'
);
The Immediate Advantages: Why Bother?
Why go through the trouble? Three killer reasons: integrity, performance, and clarity.
- Data Integrity: This is the big one. Try to insert
'being_lazy'into thestatuscolumn. Go on, try it. The database will slap your hand with an error. This prevents the inevitable typos ('shiped') and invalid states that plague free-text columns. Your data is clean, and it stays clean. - Performance: Behind the scenes,
ENUMvalues are stored as integers (think 1, 2, 3, 4, 5), but they’re displayed and entered as their text labels. This makes them incredibly space-efficient compared toVARCHAR, especially for long labels, and comparisons are blazing fast. The query planner loves them. - Clarity & Documentation: The type itself is its own documentation. Anyone (or any application) looking at the schema instantly knows the exact set of valid values. It’s self-documenting and removes all ambiguity.
The Hidden Ordering (And Why It Matters)
Here’s something that surprises people: ENUM types are ordered. The order you list the labels in the CREATE TYPE command is their canonical order. This isn’t just alphabetical; it’s a real, definable order. This becomes critically important for comparison operations.
-- This will return 'delivered' because it comes AFTER 'shipped' in the enum definition.
SELECT GREATEST('shipped'::order_status, 'delivered'::order_status);
-- This returns TRUE. 'payment_received' is considered greater than 'pending_payment'.
SELECT 'payment_received'::order_status > 'pending_payment'::order_status;
This is fantastic for writing clean, logical queries. Want all orders that are at or beyond the ‘shipped’ state? Easy.
SELECT * FROM orders WHERE status >= 'shipped'::order_status;
This is where the real power lies. You’re not just storing labels; you’re storing states with inherent meaning and order.
The Rough Edges and Questionable Choices
The PostgreSQL developers are brilliant, but their implementation of ENUMs has a few… quirks. Let’s call them “character-building exercises.”
- Altering is a Pain: You cannot
DROPa value from anENUMif it’s being used in a table (for obvious safety reasons). You also can’t easily reorder values. To add a new value, you useALTER TYPE ... ADD VALUE, but that’s about it. The sanctioned way to make bigger changes is to create a new type, alter your table to use the new type, and then drop the old one. It’s a multi-step process that requires a certain amount of grit. Plan yourENUMvalues carefully upfront. - The
NULLDilemma: AnENUMcolumn can beNULLunless you explicitly mark it asNOT NULL. This is sometimes what you want (status unknown), but often it’s a trap. If your business logic doesn’t account for aNULLstatus, you’re in for a bad time. I almost always define myENUMcolumns asNOT NULLand create an explicit label like'unknown'or'not_set'if I need to represent that state. It’s cleaner and avoids a whole class of three-valued logic bugs. - Application Syncing: If your application code has a hardcoded list of values (e.g., in a class), you must keep that list in perfect sync with the database’s
ENUMtype. If you add'returned'to the databaseENUMbut forget to update your app, the app will start throwing errors on insert. Some ORMs can help reflect this, but it’s a point of friction you need to manage.
Best Practices from the Trenches
- Use
NOT NULL: As mentioned, do it. Embrace the constraint. It saves you from yourself. - Plan for Change: Think hard about your value list before you create it. Assume you’ll need to add a value later (
ALTER TYPE ... ADD VALUE ... BEFORE/AFTER ...is your friend here), but assume that removing or reordering will be a painful, offline operation. - Naming is Key: Use clear, unambiguous names.
order_statusis good.osis bad. The values themselves should be self-explanatory.'completed'is better than'done'. - Consider the Alternative (
CHECKConstraints): For a very small, static set of values that will never, ever change or need ordering, aCHECKconstraint on aTEXTcolumn can be simpler.CHECK (status IN ('active', 'inactive'))is perfectly valid. But the moment you need more than two or three values, or you care about their order, theENUM’s advantages in performance and expressiveness quickly win out.