4.1 Categories of Built-in Types: Primitive, Composite, and Domain
Right, let’s talk about the building blocks. Before you can build anything clever, you need to know your bricks, mortar, and the occasional weird, custom-shaped brick that some previous architect decided was a good idea. In PostgreSQL, your data types are your bricks, and they fall into three main categories: Primitive, Composite, and Domain. Think of them as Lego bricks, pre-built Lego sets, and official Lego sets you’ve modified with a marker.
Primitive Types: Your Atomic Legos
These are the fundamental, indivisible types. The ones the database engine itself understands natively. If a type is implemented in C and the database knows how to sort it, add it, or compare it directly, it’s primitive. This is your integer, text, boolean, timestamp, uuid, and all the usual suspects.
The key thing to remember is their behavior is baked in. The rules for how an integer plus another integer works aren’t up for debate; it’s defined by the core code. This makes them fast, predictable, and your first choice for almost everything.
-- The classics. You know what these do.
SELECT 2 + 2 AS four, 'hello' || ' world' AS greeting, NOW() + INTERVAL '1 day' AS tomorrow;
But even here, PostgreSQL’s infamous (and wonderful) compulsion to be helpful creates some quirks. For example, try subtracting two timestamp values. You don’t get an integer back. You get an interval, which is a whole other primitive type designed for this exact purpose. It’s brilliant, but it can surprise you if you’re coming from a dumber database.
-- This doesn't give you a number of days. It gives you an interval.
SELECT NOW() - '2023-01-01 00:00:00' AS time_since_new_years;
Pitfall: The most common ‘gotcha’ with primitives is operator overloads. The + operator means different things for integer (addition), date (add days), and text (concatenation). If you try to add an integer to a date, PostgreSQL will politely cast the integer to an interval of days and make it work. But try to add an integer to a text field and it’ll just give up and throw an error. It’s not magic, it’s just a set of very specific, pre-defined rules.
Composite Types: Your DIY Lego Sets
This is where PostgreSQL starts to show off. A composite type is a single data structure that groups together multiple named fields, each with their own type. It’s like creating a custom row or record template.
You see them all the time without realizing it: every table you define automatically gets an associated composite type of the same name. But you can also create your own standalone ones. They’re fantastic for ensuring consistency when you’re passing around a well-defined bundle of data, especially within functions.
-- Let's define a type for a mailing address.
CREATE TYPE address AS (
street_address text,
city text,
postcode text
);
-- Now we can use it in a table. One column, 'shipping_address', holds the whole composite.
CREATE TABLE orders (
order_id serial PRIMARY KEY,
shipping_address address
);
-- Inserting data is a row constructor. Note the parentheses and single quotes.
INSERT INTO orders (shipping_address)
VALUES (ROW('123 Main St', 'Anytown', '12345'));
-- Querying it: you access fields with dot notation, wrapped in parentheses.
SELECT (shipping_address).city FROM orders;
Why you’d use it: Instead of having four separate columns for street1, street2, city, and zip scattered around your table, you bundle them into a logical unit. It makes your data model cleaner and your function signatures much more sane. The main pitfall? Not every client driver handles them gracefully. Some will map them to arrays or JSON, others might just barf. Always check your client library’s support before going all-in.
Domain Types: Legos You’ve Vetted and Labeled
This is one of PostgreSQL’s most powerful and underused features. A domain is a constraint on an existing primitive or composite type. You’re not creating a new low-level type; you’re creating a new name for an existing type with附加 rules. It’s the ultimate tool for data integrity.
Think of a percent value. It’s just an integer, right? But not just any integer. It should probably be between 0 and 100. You could just use integer and remember to add a CHECK constraint every single time you create a column that holds a percentage. Or… you could create a domain once and use it everywhere.
-- Create the domain. It's based on integer but must satisfy this constraint.
CREATE DOMAIN percent AS integer
CHECK (VALUE BETWEEN 0 AND 100);
-- Now use it. This column inherits the integer behavior BUT enforces the rule.
CREATE TABLE survey_results (
respondent_id integer,
satisfaction percent
);
-- This will work:
INSERT INTO survey_results VALUES (1, 85);
-- This will fail spectacularly and tell you exactly why:
INSERT INTO survey_results VALUES (2, 110);
-- ERROR: value for domain percent violates check constraint "percent_check"
The “Why”: Domains are the epitome of the “Don’t Repeat Yourself” (DRY) principle applied to your schema. They centralize your business logic. If you suddenly decide a percent can also be NULL, you change it in one place—the domain definition—and it propagates to every column that uses it. It’s a supercharged CHECK constraint. The only questionable choice here is that the designers didn’t make them more prominent in the documentation; they’re a secret weapon for anyone who cares about clean, maintainable data models. Use them.