4.6 Composite Types: Creating and Querying Row Types
Right, so you’ve mastered the basic scalar types – integers, text, booleans, the usual suspects. They’re the solo artists. But data is rarely about lone values; it’s about relationships. A person has a first name and a last name. A point on a graph has an x-coordinate and a y-coordinate. Enter the Row Type, PostgreSQL’s brilliantly simple way to keep these related values bundled together. Think of it as a single-column table, a structured record you can pass around as a single unit. It’s the database’s way of saying, “These things belong together, so let’s treat them that way.”
Constructing a Row Type: It’s Literally Parentheses
You don’t need to pre-define a type to use one. The most common way is with a Row Constructor, which is just a fancy term for putting comma-separated values inside parentheses.
SELECT ('John', 'Doe', 42) AS person;
This returns a single column of type record. It’s anonymous and a bit useless on its own. Its real power is unlocked when you store it in a table or, more importantly, when you pair it with a predefined table structure.
Let’s get practical. Say we’re modeling a sensors table where each sensor reports a 2D coordinate.
CREATE TABLE sensors (
id SERIAL PRIMARY KEY,
location POINT, -- This is a built-in type, but let's pretend it's not for this example.
reading INTEGER
);
-- Instead, we can use a custom row type for the location
CREATE TABLE sensors (
id SERIAL PRIMARY KEY,
location ROW(x INT, y INT), -- Defining the composite type right here
reading INTEGER
);
Now, inserting data is a matter of providing the structured value.
INSERT INTO sensors (location, reading)
VALUES (ROW(10, 20), 42);
You can also just use the parentheses syntax; the ROW keyword is optional but can add clarity.
INSERT INTO sensors (location, reading)
VALUES ((15, -5), 87); -- This works exactly the same
Querying the Beast: Dot Notation is Your Key
This is where most people’s brains short-circuit for a second. You can’t just SELECT location and get something pretty. You’ll get the whole composite value in its native, brackety format (10,20). To get at the good stuff inside, you use dot notation, just like you would in most programming languages.
-- Don't do this. It's technically correct but ugly.
SELECT id, location FROM sensors;
-- Do this. Extract the fields you actually want.
SELECT id, (location).x, (location).y, reading FROM sensors;
Crucial Pitfall Alert: See those parentheses around (location)? They are NOT optional. The query parser gets deeply confused without them because it tries to interpret location.x as a column named x from a table aliased as location. You must wrap the composite column to tell PostgreSQL “evaluate this first, then get its field.”
-- This will likely throw a "missing FROM-clause entry" error
SELECT id, location.x FROM sensors;
-- This is the correct way
SELECT id, (location).x FROM sensors;
The Power of WHERE and UPDATE
This is where row types stop being a neat party trick and start being genuinely powerful. You can use the fields within a row type in your WHERE clauses and UPDATE statements.
-- Find all sensors in a specific vertical strip
SELECT * FROM sensors WHERE (location).x BETWEEN 10 AND 20;
-- Update just the y-coordinate of a sensor's location
UPDATE sensors
SET location.y = 100
WHERE id = 1;
-- You can even update the entire row construct
UPDATE sensors
SET location = ROW((location).x, 100) -- Keeps x, sets y to 100
WHERE id = 1;
Why Bother? The Case for Compositeness
You might be thinking, “Why not just have separate x_coord and y_coord columns?” Fair question. The row type gives you abstraction. You can pass a location around as a single value. Functions can accept or return a ROW(x int, y int) type. This is a stepping stone to more complex operations. For instance, you can check if two entire locations are equal: WHERE sensor1.location = sensor2.location. Doing that with two separate columns is clunky and error-prone. It enforces a logical grouping that separate columns simply don’t. The designers got this one right – it’s a clean, logical way to model intrinsic composites without the overhead of a full join. Use it.