5.2 serial and bigserial: The Legacy Auto-Increment Types

Alright, let’s talk about serial and bigserial. You’ve probably seen these types all over the place, and you might have even used them as your go-to “auto-incrementing primary key” type. Here’s the first thing you need to know: they don’t actually exist.

Wait, what? Don’t panic. Let me explain. serial and bigserial are not genuine data types in the way integer or text are. They are what we call a “shorthand” or “convenience notation”—a bit of syntactic sugar Postgres provides to make your life easier, which is a rare and beautiful thing in a database system. Under the hood, they are just a regular integer type (integer for serial, bigint for bigserial) with a sequence and a default value slapped on. It’s a facade, but a incredibly useful one.

How the Illusion is Maintained

When you execute a CREATE TABLE statement using a serial type, Postgres performs a neat little magic trick for you. It does three things:

Creates a Sequence: It automatically creates a new sequence object, which is just a special single-row table that spits out the next number in line.
Sets the Default: It sets the DEFAULT clause for your column to pull the next value from that sequence.
Ownership (The Important Bit): It links the sequence to the column. This is crucial because it means if you drop the table or the column, the sequence gets dropped too. This prevents nasty “orphaned sequence” errors later on.

You can see this magic for yourself. Let’s run this:

CREATE TABLE my_awesome_table (
    id serial PRIMARY KEY,
    name text NOT NULL
);

Now, let’s peek behind the curtain. If you look in the information_schema or pg_catalog tables, you won’t find a serial type. You’ll find an integer column. You can also see the sequence it created:

-- See the sequence that was created
SELECT pg_get_serial_sequence('my_awesome_table', 'id');

-- The result will be something like: "public.my_awesome_table_id_seq"

-- Manually get the next value from it, just to prove it's there
SELECT nextval('public.my_awesome_table_id_seq');

The Pitfalls and “Oh Crap” Moments

This facade is mostly great, but the seams show in a few places, and you need to be aware of them or you’ll get a nasty surprise.

1. Manual INSERTs are a Trap: The most common “oops” moment. The DEFAULT keyword is what pulls the value from the sequence. If you manually specify a value for the serial column, it will use your value and not update the sequence. This is how you eventually get a primary key violation.

INSERT INTO my_awesome_table (id, name) VALUES (1000, 'I am in control!');

-- The next INSERT will try to use the sequence's next value (probably 1)
-- and will fail with a duplicate key error if value 1 already exists.
INSERT INTO my_awesome_table (name) VALUES ('I will cause a conflict');

The fix? After a manual insert, you have to manually reset the sequence to a value higher than your maximum. It’s a pain.

2. The Gap Problem: Sequences are not transaction-safe in the way you might hope. If a transaction rolls back, the value drawn from the sequence is lost forever. This is by design—to avoid becoming a major bottleneck, sequences operate outside the normal transaction rules. The upside is blinding speed. The downside is that your primary keys will have gaps. 1, 2, 3, [transaction rolls back], 4, 5… This drives some people nuts, but you just have to get over it. Your primary key’s only job is to be unique, not to be aesthetically pleasing.

Identity Columns: The Modern Heir

This is the most important part of this section. Since PostgreSQL 10, we have a better, more SQL-standard way to do this: GENERATED AS IDENTITY.

This feature makes the auto-increment behavior a first-class citizen of the table definition, not a clever hack.

CREATE TABLE my_better_table (
    id integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    name text NOT NULL
);

Why is this better?

It’s Standard SQL: Your schema knowledge becomes more portable.
It’s Safer: You can specify GENERATED ALWAYS AS IDENTITY. If you try to manually INSERT a value into this column, it will throw an error unless you explicitly OVERRIDING SYSTEM VALUE. This prevents the “manual insert” pitfall I mentioned above. It forces you to be intentional about breaking the pattern.
It’s Clearer: The behavior is defined right there in the column definition. There’s no need to know the old serial shorthand magic.

Best Practices and Recommendations

So, what should you do today?

For new projects, use IDENTITY columns. Full stop. They are the correct, modern, and safer way to implement this behavior. serial is legacy syntax at this point.
If you’re maintaining an older database, understand the serial trick. You’ll need to know how to find and manage those sequences (pg_get_serial_sequence is your friend).
Choose the right size. This is critical. Using serial (an integer) gives you a maximum of 2.1 billion values. For most tables, this is fine. For a table storing user click events or IoT sensor data? You will absolutely blow past that. Use bigserial (or BIGINT GENERATED AS IDENTITY) which gets you into the quintillions. The small space you save with a 4-byte integer vs an 8-byte bigint is not worth the apocalyptic failure of your application running out of primary key values.
Never assume a sequence is gap-free. Your application logic must not care about gaps. If you need a gap-free sequence of numbers for, say, invoice numbers, you cannot use a serial or IDENTITY column. You need a different system entirely, and you should be very afraid because that’s a hard problem.