11.5 Performance Impact: Indexes on Foreign Key Columns

Right, let’s talk about performance. You’ve dutifully added your foreign keys, patting yourself on the back for enforcing data integrity. Good for you. But if you think you’re done, you’re about to get a very unpleasant surprise the first time you try to delete a record from your parent table and it takes a geological epoch to complete. The secret sauce, the thing that makes this whole relationship work without grinding your database to a halt, is the humble index on your foreign key column.

Think of it this way: your foreign key constraint is a bouncer at a club. Every time you try to DELETE a row from the parent table (the VIP list) or UPDATE its key, the bouncer has to check the child table (the main dance floor) to make sure no one down there is still clinging to that VIP’s coattails. Without an index on the child table’s foreign key column, the bouncer has to do a full table scan—walking through every single row in the entire club, shouting “Are you still with VIP #427?” This is, to put it mildly, absurd. It’s O(n) complexity for a job that should be nearly instantaneous.

An index turns that shout into a direct lookup. It’s the bouncer’s little electronic tablet that instantly tells him exactly which rows in the child table reference the VIP he’s about to kick out. The difference in performance isn’t just noticeable; it’s the difference between a system that scales and one that collapses under its own weight.

The Two Critical Operations You’re Protecting

This isn’t just about deletes. There are two main operations that absolutely require this index to be efficient:

DELETEs or UPDATEs on the parent table: This is the classic case I just described. The database must check for referencing rows in the child table to enforce the ON DELETE rule (RESTRICT, CASCADE, etc.). Without an index, this check is a full scan.
Joins from the child table back to the parent table: While the foreign key constraint itself doesn’t directly help this query, the index you (hopefully) created for it does. A query like SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id will fly if orders.customer_id is indexed.

The Code: Doing It Right (And Seeing What Happens Without It)

Let’s make this concrete. Here’s a classic parent-child relationship.

-- The parent table
CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE
);

-- The child table WITHOUT an index initially
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    order_date TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    customer_id INTEGER NOT NULL,
    amount DECIMAL(10,2),
    CONSTRAINT fk_orders_customer
      FOREIGN KEY(customer_id)
      REFERENCES customers(id)
      ON DELETE RESTRICT
);

Now, let’s populate this with a million orders spread across 100,000 customers. If you try to delete a customer now, the database has to check every single one of those million order rows to see if any belong to this customer. Terrible.

-- This will be painfully slow without an index
DELETE FROM customers WHERE id = 12345;

The fix is simple and transformative:

-- Create the index that should have been there all along
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders(customer_id);

Run that delete again. It’s now instantaneous. The database uses the index to find any references in logarithmic time, sees there are none (or identifies them precisely if there are), and completes the operation.

The “Why Didn’t It Just Do That For Me?” Question

This is the truly baffling part, and I call it out because it’s a legitimate head-scratcher. The SQL standard does not require a database to automatically create an index on a foreign key column. The designers’ reasoning, as best I can tell, is that while an index is almost always needed for performance, it’s not required for logic. It’s a storage decision, not a integrity decision. Also, there might be cases where the index isn’t needed (e.g., a tiny, static table), so they leave the choice to you. It’s a questionable choice, in my opinion, because it’s a foot-gun waiting to go off for beginners. Consider it a rite of passage: you get bitten by this once, and you never forget it again.

Best Practices and Edge Cases

The Order of Operations: It’s almost always better to create the index before you load a massive amount of data into the child table. Building an index on an empty table is cheap; building it on a 100-million-row table under load is a great way to get paged at 3 AM.
Covering Indexes: If your common query against the child table often needs more than just the ID, consider a composite index. For example, if you frequently query SELECT status FROM orders WHERE customer_id = ?, an index on (customer_id, status) would be a “covering index” and could be even faster.
Not a Silver Bullet: An index on a foreign key column is crucial, but it’s not magic. If your foreign key column has very low cardinality (e.g., a status column with only 3 values that references a parent table), the optimizer might still choose a sequential scan because it’s just faster to read the whole table than jump around via the index. Use your judgment.