22.5 Hash Join, Nested Loop, and Merge Join: When Each Is Chosen

Right, let’s get into the real meat of the matter: how your database decides to smoosh two tables together. Forget the textbook definitions for a second. When the planner picks a join algorithm, it’s making a high-stakes bet based on the size of your data, available indexes, and memory. Get this wrong, and a query that should be milliseconds can become a coffee-break-length disaster. I’m here to make sure you can read that betting slip.

The Humble (and Sometimes Villainous) Nested Loop

Think of a nested loop join as the database’s version of a brute-force search. For every single row in the outer table (the one on the left in your FROM clause), it does a full scan of the inner table (the one on the right) to find matching rows. Yes, it’s as inefficient as it sounds, which is why its performance is O(n*m). Don’t let that scare you—it’s actually the MVP in two very specific scenarios.

First, if you’re joining a tiny table to another table. I’m talking about a lookup table with five rows. Even if the inner table is huge, five loops is nothing. Second, and this is the big one, if there’s an index on the inner table’s join column. The inner loop isn’t a scan anymore; it’s a super-fast index lookup. The planner is smart enough to see that index and say, “Yeah, I can do this quickly.”

Check out this classic “good Nested Loop” scenario. We’ll get a plan where the inner table is accessed via an Index Scan.

EXPLAIN ANALYZE
SELECT u.username, p.content
FROM users u
INNER JOIN posts p ON u.id = p.user_id
WHERE u.id = 123;

You’ll likely see a plan like this (I’m paraphrasing the output for clarity):

Nested Loop (cost=... actual time=...)
  ->  Index Scan using users_pkey on users u (...
        Index Cond: (id = 123)
  ->  Index Scan using posts_user_id_idx on posts p (...
        Index Cond: (user_id = 123)

Perfect. It found the user instantly via the primary key, then used the index on posts.user_id to find that user’s posts just as fast. This is a nested loop working as intended. But take that index away on posts.user_id, and this same query would morph into a catastrophe, forcing a sequential scan of the entire posts table for that one user. This is the most common pitfall with nested loops: a missing index on the inner table’s join key turning a good plan into a terrible one.

The Workhorse: Hash Join

When you’re joining two large tables and there are no indexes to save you, the Hash Join is your savior. It’s the planner’s go-to for big, ad-hoc analytics queries. Here’s the two-step magic trick: 1) It scans the smaller of the two tables and builds an in-memory hash table, where the key is the join column. 2) It then scans the larger table, hashes each row’s join column, and probes the hash table for matches.

Its beauty is its O(n + m) complexity. It does one pass to build the hash table and one pass to probe it. The catch? It needs enough work_mem to fit that entire hash table in RAM. If your hash table is too big, it gets dumped to disk (temporary files), and performance falls off a cliff. You’ll see the dreaded Buffers: temp written=... in your EXPLAIN ANALYZE output, which is your cue to either increase work_mem or question your life choices.

SET work_mem = '100MB'; -- Let's give it some breathing room

EXPLAIN ANALYZE
SELECT o.order_total, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.order_date > CURRENT_DATE - INTERVAL '30 days';

The planner will pick a Hash Join here because it’s likely joining a large chunk of the orders table to the customers table. It’ll choose the smaller of the two (probably the filtered orders) to build the hash. Best practice? Always make sure work_mem is set high enough to avoid those temp files for your common workload. It’s a classic tuning knob.

The Precision Instrument: Merge Join

The Merge Join is the elegant, sophisticated cousin. It requires both input sets to be sorted on the join key. It then does a single pass through each sorted list, like a zipper closing, matching rows as it goes. This is incredibly efficient, O(n + m), and doesn’t require any work_mem for a hash table. So why isn’t it used all the time?

Because the cost of sorting is enormous. The planner will only choose a Merge Join in two cases: 1) You already have indexes on the join keys for both tables, which means the data is already sorted and ready to go. Or 2) It needed to sort the data anyway for an ORDER BY clause on the join key. It’s a classic “well, since we’re already sorting…” situation.

It’s the preferred method for large, sorted datasets. You’ll often see it when joining on primary keys.

EXPLAIN ANALYZE
SELECT *
FROM large_sorted_table_a a
INNER JOIN large_sorted_table_b b ON a.key = b.key;

If both a.key and b.key have indexes, the planner can skip the sorting phase and just merge the two pre-sorted streams. If you see a Merge Join in a plan where you weren’t expecting it, look for a sort operation (Sort) upstream—that’s your performance bottleneck. The takeaway? A Merge Join is fast, but the sorts it might require can be a hidden tax.