22.2 EXPLAIN ANALYZE: Actual Rows, Loops, and Timing
Alright, let’s get our hands dirty with EXPLAIN ANALYZE. You’ve run EXPLAIN and seen the hypothetical plan, the database’s best guess at how it thinks your query will perform. EXPLAIN ANALYZE is where we move from theory to practice. It actually runs your damn query and comes back with a post-game report, showing you what really happened. This is where you separate the confident predictions from the messy, sometimes horrifying, reality.
The key difference in the output is the addition of two sets of columns: the estimated rows and the actual rows, and the actual time and loops. This is the gold. This is where you see the query planner’s crystal ball was either perfectly clear or completely fogged up.
Actual vs. Estimated Rows: The Planner’s Reckoning
Look at any node in an EXPLAIN ANALYZE plan. You’ll see something like (rows=100 loops=1) (actual rows=17 loops=1). The first part (rows=100) is the estimate from the standard EXPLAIN. The second part (actual rows=17) is the truth, measured in blood after running the query.
Why do these numbers differ? Because the planner’s statistics aren’t perfect. It’s making educated guesses based on histograms and null fractions and a bunch of other arcane data. When the actual rows is orders of magnitude larger than the estimate, the planner likely chose a bad join strategy or index scan based on this faulty intelligence. It thought it was bringing a knife to a fight, but it was actually a gunfight. Conversely, if the actual is much smaller, it might have over-allocated memory or done unnecessary work.
EXPLAIN ANALYZE
SELECT * FROM users WHERE last_login > now() - INTERVAL '1 year';
You might see a Seq Scan planned because the statistics underestimated how many of your users are actually active (a sad but common reality). The actual rows will show the true, massive number, and the execution time will be terrible. This is your cue that you need to run ANALYZE on that table to update statistics, or maybe, just maybe, add an index on last_login.
Timing and Loops: Where the Time Actually Goes
This is my favorite part. You get two timing metrics: actual time=0.042..12.517. The first number is the startup cost—the time in milliseconds to return the first row from this node. The second number is the total time to return all rows from this node. A node with a high startup cost (like a Sort) might take a while to get going, but then stream rows quickly.
Then you have loops=.... This tells you how many times this node was executed. For a top-level node in a simple query, it’s usually 1. But for nodes inside a nested loop? That’s where the fun begins.
EXPLAIN ANALYZE
SELECT u.username, p.content
FROM users u
INNER JOIN posts p ON u.id = p.user_id
WHERE u.email LIKE '%@example.com';
Imagine the plan shows a Nested Loop. The outer node (scanning users) might have (actual rows=50 loops=1). It found 50 users. For each of those 50 rows, it looped and executed the inner node (scanning posts). The inner node might show (actual rows=20 loops=50). It ran 50 times (once per user), and each time it found, on average, 20 posts. The total work done by the inner node wasn’t 20 rows; it was 50 * 20 = 1000 row fetches. This is critical for understanding the multiplicative effect of nested loops. If that inner node is a slow Index Scan, you’re going to feel those 50 loops.
The Perils of Cached Data
Here’s a classic “gotcha.” The first time you run EXPLAIN ANALYZE on a query, it has to pull data from disk. The next time you run it, the data might already be cached in memory, giving you a much faster execution time that isn’t representative of a cold start.
The solution? For true performance testing, you want to clear the cache between runs. But how? On a dedicated server, you can use the magnificent pg_prewarm extension to load the table into cache before your test, ensuring consistency. Or, you can just be aware of the issue and take the first run with a grain of salt. The actual rows and loops are still perfectly valid; it’s just the timing that gets skewed by caching.
A Word on Overhead
Remember, EXPLAIN ANALYZE runs the query. For a DELETE or UPDATE or a query that returns a million rows, this can take a very long time and actually change your database state. Be incredibly careful. I once accidentally nuked a small test table while demonstrating this because I forgot to wrap the DELETE in a transaction and roll it back. Don’t be me. For data-modifying queries, always do this:
BEGIN;
EXPLAIN ANALYZE DELETE FROM temp_table WHERE ...;
ROLLBACK; -- This cancels the DELETE!
The EXPLAIN ANALYZE will still show the plan and execution time as if the delete happened, but the ROLLBACK undoes it all. It’s the only safe way to satisfy your curiosity without also satisfying your desire to update your resume.