18.6 MATERIALIZED and NOT MATERIALIZED CTE Optimization Hints
Now, let’s talk about the two most misunderstood keywords in the CTE world: MATERIALIZED and NOT MATERIALIZED. You might see these in a query plan and think, “Aha! A lever I can pull to make my query go vroom!” Slow down, my friend. These aren’t magic performance switches; they’re optimization hints. And like most hints given to a query planner, they can be politely ignored if the planner thinks it knows better (which, maddeningly, it often does).
The core concept is simple: should the database physically materialize the result of this CTE into a temporary table before the main query uses it, or should it treat it more like an inline view, merging its logic into the surrounding query? MATERIALIZED says “build a temp table.” NOT MATERIALIZED says “please don’t, just inline it.” The default behavior, crucially, is not set in stone. It’s a fantastic game of “it depends on your database vendor and version.”
The Default Dance: PostgreSQL vs. The World
This is where the absurdity begins. The SQL standard is silent on this, so everyone just wings it.
In PostgreSQL, the optimizer is aggressively smart. By default, it will almost always try to inline your CTE into the main query. It’s like a master chef who sees your pre-chopped vegetables (the CTE) and says, “I appreciate the effort, but I can integrate chopping these onions directly into my mise en place for a more efficient workflow.” This is fantastic for performance… until it isn’t. Sometimes you want that CTE evaluated only once. For years, the only way to force this in PostgreSQL was a horrifyingly ugly hack: wrapping the CTE in SELECT * FROM ( ... ) to defeat the optimizer’s inlining logic. The MATERIALIZED hint is a formal, sane way to say “No, seriously, chop the onions first.”
Meanwhile, in other databases like MySQL or older versions of SQL Server, the default behavior was often to materialize CTEs automatically, treating them as a optimization fence. This could prevent a bad plan from the main query from corrupting the CTE, but it could also needlessly slow things down if the materialized result was huge.
When to Force Materialization
So, when do you grab the MATERIALIZED hint? Think of it as a tool for when you want to break up a complex query and avoid the planner making a mess of it.
You’re Reusing a Costly Calculation: If you reference the CTE multiple times in your main query and it’s expensive to compute, you want it to run once. Let’s say you’re calculating a complex aggregate that you then join on and also use in a
WHEREclause. Inlining might cause that calculation to happen multiple times.WITH expensive_cte AS MATERIALIZED ( SELECT user_id, SUM(complex_calculation(revenue, costs)) AS profit FROM gigantic_fact_table GROUP BY user_id ) SELECT a.user_id, a.profit, o.order_date FROM expensive_cte a JOIN orders o ON a.user_id = o.user_id WHERE a.profit > 1000;Here, you’re telling the planner: “Calculate the profit for everyone first, put it in a temp box, then do the join and filter. Don’t try to be clever and merge this. I know what I’m doing.”
You’re Fighting the Optimizer: Sometimes the query planner just picks a tragically bad plan when it inlines a CTE. Forcing materialization acts as an optimization fence, creating a logical break point. The planner has to finish the CTE before moving on, which can prevent it from choosing a path that would be disastrous for the overall query. It’s a blunt instrument, but sometimes you need one.
When to Force Inlining (NOT MATERIALIZED)
The NOT MATERIALIZED hint is more niche. You use it when you’re sure the default behavior would be to materialize, but you know that’s a terrible idea.
The classic example is pushing a predicate down. Imagine a default behavior that materializes everything:
WITH potential_customers AS NOT MATERIALIZED (
SELECT user_id, email
FROM users
WHERE country = 'US'
)
SELECT *
FROM potential_customers
WHERE email LIKE '%@gmail.com';
If the database materialized potential_customers, it would first create a temporary table of every single US user. Then it would scan that entire temp table for Gmail addresses. That’s horrendous. By hinting NOT MATERIALIZED, you’re begging the planner to inline the logic, turning the final query into SELECT * FROM users WHERE country = 'US' AND email LIKE '%@gmail.com', which can use indexes on country and email far more effectively.
The Cold, Hard Truth About Hints
Here’s the brutal honesty: these hints are just that—hints. The SQL standard calls them “implementation-defined.” Your database’s query planner is a towering genius with a Ph.D. in set theory and also a complete idiot. It reserves the right to look at your carefully crafted MATERIALIZED hint and say, “LOL, no,” because it has statistics that tell it your CTE will produce 3 billion rows and materializing it would grind the system to a halt. You are suggesting, not commanding.
The best practice? Don’t start with hints. Write the query clearly. See what the EXPLAIN plan gives you. If it’s pathological, then consider a hint. Use MATERIALIZED to enforce a single evaluation or to corral a misbehaving planner. Use NOT MATERIALIZED to encourage predicate pushdown and index use. But always, always check the plan again afterward to make sure your “help” was actually helpful. You’re not a passenger; you’re a co-pilot. Sometimes you have to grab the wheel.