Correlated | mikePietsch.com

17.6 Derived Tables: Subqueries in the FROM Clause

Alright, let’s talk about putting a subquery in the one place you probably didn’t expect it: the FROM clause. It feels a bit wrong, like putting a sofa in the kitchen. But sometimes, the kitchen needs a sofa. This is what we call a derived table, and it’s one of the most powerful tools for wrangling complex logic without losing your mind. Think of it this way: a regular SELECT works on a table. A derived table lets your SELECT work on the result of another SELECT. You’re effectively creating a temporary, in-memory table on the fly, just for the duration of your main query. It’s a fantastic way to break a monstrous, multi-part problem into a sane, two-step process: 1) build a smaller, focused result set, and 2) query against that.

17.5 ANY and ALL: Comparing Against a Set

Alright, let’s talk about ANY and ALL. These are the operators you use when you want to compare a single value not to just one other value, but to a whole set of values from a subquery. Think of them as the SQL way of saying, “Is my value bigger than any of those?” or “Is my value smaller than all of those?” They’re incredibly powerful, but they also have a weird, clunky syntax that feels like it was designed by someone who’d just finished a very long and tedious standards committee meeting. We’ll get through it together.

17.4 IN vs EXISTS: Semantics, NULL Handling, and Performance

Right, let’s settle this. The IN vs. EXISTS debate is one of those classic database arguments that can turn a friendly lunch into a heated theological debate. It’s not about which one is universally “better”—that’s a rookie take. It’s about understanding their different DNA: how they think, how they handle the weird stuff (looking at you, NULL), and when one will run circles around the other. The core of the issue is a difference in intent. IN is fundamentally about checking a value against a list. EXISTS is about checking for the existence of any row that meets a condition. This semantic difference dictates everything that follows.

17.3 EXISTS: Testing for the Presence of Rows

Now, let’s talk about EXISTS, which is arguably the most elegant and powerful tool in the subquery arsenal. Forget about pulling actual data out of a subquery for a moment. EXISTS answers a single, beautifully simple question: “Is there at least one row in there that meets my condition?” Think of it like calling a friend to check if a party is any good. You don’t need them to list every single person, the playlist, and the brand of chips being served. You just need a yes or no: “Is it popping off?” That’s EXISTS. It returns TRUE the microsecond it finds one single row that matches, and FALSE if it scans the entire set and finds nada. This makes it brutally efficient, especially compared to operations that need to process and return entire result sets.

17.2 Correlated Subqueries: Referencing the Outer Query

Right, let’s talk about correlated subqueries. This is where the magic happens, and also where you can accidentally summon a performance demon that will devour your server’s soul. The core idea is simple, but the implications are huge. Unlike its simpler cousin, the scalar subquery, which executes once and returns a single value, a correlated subquery is a little gossip. It can’t do its job without information from the outside world—specifically, the current row being evaluated by the outer (or “main”) query. It’s executed not once, but once for every single row the outer query considers. Let that sink in. If your outer query returns 10,000 rows, your subquery runs 10,000 times. This is why we get the big bucks.

17.1 Scalar Subqueries: A Single Value in an Expression

Alright, let’s talk about scalar subqueries. This is where we start getting fancy, and frankly, where a lot of you will start writing queries that make your future self (or the poor soul who has to maintain your code) weep quietly at their desk. A scalar subquery is, in essence, a SELECT query you shove inside another query that has the decency to return exactly one value: one row, with one column. Think of it as a single, atomic piece of data you can slot right into an expression, anywhere you’d normally put a literal value like 42 or a column name like users.name. It’s the most well-behaved of the subquery family. The one that gets invited to nice functions because it won’t cause a scene.