30.3 Control Flow: IF, CASE, LOOP, WHILE, FOR, and FOREACH

Right, let’s talk about making your code do more than just fall in a straight line from top to bottom. That’s what control flow is for: it’s the steering wheel, the brakes, and the occasionally useful “oh crap, ejector seat” for your logic inside a stored procedure. Without it, you’re just executing one statement after another like a shopping list. With it, you can build actual intelligence into your database.

PL/pgSQL gives you a robust, if sometimes a bit verbose, set of tools for this. We’ll cover the big ones, but I’ll warn you now: some of these are workhorses, and some are tools you’ll reach for only when you have a very specific, weird job to do.

The Humble `IF` Statement

This is your bread and butter. It does exactly what you think it does. The syntax is a bit more… wordy than what you might be used to in, say, Python. We have IF, ELSIF (note: it’s ELSIF, not ELSEIF—a classic bit of PostgreSQL quirkiness), ELSE, and we close it with END IF;. You must close it. Forget the END IF; and the parser will throw a tantrum.

Here’s the basic structure:

IF condition THEN
    -- statements
ELSIF another_condition THEN
    -- more statements
ELSE
    -- final statements
END IF;

Why is it so verbose? Because it’s designed for clarity in what can become very complex blocks of logic. It’s harder to mess up the scoping when everything is explicitly stated. Let’s use a real example. Suppose you’re managing user statuses:

CREATE OR REPLACE FUNCTION update_user_status(user_id integer, new_points integer)
RETURNS void AS $$
BEGIN
    IF new_points > 1000 THEN
        UPDATE users SET status = 'platinum' WHERE id = user_id;
    ELSIF new_points > 500 THEN
        UPDATE users SET status = 'gold' WHERE id = user_id;
    ELSIF new_points > 100 THEN
        UPDATE users SET status = 'silver' WHERE id = user_id;
    ELSE
        UPDATE users SET status = 'bronze' WHERE id = user_id;
    END IF;
END;
$$ LANGUAGE plpgsql;

Pitfall: The most common mistake here is not understanding that the conditions are evaluated in order. The first one that is TRUE wins. So if you wrote IF new_points > 100 THEN... first, your platinum and gold users would never be found because they’d get caught by the first condition. Order matters. Start with the most specific condition first.

The `CASE` Statement (and Expression)

CASE is IF’s more structured cousin. It’s fantastic when you’re comparing the same variable or expression against a bunch of different possible values. It’s cleaner and often more efficient than a long chain of ELSIFs. There are two forms: the simple and the searched.

The simple form is like a switch statement in other languages:

CASE search_expression
    WHEN value1 THEN
        statements
    WHEN value2 THEN
        statements
    ...
    ELSE
        statements
END CASE;

The searched form is more powerful, as each WHEN clause can have a completely different condition, just like an IF/ELSIF chain:

CASE
    WHEN condition1 THEN
        statements
    WHEN condition2 THEN
        statements
    ...
    ELSE
        statements
END CASE;

Here’s the searched form in action, which is generally more useful:

CREATE OR REPLACE FUNCTION get_discount_category(order_total numeric)
RETURNS text AS $$
DECLARE
    discount_text text;
BEGIN
    CASE
        WHEN order_total > 1000 THEN
            discount_text := 'Tier 1 (10% off)';
        WHEN order_total > 500 THEN
            discount_text := 'Tier 2 (5% off)';
        WHEN order_total > 100 THEN
            discount_text := 'Tier 3 (2% off)';
        ELSE
            discount_text := 'No discount';
    END CASE;

    RETURN discount_text;
END;
$$ LANGUAGE plpgsql;

Crucial Insight: Remember, CASE inside a PL/pgSQL block is a statement. You also have access to the CASE expression (used in SELECT statements) which is slightly different and doesn’t use END CASE; but just END;. Don’t mix them up. The parser will, once again, be very unhappy.

Looping Mechanisms: `LOOP`, `WHILE`, and `FOR`

Ah, loops. Where performance goes to die, if you’re not careful. Use them judiciously. Doing heavy processing row-by-row in a loop is usually a sign you should be thinking in sets and using a single SQL statement. But sometimes, you need to loop.

The LOOP / EXIT / END LOOP construct is the most basic. It’s an infinite loop unless you explicitly EXIT. This is your “I’ll know when I’m done when I see it” loop.

CREATE OR REPLACE FUNCTION factorial(n integer)
RETURNS integer AS $$
DECLARE
    result integer := 1;
    counter integer := n;
BEGIN
    LOOP
        EXIT WHEN counter <= 1; -- This is how you break out
        result := result * counter;
        counter := counter - 1;
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql;

The WHILE loop is a bit more civilized. It checks the condition at the top of the loop.

WHILE condition LOOP
    statements
END LOOP;

Let’s rewrite that factorial function more cleanly:

CREATE OR REPLACE FUNCTION factorial_while(n integer)
RETURNS integer AS $$
DECLARE
    result integer := 1;
    counter integer := n;
BEGIN
    WHILE counter > 1 LOOP
        result := result * counter;
        counter := counter - 1;
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql;

The FOR loop is arguably the most useful. It comes in two main flavors: iterating over a range of integers and iterating over the result of a query (FOREACH is a special case for arrays, which we’ll get to next).

The integer FOR loop is straightforward:

FOR counter IN 1..10 LOOP
    -- counter will be 1, 2, 3, ... 10
    statements
END LOOP;

-- You can also reverse it
FOR counter IN REVERSE 10..1 LOOP
    -- counter will be 10, 9, 8, ... 1
    statements
END LOOP;

-- And you can specify a step with BY
FOR counter IN 1..10 BY 2 LOOP
    -- counter will be 1, 3, 5, 7, 9
    statements
END LOOP;

The `FOREACH` Loop (for Arrays)

This one is a lifesaver when you’re dealing with arrays. The IN option loops over the elements of an array. The more recent SLICE option (added in PG 9.6) is for navigating multidimensional arrays, but that’s a niche for another day.

The basic syntax is:

FOREACH element_variable IN ARRAY array_variable LOOP
    statements
END LOOP;

Example: Let’s say you pass in an array of user IDs to deactivate.

CREATE OR REPLACE FUNCTION deactivate_users(user_ids integer[])
RETURNS void AS $$
DECLARE
    user_id integer;
BEGIN
    FOREACH user_id IN ARRAY user_ids LOOP
        UPDATE users SET active = false WHERE id = user_id;
        -- Maybe log this action to another table here?
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Massive Pitfall Alert: See what I did there? I updated a single row for each ID in a loop. This is a classic anti-pattern. For a large array, this will be brutally slow due to the overhead of each individual transaction. The right way to do this is to use a single UPDATE statement with = ANY().

UPDATE users SET active = false WHERE id = ANY(user_ids);

So why does FOREACH exist? For when you need to perform procedural logic for each element, not just a simple SQL operation. Maybe for each user ID, you need to call another function, perform a complex calculation, or insert into multiple tables based on the result. Use it when the work inside the loop can’t be expressed as a single, efficient set-based SQL statement. Otherwise, you’re just building a Rube Goldberg machine to hammer a nail.

The Humble IF Statement

The CASE Statement (and Expression)

Looping Mechanisms: LOOP, WHILE, and FOR

The FOREACH Loop (for Arrays)

The Humble `IF` Statement

The `CASE` Statement (and Expression)

Looping Mechanisms: `LOOP`, `WHILE`, and `FOR`

The `FOREACH` Loop (for Arrays)