Alright, let’s talk about the trio of window functions that are all about reaching into your result set and plucking out a specific value from another row. FIRST_VALUE, LAST_VALUE, and NTH_VALUE are the Swiss Army knives for those “I need the value from the first, last, or nth row in this partition” moments. They seem deceptively simple, but they have a few quirks that will absolutely bite you if you’re not careful.

The Straightforward One: FIRST_VALUE

FIRST_VALUE is the reliable friend in this group. It does exactly what it says on the tin: it gets the first value in the window frame for a given column. Its most common use case is answering questions like, “What was this user’s first-ever order amount?” or “What was the starting salary for each employee in their department?”

The syntax is simple: FIRST_VALUE(column_name) OVER (...). Let’s see it in action.

-- Let's find the first order each customer ever made
SELECT
    customer_id,
    order_date,
    order_amount,
    FIRST_VALUE(order_amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS first_order_amount
FROM orders;

Wait, why did I include that verbose ROWS BETWEEN clause? Here’s the first major pitfall, and it’s a doozy.

The Quirky One: LAST_VALUE (and the Window Frame Trap)

If you just use the default window frame, LAST_VALUE will lie to your face. I’m not joking. This is the single biggest “gotcha” in this entire chapter.

By default, if you just write ORDER BY order_date, the window frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Read that again. For each row, the “last value” is the last value so far, not the last value in the entire partition. This means on the first row, LAST_VALUE will return the first value. On the second row, it will return the second value. It’s functionally useless in its default state.

To get the actual last value in the partition, you must explicitly expand the window frame to include the entire partition.

-- The WRONG way (default frame). This will just return the current row's value.
SELECT
    customer_id,
    order_date,
    order_amount,
    LAST_VALUE(order_amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
    ) AS bad_last_value -- Don't do this!

-- The RIGHT way (explicit frame). This gets the genuine last value.
SELECT
    customer_id,
    order_date,
    order_amount,
    LAST_VALUE(order_amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS true_last_order_amount
FROM orders;

This behavior is, frankly, a questionable design choice. It makes the default behavior of LAST_VALUE counterintuitive and almost always wrong for its intended purpose. Always, always specify the frame.

The Precise One: NTH_VALUE

NTH_VALUE is the specialist. It fetches the value from the nth row in the window frame (e.g., 2nd, 3rd, 10th). Its syntax is NTH_VALUE(column_name, n) OVER (...). And yes, it also suffers from the same window frame trap as LAST_VALUE. You’ll almost always want that UNBOUNDED PRECEDING/FOLLOWING frame.

A crucial point: if you ask for the 5th value in a partition that only has 3 rows, it will return NULL. This is a good thing! It doesn’t throw an error; it just gives you nothing. You can handle this with COALESCE or a CASE statement if you need a default.

-- Find each customer's second order amount. Handles cases where they only ordered once.
SELECT
    customer_id,
    order_date,
    order_amount,
    NTH_VALUE(order_amount, 2) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS second_order_amount,
    COALESCE(
        NTH_VALUE(order_amount, 2) OVER (
            PARTITION BY customer_id
            ORDER BY order_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ),
        0 -- Default to 0 if there was no second order
    ) AS second_order_amount_or_zero
FROM orders;

Best Practices and Performance

  1. Frame It or Forget It: Never use LAST_VALUE or NTH_VALUE without explicitly defining the window frame. Make it a habit. Your future self will thank you.
  2. NULL Handling: Be conscious that NTH_VALUE can return NULL. Plan for it unless you’re absolutely certain every partition is large enough.
  3. Ordering is Everything: The ORDER BY clause in the OVER() clause is what defines “first,” “last,” and “nth.” If you don’t specify an order, the results are non-deterministic. The database will just pick a row, and you’ll hate your life while debugging.
  4. Performance: These functions require a sort and a window scan. On huge datasets, this can be expensive. If you’re only interested in the first value for each group, a DISTINCT ON (PostgreSQL) or a correlated subquery might sometimes be faster, though often less readable. Always check the query plan if performance is critical.

These functions are incredibly powerful for creating row-level comparisons within groups. Just remember the frame trap. It’s the difference between getting the right answer and getting a subtly wrong one that makes it into a production report and causes a panic six months later. I’ve seen it happen. Don’t be that person.