Alright, let’s get our hands dirty with two of the most useful tools in the SQL toolbox: LAG and LEAD. If you’ve ever found yourself staring at a row of data and desperately wanting to peek at the row above or below it to calculate a difference, a trend, or just to see what that idiot user did next, these functions are about to become your best friends. They are the quintessential “time-travel” functions of SQL, allowing you to reference other rows in your result set relative to your current row without the soul-crushing misery of a self-join.

The concept is brilliantly simple. LAG lets you look backwards at a previous row. LEAD lets you look forwards at a next row. It’s like having a rearview mirror and a windshield for your data. You provide the column you want to look at, and optionally, how many rows to skip back or forward.

The Basic Syntax: It’s Easier Than It Looks

Here’s the blueprint. Don’t worry, it’s less intimidating than it seems.

LAG(column_name [, offset [, default_value]]) OVER (
    [PARTITION BY partition_expression]
    ORDER BY order_expression
)

LEAD(column_name [, offset [, default_value]]) OVER (
    [PARTITION BY partition_expression]
    ORDER BY order_expression
)
  • column_name: The value you want to retrieve from another row.
  • offset (optional): The number of rows to go backwards (LAG) or forwards (LEAD). Defaults to 1 if you don’t specify it. Want to look two rows back? LAG(column, 2).
  • default_value (optional): What to return if the LAG or LEAD function tries to look beyond the scope of the partition. If you don’t specify this, it will return NULL, which is usually fine but can break your math if you’re not careful (more on that later).
  • PARTITION BY: This is your “reset” button. The function will only look within the group of rows that share the same value(s) in the partition. No partitioning? It treats the whole result set as one big group.
  • ORDER BY: This is non-negotiable and absolutely critical. This defines what “previous” and “next” actually mean. You must tell the database the order of your rows. Without it, “previous” is a meaningless concept, and the database will rightly refuse to humor you.

A Concrete Example: Calculating Day-over-Day Growth

Let’s say we have a table daily_sales. Looking at raw daily numbers is okay, but what we really care about is growth. How much more (or less) did we sell today compared to yesterday? This is a classic LAG use case.

SELECT
    sale_date,
    amount,
    LAG(amount) OVER (ORDER BY sale_date) AS previous_day_amount,
    amount - LAG(amount) OVER (ORDER BY sale_date) AS daily_growth
FROM daily_sales
ORDER BY sale_date;
sale_dateamountprevious_day_amountdaily_growth
2023-10-01100NULLNULL
2023-10-0215010050
2023-10-03125150-25
2023-10-0420012575

See what happened? For the first row (2023-10-01), there is no previous row to look at, so LAG returns NULL. This is why the daily_growth is also NULL. This isn’t an error; it’s correct behavior. You can’t calculate a difference from nothing. We’ll talk about how to handle this gracefully in a second.

Partitioning: The “Reset” Button

Let’s make it more interesting. What if we have sales for multiple products? We don’t want to compare the sales of Product B yesterday to Product A today. That’s nonsense. We need to calculate the growth per product. This is where PARTITION BY saves the day.

SELECT
    product_id,
    sale_date,
    amount,
    LAG(amount) OVER (
        PARTITION BY product_id
        ORDER BY sale_date
    ) AS prev_amount
FROM sales
ORDER BY product_id, sale_date;

Now, LAG will only look at rows with the same product_id. When it hits the first sale for a new product, prev_amount will be NULL again, neatly resetting the calculation for the new product group. It’s like a little GROUP BY for your window function.

The Pitfalls and The “Gotchas”

  1. The Dreaded NULL: As we saw, the first row in a partition has no previous row, so LAG returns NULL. If you then try to do math with that NULL (e.g., amount - NULL), the result is always NULL. This can turn your beautiful trend line into a messy chart with a missing first data point. The solution? Use the default_value parameter or the COALESCE function. LAG(amount, 1, 0) would return 0 instead of NULL for the first row, so your calculation becomes amount - 0. Whether this makes sense for your use case is up to you. Sometimes NULL is the correct answer.

  2. The Absolutely Critical ORDER BY: I cannot stress this enough. The ORDER BY in your OVER() clause is what defines the sequence. Get this wrong, and your results are garbage. Ordering by an ambiguous column, or worse, forgetting the ORDER BY entirely, is a one-way ticket to incorrect-data ville. The database will sometimes let you do this, which is a design choice I find questionable. Always double-check your order.

LEAD: The Exact Same Thing, But Forward

Everything we’ve said about LAG applies to LEAD. It’s just the opposite direction. Instead of looking at the previous row, it looks at the next one. It’s perfect for calculating “next value” or looking ahead.

SELECT
    user_id,
    event_time,
    event_name,
    LEAD(event_name) OVER (
        PARTITION BY user_id
        ORDER BY event_time
    ) AS next_event
FROM user_events;

This query would show you what a user did immediately after their current action. Powerful stuff for analyzing user journeys. And yes, for the last row in the partition, LEAD will return NULL.

Why This Beats a Self-Join

You could theoretically do this with a self-join, aliasing the table to itself and joining on a.id = b.id - 1. It’s a mess. It’s inefficient. It falls apart completely once you introduce PARTITION BY. Window functions are declarative—you tell the database what you want (“the previous value”), not how to do it. The optimizer can then choose the most efficient way to execute it, which is almost always faster and cleaner than the procedural self-join nightmare. Use the right tool for the job. This is it.