19.4 ROW_NUMBER, RANK, DENSE_RANK, and NTILE

Alright, let’s talk about the SQL equivalent of giving everyone in a room a numbered sticker. Window functions are, hands down, one of the most powerful tools you can add to your SQL toolkit. They let you perform calculations across a set of table rows that are somehow related to your current row. Think of them as a smarter, more sophisticated GROUP BY that doesn’t collapse your result set into a single summary row. You get to keep all your delicious, granular data and get aggregate-like insights. It’s like having your cake and eating it too, which is a saying I’ve never understood—what else are you supposed to do with cake? Stare at it?

The core of any window function is the OVER() clause. This is where the magic happens. It defines the “window” of rows your function is going to peer through. A bare OVER() means “look at every single row in the result set.” That’s often not very useful on its own, which is why we pair it with ORDER BY and PARTITION BY to make it actually intelligent. We’ll get to PARTITION BY in a bit; it’s the real star of the show.

ROW_NUMBER, RANK, and DENSE_RANK: The Sibling Rivalry

These three functions are often confused because they all deal with ordering things, but they handle ties very differently. Let’s say we have a table of sales_people with their name and sales figures.

SELECT
  name,
  sales,
  ROW_NUMBER() OVER (ORDER BY sales DESC) as row_num,
  RANK() OVER (ORDER BY sales DESC) as rank,
  DENSE_RANK() OVER (ORDER BY sales DESC) as dense_rank
FROM sales_people;

Imagine the output looks something like this:

name	sales	row_num	rank	dense_rank
Anya	1000	1	1	1
Boris	900	2	2	2
Chen	900	3	2	2
Darya	800	4	4	3

See what happened? Boris and Chen both sold 900 units, creating a tie for second place.

ROW_NUMBER() is the pedantic one. It must assign a unique number to each row, even if the values are identical. It just blindly increments by 1. There is no concept of a tie. This makes it perfect for pagination or just getting a unique identifier for each row in this specific ordering, but terrible for ranking where ties should be acknowledged.
RANK() is the one who plays by the rules of the school exam board. It says, “Boris and Chen are tied for 2nd place. Therefore, the next person, Darya, gets the 4th place.” It leaves a gap. This is the standard way most competitions work.
DENSE_RANK() is the more generous, feel-good version. It says, “Boris and Chen are tied for 2nd. The next distinct value, Darya, gets 3rd place.” No gaps. The ranks remain consecutive. This is useful if you’re, say, assigning grade tiers (A, B, C) and you don’t want a gap in the tier labels.

Which one to use? It’s not about which is “better,” it’s about the business logic. If you’re giving out one “Employee of the Month” prize, you need ROW_NUMBER. If you’re ranking students for a university application where 10 people might be tied for 1st, you probably want RANK. If you’re bucketing customers into consecutive loyalty tiers based on spending, DENSE_RANK might be your pick.

NTILE: For When You Absolutely Need Buckets

NTILE is your go-to for splitting a sorted dataset into a specific number of roughly equal groups. Want quartiles? NTILE(4). Pentiles? NTILE(5). Deciles? You get the idea.

SELECT
  name,
  sales,
  NTILE(4) OVER (ORDER BY sales DESC) as quartile
FROM sales_people;

It tries its best to make the buckets equal, but life is messy. If you have 10 rows and ask for 3 tiles (NTILE(3)), the bucket sizes will be 4, 3, and 3. The function handles the uneven distribution for you; the first bucket will just be slightly larger. It’s a fantastic, simple tool for basic percentile analysis.

The Power of PARTITION BY: Your Secret Weapon

Here’s where we go from “neat trick” to “mind-blowing superpower.” PARTITION BY splits your result set into independent groups before the window function is applied. Each group gets its own window, and the ranking (or numbering, or bucketing) restarts for each partition.

Let’s say our salespeople are divided into regions.

SELECT
  region,
  name,
  sales,
  RANK() OVER (PARTITION BY region ORDER BY sales DESC) as regional_rank
FROM sales_people;

Now your output might be:

region	name	sales	regional_rank
East	Anya	1000	1
East	Chen	900	2
West	Boris	900	1
West	Darya	800	2

Boom. Instead of one global ranking, we now have a separate ranking within each region. Chen is ranked 2nd in the East, and Boris is ranked 1st in the West. This is insanely useful for answering questions like “Who are the top 3 performers in each department?” or “What is the second-most popular product in each category?”

Pitfall Warning: The order of operations matters. PARTITION BY always happens before ORDER BY within the OVER() clause. The database first carves out the partitions, and then it sorts the rows within each partition according to your ORDER BY clause. Get this wrong, and your results will be nonsense. Also, don’t forget that the ORDER BY in your window function is not the same as the ORDER BY for the entire query. You can have one ordering for your ranks and a completely different ordering for your final result set. It’s a common source of confusion until you get used to it.

Mastering PARTITION BY with these ranking functions is the key to moving from writing simple queries to performing deep, multi-layered analysis. It feels a bit like cheating, but I promise, it’s not. It’s just you being smart.