20.5 Redis Pub/Sub and Sorted Sets for Leaderboards
Right, so you want to build a leaderboard. You’ve probably already realized that doing this with a traditional SQL database is a fast track to making your application’s database cry uncle under any real load. Sorting millions of rows on every page view? No, thank you. This is precisely the kind of problem Redis was born to solve, and its Sorted Set data structure is your new best friend. It’s basically a magic leaderboard-in-a-box.
Think of a Sorted Set as a collection of unique members (like user IDs), each one tied to a score. Redis automatically keeps this set sorted by the score, from smallest to largest. The magic is that this sorting is its default state: every insert or update pays a small O(log(N)) cost to land the member in the right spot, so you never issue a costly ORDER BY at read time. It’s always ordered. For a leaderboard, you’ll typically want the highest scores at the top, which is why we lean heavily on the ZREVRANGE command (the ‘REV’ is for ‘reverse’) to get things from highest to lowest. Since Redis 6.2, ZREVRANGE is officially deprecated in favor of ZRANGE with the REV option, but it still works and it’s what the examples here use.
The Absolute Basics of Sorted Sets
Let’s get our hands dirty. Here’s how you add members to a sorted set and then read them back in leaderboard order.
import redis
# Connect to your ElastiCache Redis node
r = redis.Redis(host='your.elasticache.endpoint.ng.0001.use1.cache.amazonaws.com', port=6379)
# Add some players to the leaderboard. ZADD adds members to a sorted set.
# With redis-py 3.0+, you pass a dict that maps each member to its score.
r.zadd('game:leaderboard', {'player:123': 2500, 'player:456': 7500, 'player:789': 10000})
# Now, fetch the top 3 players (highest scores first). withscores=True includes the score in the result.
top_players = r.zrevrange('game:leaderboard', 0, 2, withscores=True)
print(top_players)
# Output: [(b'player:789', 10000.0), (b'player:456', 7500.0), (b'player:123', 2500.0)]
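Dead simple, right? But what if player 456 just scored a bunch of points and their new score is 8200? You just update their score with ZADD again. It will happily overwrite the old score and move the member to its new position in the set. (If you only know how many points they gained rather than the new total, ZINCRBY adds to the existing score instead.)
# Update a player's score. This is an upsert: it will add the member if it doesn't exist.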
r.zadd('game:leaderboard', {'player:456': 8200})
# Let's check the new order.
new_top = r.zrevrange('game:leaderboard', 0, -1, withscores=True) # -1 means "to the end"
print(new_top)
# Output: [(b'player:789', 10000.0), (b'player:456', 8200.0), (b'player:123', 2500.0)]
Handling Ties and Common Pitfalls
Life isn’t fair, and neither are leaderboards. What happens if two players have the same score? Redis orders tied members lexicographically, comparing the member names byte by byte. In ascending order, aaa comes before zzz. But ZREVRANGE reverses the whole ordering, ties included, so on a highest-first leaderboard zzz would rank above aaa if they had the same score. Either way, the tiebreak is decided by the member name, which is rarely what you want. The best practice is to design your score to inherently avoid ties for ranking purposes.
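To make the tie rule concrete, here is a small pure-Python sketch that mimics the ordering instead of calling Redis. The helper names are made up for illustration; the only thing it assumes is the rule itself: sort by score, break ties by comparing member bytes, and reverse the whole list for a highest-first read.

```python
def ascending_order(entries):
    """Mimic ZRANGE ordering: by score, then by member bytes for ties."""
    return sorted(entries.items(), key=lambda kv: (kv[1], kv[0]))


def descending_order(entries):
    """Mimic ZREVRANGE ordering: the exact reverse of the ascending order."""
    return list(reversed(ascending_order(entries)))


scores = {b"aaa": 500.0, b"zzz": 500.0, b"mmm": 900.0}

print(ascending_order(scores))
# [(b'aaa', 500.0), (b'zzz', 500.0), (b'mmm', 900.0)]

print(descending_order(scores))
# [(b'mmm', 900.0), (b'zzz', 500.0), (b'aaa', 500.0)]
```

Notice that in the highest-first view, zzz sits above aaa even though their scores are identical.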
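A brilliant trick is to use a composite score. Instead of storing just the raw points, store a number that combines points and a timestamp. For example, if your score is an integer, you could do: final_score = (actual_points * 10000000000) + (max_timestamp - current_timestamp). The massive multiplier for the points ensures the points are the dominant factor, while the timestamp part breaks ties: for two identical point values, the player who got there earlier has a smaller current_timestamp, so a larger remainder after the subtraction, and ranks higher. (If you want the most recent achiever on top instead, add current_timestamp directly.) It’s a bit gnarly, but it works. The main pitfall here is precision. Redis stores scores as 64-bit floating-point numbers, which represent integers exactly only up to 2^53 (roughly 9 × 10^15). With a ten-digit multiplier, that leaves room for about 900,000 points before composite scores start to lose their low digits, so you need to be mindful of your possible score range.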
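Here is a minimal sketch of that composite score in plain Python, so you can check the arithmetic before anything touches Redis. The function names, the choice of Unix timestamps in seconds, and the ceiling MAX_TIMESTAMP are assumptions for illustration. The range checks exist because a Redis score is a double, which only holds every integer exactly up to 2^53.

```python
from typing import Tuple

MAX_SAFE_INT = 2**53             # doubles represent every integer exactly up to here
TIME_SLOTS = 10_000_000_000      # the multiplier from the formula above
MAX_TIMESTAMP = TIME_SLOTS - 1   # assumed ceiling for Unix timestamps (seconds)
MAX_POINTS = (MAX_SAFE_INT - MAX_TIMESTAMP) // TIME_SLOTS  # 900718


def encode_score(points: int, achieved_at: int) -> int:
    """Pack points and a timestamp into one number; earlier timestamps win ties."""
    if not 0 <= points <= MAX_POINTS:
        raise ValueError(f"points must be between 0 and {MAX_POINTS}")
    if not 0 <= achieved_at <= MAX_TIMESTAMP:
        raise ValueError(f"timestamp must be between 0 and {MAX_TIMESTAMP}")
    return points * TIME_SLOTS + (MAX_TIMESTAMP - achieved_at)


def decode_score(score: float) -> Tuple[int, int]:
    """Recover (points, achieved_at) from a composite score read back from Redis."""
    points, remainder = divmod(int(score), TIME_SLOTS)
    return points, MAX_TIMESTAMP - remainder


early = encode_score(5000, 1_700_000_000)
late = encode_score(5000, 1_700_000_500)

print(early > late)                 # True: same points, earlier timestamp ranks higher
print(decode_score(float(early)))   # (5000, 1700000000)
```

You would store the encoded value with ZADD and run decode_score on whatever ZSCORE or ZREVRANGE hands back when you need the human-readable points.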
Finding a Specific Player’s Rank
You’ve got the top 10, but a user wants to see their own standing. You don’t want to paginate through 10 million users to find them. Luckily, this is an O(log(N)) operation for Redis.
# Get a player's rank (0-indexed, from the top). Returns None if the member isn't in the set.
player_456_rank = r.zrevrank('game:leaderboard', 'player:456')
if player_456_rank is not None:
    print(f"Player 456 is ranked #{player_456_rank + 1}") # Because humans don't think 0-indexed
# Get their specific score
player_456_score = r.zscore('game:leaderboard', 'player:456')
print(f"With a score of: {player_456_score}")
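A common follow-up is showing the few players just above and below the user. The sketch below combines ZREVRANK and ZREVRANGE to do that. The function names and the radius parameter are hypothetical, and client is assumed to be a redis-py connection like r above.

```python
def rank_window(rank, radius):
    """Turn a 0-indexed rank into inclusive start/stop indexes for ZREVRANGE."""
    start = max(rank - radius, 0)  # don't run off the top of the board
    stop = rank + radius           # Redis clamps a stop index past the end
    return start, stop


def leaderboard_around(client, key, member, radius=2):
    """Return (position, member, score) rows surrounding one player."""
    rank = client.zrevrank(key, member)
    if rank is None:
        return []  # the player isn't on this leaderboard
    start, stop = rank_window(rank, radius)
    rows = client.zrevrange(key, start, stop, withscores=True)
    # Positions are 1-indexed for display.
    return [(start + i + 1, name, score) for i, (name, score) in enumerate(rows)]
```

Calling leaderboard_around(r, 'game:leaderboard', 'player:456') would return up to five rows centered on that player. The two commands are not atomic, so on a busy board the window can be off by a position; for display purposes that is usually fine.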
Expiring Your Leaderboards
You’re probably not running a single, eternal leaderboard. You have daily, weekly, seasonal ones. You might be tempted to use KEYS *leaderboard* and then DEL them. Don’t. KEYS walks the entire keyspace and blocks the server while it does, which makes it a performance nightmare on production databases. The right way is to use a TTL (Time to Live), but here’s the catch: a TTL applies to the whole key, not to individual members. You can EXPIRE a sorted set like any other key, but you can’t make one player’s entry time out on its own.
So give each time window its own key using a naming convention, for example leaderboard:weekly:2023-52, and set an expiry on it so Redis cleans up after you. If you’d rather delete on your own schedule, your application logic can remove that specific key by name when the week is over, with no KEYS or SCAN required. One caveat: DEL on a very large sorted set takes time proportional to its size, so for boards with millions of members prefer UNLINK, which reclaims the memory in the background. Both approaches are robust; the TTL just saves you from having to remember.
# Safe deletion of a known key
r.delete('leaderboard:weekly:2023-52')
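If you’d rather not delete by hand, EXPIRE works on a sorted set key just like any other key type. Here is a rough sketch that derives the weekly key from a date and refreshes a TTL on every write. The function names, the ISO-week convention, and the two-week retention period are assumptions for illustration; client is assumed to be a redis-py connection.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=14)  # assumed: keep each weekly board for two weeks


def weekly_key(day):
    """Build a key like leaderboard:weekly:2023-52 from the ISO year and week."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"leaderboard:weekly:{iso_year}-{iso_week:02d}"


def record_score(client, member, points, day):
    """Add points to this week's board and (re)set the key's TTL."""
    key = weekly_key(day)
    client.zincrby(key, points, member)
    # The TTL covers the whole key. Each write pushes the expiry out again,
    # so the board disappears two weeks after its last update.
    client.expire(key, int(RETENTION.total_seconds()))
    return key


print(weekly_key(date(2023, 12, 27)))  # leaderboard:weekly:2023-52
```

Because the key name is computed from the date, new weeks get a fresh sorted set automatically and old ones age out without any cleanup job.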
Sorted Sets are incredibly powerful, but they demand you think about your data model upfront. Get the score right, plan your key expiration strategy, and you’ll have a blazing-fast leaderboard that can handle anything your users throw at it.