Your game is a massive hit. Ten million players, one million playing at the same time.
But... there's a nightmare. Your real-time leaderboard for the top ten thousand players is getting hammered.
Your database query, a simple
SELECT * FROM scores ORDER BY score DESC LIMIT 10000
is running thousands of times per second.
Your database CPU is pegged at 100%. The entire game is starting to lag... for everyone. This is a code red situation that can kill your game's success.
So, how do we fix this? How do we design a leaderboard that is built for this massive scale?
The problem isn't your database. Your database, whether it's PostgreSQL or MySQL, is fantastic for storing data reliably. But asking it to sort millions of rows, thousands of times a second, is the wrong job for the tool.
That ORDER BY
clause on a massive, frequently updated table forces a full sort operation on disk or in memory. At scale, this leads to intense CPU load, I/O bottlenecks, and database locking. It simply cannot keep up.
We need a specialist.
That specialist is Redis, and our primary tool is a data structure called a Sorted Set.
A Sorted Set is an in-memory data structure that maintains a collection of unique members, where each member is associated with a score. The collection is, by its very nature, always kept sorted by that score.
So, how does this work under the hood, and why is it so fast?
A Redis Sorted Set is brilliantly implemented as a dual data structure: a hash table and a skip list.
The hash table maps members to their scores, which gives us fast, O(1)
lookups for a specific player's score.
The skip list, which is a probabilistic data structure similar to a balanced tree, stores the members sorted by score. This allows for incredibly fast insertion and updates, with a time complexity of O(log N)
, where N is the number of players in your leaderboard.
Let's look at the commands. When a player gets a new score, we send the command:
ZADD leaderboard <score> <player_id>
This O(log N)
operation means that even with 10 million players, updating a score is practically instantaneous.
Now, to fetch the top 10,000 players, we run:
ZREVRANGE leaderboard 0 9999 WITHSCORES
The complexity here is O(log N + K)
, where K is the number of players you're retrieving. We pay a tiny cost to find the starting element (the rank 0 player) and then just walk the already-sorted list to get the top 10,000. It's not a sort; it's a simple, fast read.
So what does our final production architecture look like?
A player gets a score. The game server sends a
ZADD
command to Redis. The real-time leaderboard is instantly updated.To ensure durability, we configure Redis persistence, like AOF, which logs every write operation to disk. This protects us from data loss if the server restarts.
We use a "write-aside" caching strategy. After writing to Redis, our application asynchronously writes the score to our persistent SQL database. The database is our system of record for long-term storage and analytics, but it is no longer on the critical path for real-time leaderboard requests.
All leaderboard reads from clients are served directly and instantly from Redis.
By offloading the real-time sorting to a specialized in-memory tool, we've solved our performance bottleneck. Our database CPU is stable, the game is fast, and the architecture is built to handle our massive success.
And that is how you architect for scale.
Also recorded a YT video explaining this :
Great expalanation