Consistent Hashing: Distributing Load Without Mass Remapping
Core concept
Consistent hashing maps both nodes and keys onto the same circular ring (0–2³²), so each key is owned by the nearest node clockwise. When a node is added or removed, only the keys between that node and its predecessor need to be remapped — roughly 1/N of total keys, not all of them. This makes it ideal for distributed caches, databases, and load balancers where the node count changes frequently. Virtual nodes (vnodes) are added to spread each physical node across multiple ring positions, preventing hotspots from uneven key distribution.
Diagram
Ring (0 → 2³²)
0
┌────┴────┐
N3 ◄─┤ Ring ├─► N1
│ ● │
└────┬────┘
N2
↓
Key K hashes → lands here
→ walks clockwise → hits N1
→ N1 owns K
Concrete real-world example
Amazon DynamoDB and Apache Cassandra both use consistent hashing with vnodes. When you add a Cassandra node for capacity, only ~1/N of the data streams to the new node. Without consistent hashing, a naive modulo approach (key % N) would force a full cluster reshuffling — catastrophic at terabyte scale. Memcached client libraries (e.g., ketama) implement the same ring logic on the client side to route cache requests without a central coordinator.
One trade-off / gotcha
Vnodes solve uneven distribution but increase operational complexity — each node participates in many token ranges, so a single node failure triggers multiple small repair jobs across the ring rather than one clean handoff. Heavily skewed key hashing (e.g., celebrity user IDs hammering one arc) can still cause hotspots even with vnodes, requiring additional mitigations like key salting or request-rate limiting per node.
An interview-style question to ponder
You're designing a distributed cache for 10 billion keys across 50 nodes. A node crashes at 3 AM. Walk through exactly what happens to the keys it owned — who serves reads, how is data recovered, and what would break if you had used simple modulo hashing instead?
Stuck? Show a hint
Follow one dead node's keys around the ring: who becomes their owner clockwise, and where does that successor get the data to start serving them? Then, separately, ask how many keys have to move under plain key % N when N drops from 50 to 49 — that contrast is the whole answer.
Show answer
The roughly 1/50 of keys the dead node owned are now served by the next node clockwise for each range — clients walking the ring simply skip the dead node and hit its successor.
- How recovery works: because this is a cache, those successors take a wave of cache misses and repopulate from the backing database. A persistent store like Cassandra would instead stream replica copies from other nodes that already hold the data.
- Why vnodes matter here: the failed node's load is spread across many successors instead of being dumped entirely on one neighbor, which prevents a single hotspot from forming during the failover.
- But what if you'd used
key % 50? Dropping to 49 nodes changes the modulus and remaps almost every key, not just 1/50 — a near-total cache-miss storm that stampedes the database all at once. - Why that's catastrophic: the stampede can buckle the DB and cascade into a full outage, since every client misses simultaneously. Avoiding exactly this mass-remapping is the entire reason consistent hashing exists.
- Watch out: a celebrity/hot key can still hotspot one arc even with vnodes — mitigate with key salting or per-node rate limiting, since the ring alone won't fix skewed access.