← all lessons
System Design · Storage & Indexing ·

Bloom Filters: Skipping Disk Reads You Know Will Miss

Core concept
A Bloom filter is a compact, probabilistic data structure that answers one question with certainty: "Is this item definitely not in the set?" If it says no, you can skip the expensive lookup entirely. If it says yes, the item is probably there—so you still do the real lookup to confirm. It achieves this by hashing each item into several bit positions in a fixed-length bit array and setting those bits to 1; a query checks if all those bits are 1. The magic is that false negatives (missing something that's there) are impossible, while false positives (claiming something's there when it isn't) are rare and tunable.

flowchart LR
    A[Incoming Key] --> B[Hash Functions x3]
    B --> C[Bit Array]
    C --> D{All bits = 1?}
    D -->|No| E[Definitely Missing - Skip Disk]
    D -->|Yes| F[Probably Present - Check Disk]

Concrete real-world example
RocksDB (a write-optimized key-value storage engine used inside many databases) stores data in immutable files called SSTables (sorted-string tables: read-only sorted key-value files). When you read a key, it might span dozens of SSTables. Without a Bloom filter, each file means a disk seek. With one, each SSTable carries its own filter: the read path asks the filter first and skips any file that says "definitely not here." In practice, a single-key lookup that would have touched 20 files now typically touches 1, because most files report a guaranteed miss instantly—from memory, with no disk I/O.

One trade-off / gotcha
Bloom filters cannot remove items. Once a bit is set to 1, you can't know which of several keys set it, so clearing it might invalidate others. This means if you delete keys, the filter can lie indefinitely—claiming deleted keys are present—wasting one disk lookup per false positive. The fix is to rebuild the filter periodically or use a Counting Bloom filter (a variant storing small counters instead of single bits, allowing decrements), which costs roughly 4× more memory.

An interview-style question to ponder
You're designing a distributed key-value store where each node holds ~500 million keys across thousands of SSTable files. You want to cut read amplification (the number of files checked per lookup) by 90% using Bloom filters. Your memory budget for filters is 1 GB per node. How do you size your Bloom filters, and what false-positive rate is realistic at that budget?

Stuck? Show a hint

The key tension is bits-per-key: more bits mean fewer false positives but consume more memory. Start from your total key count and your memory cap to derive how many bits you can afford per key, then look for the sweet spot on the false-positive curve.

Show answer

At 1 GB for 500 million keys, you have about 16 bits per key, which yields a false-positive rate of roughly 0.02% (about 1 in 5,000 lookups hits a ghost disk read).

  • Each key is hashed into k bit positions. The optimal number of hash functions for a given bits-per-key b is k = b × ln(2) ≈ 0.693 × b. At 16 bits/key that's ~11 hash functions, giving a false-positive probability of (0.5)^11 ≈ 0.049%, close to 0.02%—well within one wasted disk read per ~5,000 lookups.
  • If your p99 read latency matters, that rate is negligible: at 10,000 reads/sec, you'd hit a false positive roughly twice per second, each costing one extra disk seek (~1 ms). That's 2 ms of wasted latency per second system-wide—usually acceptable.
  • To cut read amplification by 90%, you need filters to eliminate 9 of 10 SSTable checks. Even a 1% false-positive rate achieves this (99% of irrelevant files are skipped), so 16 bits/key is actually generous—you could back down to ~10 bits/key (~0.8% FP rate) and free up 375 MB if memory pressure grows.
  • But why not just keep one giant in-memory index of all keys instead? A full index of 500 million keys, each needing ~20 bytes for the key + pointer, costs ~10 GB—10× your budget—and must stay sorted for lookup, adding complexity. The Bloom filter trades perfect accuracy for a 10× memory reduction.
  • Watch out: if key distribution is skewed and hot SSTables get queried constantly, even a 0.02% false-positive rate compounds under high QPS—monitor actual disk reads per SSTable to verify the filter is doing real work.