System Design · Consistency & Replication · 2026-06-23

Causal Consistency: Preserving "Happens-Before" Without Full Serialization

Core concept
When two writes are causally related, meaning one depends on the other, every reader must see them in that order, no matter which replica they hit. Causal consistency is weaker than full serializable ordering (where every operation appears as if it ran one at a time in some global sequence), but stronger than eventual consistency (where replicas converge eventually but with no ordering guarantees). The key insight is that you only need to enforce order between operations that are actually related; concurrent, independent writes can be applied in any order, provided replicas resolve conflicting concurrent writes to the same item deterministically so that they still converge. This makes causal consistency far cheaper to implement at scale because it skips the expensive global coordination that full serialization requires.

flowchart LR
    A["Alice posts photo"] -->|causes| B["Bob comments on photo"]
    B -->|causal dependency| C["Replica tracks version"]
    C -->|check before serving| D["Carol reads comment"]
    D -->|must also see| E["Alice's original photo"]
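One common way to make "causally related" versus "concurrent" testable is to stamp each write with a vector clock and compare the stamps. The sketch below assumes that representation, which is only one of several possible mechanisms; the function names and the replica ids `r1` to `r3` are illustrative.

```python
from typing import Dict

Clock = Dict[str, int]  # replica id -> number of writes seen from that replica


def happened_before(a: Clock, b: Clock) -> bool:
    """True if the write stamped `a` causally precedes the write stamped `b`."""
    keys = set(a) | set(b)
    no_entry_ahead = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_entry_behind = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_entry_ahead and some_entry_behind


def concurrent(a: Clock, b: Clock) -> bool:
    """True if neither write causally precedes the other."""
    keys = set(a) | set(b)
    identical = all(a.get(k, 0) == b.get(k, 0) for k in keys)
    return not identical and not happened_before(a, b) and not happened_before(b, a)


photo = {"r1": 1}               # Alice posts via replica r1
comment = {"r1": 1, "r2": 1}    # Bob comments via r2 after seeing the photo
unrelated = {"r3": 1}           # someone else posts via r3, having seen neither

print(happened_before(photo, comment))  # True: order must be preserved
print(concurrent(photo, unrelated))     # True: any order is acceptable
```

Only the first pair needs to be ordered at every replica; the second pair is the case where causal consistency avoids coordination.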

Concrete real-world example
Imagine a social feed. Alice posts a photo (write W1). Bob, seeing the photo, comments on it (write W2). W2 causally depends on W1: it cannot make sense without it. Now Carol opens the app and hits a different replica. Under mere eventual consistency, Carol might see Bob's comment ("Great pic!") before Alice's photo has propagated, which is confusing and broken. Under causal consistency, the replica serving Carol is required to delay showing W2 until it has also applied W1. Systems like COPS (a geo-distributed key-value store) implement this by attaching causal tokens (small metadata snapshots listing which writes this operation depended on) to every write, and replicas refuse to make a write visible until all its dependencies are locally present. MongoDB (a general-purpose document database) offers a related guarantee through causally consistent client sessions, which track logical timestamps rather than explicit dependency lists.
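The dependency check described above can be sketched as a replica that buffers incoming writes until everything they depend on has been applied. This is a minimal illustration under simplifying assumptions (explicit dependency ids, a single in-memory replica), not the actual COPS or MongoDB implementation; `Write`, `Replica`, and the key names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Write:
    write_id: str
    key: str
    value: str
    deps: FrozenSet[str] = frozenset()  # ids of writes this one depends on


@dataclass
class Replica:
    store: Dict[str, str] = field(default_factory=dict)
    applied: Set[str] = field(default_factory=set)
    pending: List[Write] = field(default_factory=list)

    def receive(self, write: Write) -> None:
        """Accept a replicated write; apply it only once its dependencies are present."""
        self.pending.append(write)
        self._drain()

    def _drain(self) -> None:
        # Keep sweeping: applying one write may unblock others waiting on it.
        progressed = True
        while progressed:
            progressed = False
            for w in list(self.pending):
                if w.deps <= self.applied:
                    self.store[w.key] = w.value
                    self.applied.add(w.write_id)
                    self.pending.remove(w)
                    progressed = True


replica = Replica()
w1 = Write("W1", "photo:1", "alice.jpg")
w2 = Write("W2", "comment:1", "Great pic!", deps=frozenset({"W1"}))

replica.receive(w2)                          # the comment arrives first
comment_visible_early = "comment:1" in replica.store  # False: W2 is held back
replica.receive(w1)                          # the photo arrives; both become visible
```

Carol, reading from this replica, can never observe the comment without the photo, because W2 stays in `pending` until W1 is in `applied`.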

One trade-off / gotcha
Tracking causal dependencies has a hidden scaling cost: the metadata (the causal token) can grow unboundedly if it naively records every past write a client has observed. Real systems prune this by using version vectors (per-replica counters that compactly encode which writes have been seen) rather than explicit dependency lists. The gotcha is that pruning introduces subtle bugs if done incorrectly — you can accidentally discard a dependency you still need, causing exactly the out-of-order read you were trying to prevent.
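To make the size difference concrete, the sketch below tracks the same observations two ways: as an explicit set of write ids and as a version vector. It assumes each replica numbers its own writes sequentially and that a client never observes write n from a replica without also having observed writes 1 to n-1 from it; that assumption is what makes the compact form safe. The data center names are made up.

```python
from typing import Dict, Set, Tuple

WriteId = Tuple[str, int]  # (replica id, per-replica sequence number)


def observe_explicit(deps: Set[WriteId], write: WriteId) -> None:
    """Naive token: remember every observed write individually."""
    deps.add(write)


def observe_vector(vector: Dict[str, int], write: WriteId) -> None:
    """Version vector: keep only the highest sequence number seen per replica."""
    replica, seq = write
    vector[replica] = max(vector.get(replica, 0), seq)


def has_seen(vector: Dict[str, int], write: WriteId) -> bool:
    """Under the gap-free assumption, one counter answers 'was this write seen?'."""
    replica, seq = write
    return seq <= vector.get(replica, 0)


explicit: Set[WriteId] = set()
vector: Dict[str, int] = {}
for seq in range(1, 1001):
    for replica in ("dc-a", "dc-b", "dc-c"):
        observe_explicit(explicit, (replica, seq))
        observe_vector(vector, (replica, seq))

print(len(explicit))  # 3000 entries: grows with every write observed
print(len(vector))    # 3 entries: bounded by the number of replicas
```

If the gap-free assumption is violated, `has_seen` reports a write as present when it is not, which is one way the pruning bug described above can show up.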

An interview-style question to ponder
You are designing a collaborative document editor (think: shared Google Docs-style notes) that replicates across three data centers. Users in each region write frequently. A PM insists you provide causal consistency. A senior engineer pushes back, arguing you should just use full serializable consistency to keep the code simple. How do you reason through this trade-off, and what choice would you make?

Hint

Start by estimating the coordination cost of serialization across three geographically distant data centers, then ask what causal consistency actually buys you versus what it gives up — focus on the difference between dependent writes and concurrent writes in a collaborative editor specifically.

Answer

Causal consistency is the right choice here; full serialization across three data centers is prohibitively expensive for a write-heavy collaborative editor.

Full serialization requires a global ordering authority (typically a consensus protocol like Paxos, an algorithm where a majority of nodes must agree before any write commits). With data centers separated by 80–150 ms of round-trip latency, every write has to wait for at least one full cross-datacenter round trip, which adds at least 80 ms of latency to every keystroke. That kills the "feels live" experience.
In a collaborative editor, most concurrent writes are genuinely independent — two users typing in different paragraphs at the same time have no causal relationship. Causal consistency only orders the writes that must be ordered (e.g., inserting text, then formatting that specific text), and lets independent writes propagate freely without coordination. This keeps local writes fast (commit locally, replicate asynchronously) while still preventing the nonsensical "you see my reply before my original message" problem.
The implementation cost is real but manageable: attach a version vector to each document operation and have replicas buffer out-of-order operations in a small pending queue until dependencies arrive. In practice, a missing dependency usually arrives within about one replication delay (tens of milliseconds between data centers at the round-trip times above, less within a region), so the queue rarely holds items for long.
But why not just use eventual consistency and skip causal tracking entirely? Because in a collaborative editor, out-of-order application of dependent operations causes document corruption — a formatting operation applied before the text it targets either fails silently or mutates the wrong characters. Causal ordering is the minimum safety bar for correctness, not a nice-to-have.
Watch out: Causal consistency only protects observed dependencies. If a user reads from replica A and then writes to replica B, the client must carry its causal token across that switch — if your client library drops the token on a failover, you silently lose your guarantees.
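The buffering and token-carrying steps in the answer can be sketched together. The code below is a simplified illustration, assuming vector-clock causal delivery, in-memory replicas, and a client that passes its token explicitly; `Op`, `Replica`, `covers`, and the data center names are hypothetical, and a real editor would also need conflict resolution for concurrent edits.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Vector = Dict[str, int]  # data center id -> number of operations applied from it


def covers(have: Vector, need: Vector) -> bool:
    """True if `have` includes everything recorded in `need`."""
    return all(have.get(dc, 0) >= n for dc, n in need.items())


@dataclass
class Op:
    origin: str   # data center where the operation was first committed
    deps: Vector  # what the origin had applied just before this operation
    payload: str


@dataclass
class Replica:
    name: str
    seen: Vector = field(default_factory=dict)
    log: List[str] = field(default_factory=list)
    pending: List[Op] = field(default_factory=list)

    def write(self, payload: str, token: Vector) -> Op:
        """Commit a client write locally; `token` is what the client has observed."""
        if not covers(self.seen, token):
            raise RuntimeError("replica is behind the client's causal token")
        op = Op(self.name, dict(self.seen), payload)
        self._apply(op)
        return op

    def receive(self, op: Op) -> None:
        """Accept a replicated operation, buffering it until it is deliverable."""
        self.pending.append(op)
        progressed = True
        while progressed:  # applying one operation may unblock others
            progressed = False
            for queued in list(self.pending):
                if self._deliverable(queued):
                    self.pending.remove(queued)
                    self._apply(queued)
                    progressed = True

    def _deliverable(self, op: Op) -> bool:
        # Next in sequence from its origin, and all other dependencies applied.
        next_from_origin = self.seen.get(op.origin, 0) == op.deps.get(op.origin, 0)
        return next_from_origin and covers(self.seen, op.deps)

    def _apply(self, op: Op) -> None:
        self.log.append(op.payload)
        self.seen[op.origin] = self.seen.get(op.origin, 0) + 1


us, eu, ap = Replica("us"), Replica("eu"), Replica("ap")

insert = us.write("insert 'hello'", token={})
token = dict(us.seen)                    # the client remembers what it observed

eu_was_behind = not covers(eu.seen, token)  # True: eu must catch up first
eu.receive(insert)
bold = eu.write("bold 'hello'", token)   # now depends on the insert

ap.receive(bold)                         # arrives early and is buffered
ap.receive(insert)                       # unblocks the buffered operation
print(ap.log)  # ["insert 'hello'", "bold 'hello'"]
```

If the client dropped its token on failover and passed `{}` instead, `eu.write` would accept the formatting operation immediately with no dependency on the insert, which is the silent loss of guarantees described above.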