System Design · Consistency & Replication · 2026-06-07

Read-Your-Writes Consistency: Guaranteeing You See Your Own Changes

Core concept
Read-your-writes (RYW) consistency is a session-level guarantee: after a client writes a value, every subsequent read by that same client reflects that write (or a newer one), even in an eventually consistent, replicated system. Without it, a user could post a comment, reload the page, and not see it. Systems achieve RYW in three main ways: routing the client's reads to the primary after a write; having the client carry a write token (a log sequence number, version, or timestamp) and serving its reads only from replicas that have applied that token; or using sticky sessions that pin the client to one replica, which works only if that replica is known to have the client's writes. RYW is weaker than strong (linearizable) consistency but much cheaper to provide, since only the writing client needs the guarantee, not every client globally.

Diagram

Client writes → Primary
                  │
                  ▼ replication (async)
            ┌─────────────┐
            │  Replica A  │ ← may lag
            └─────────────┘
            ┌─────────────┐
            │  Replica B  │ ← caught up ✓
            └─────────────┘

RYW: route client's next read
  to Primary OR Replica B
  (verified via write token/LSN)
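
The routing rule in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not any particular database's API: the Node type, the applied_lsn field, and the route_read function are hypothetical names, and the LSN values are made up. It assumes the write returned a token (here an integer LSN) that the client sends back with its next read.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    applied_lsn: int  # highest log position this node has applied (assumed field)


def route_read(primary: Node, replicas: List[Node], session_lsn: Optional[int]) -> Node:
    """Pick a node that is guaranteed to contain the session's last write."""
    if session_lsn is None:
        # The session has not written anything, so any replica will do.
        return replicas[0] if replicas else primary
    for replica in replicas:
        if replica.applied_lsn >= session_lsn:
            return replica  # caught up: safe to serve this client's read
    # No replica has applied the write yet; the primary always has it.
    return primary


primary = Node("primary", applied_lsn=120)
replica_a = Node("replica-a", applied_lsn=117)  # lagging
replica_b = Node("replica-b", applied_lsn=120)  # caught up

write_token = 120  # LSN returned to the client by its last write
print(route_read(primary, [replica_a, replica_b], write_token).name)  # replica-b
```

A real router would pick among the caught-up replicas by load or locality rather than taking the first match, and would learn each replica's applied position from monitoring rather than from an in-memory field.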

Concrete real-world example
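DynamoDB exposes this as a per-request option: reads are eventually consistent by default, and setting ConsistentRead to true on an individual read returns data that reflects all prior successful writes, at twice the read-capacity cost of an eventually consistent read. Databases built on log-shipping replication usually take the token route instead: the write returns a log sequence number (LSN), and a replica serves the client's read only once its applied LSN has reached that value. PostgreSQL, for example, exposes the two positions needed for this check through pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on a standby. Facebook's social graph store (TAO), as publicly described, gets read-after-write behavior by writing through its cache tier: a write passes through the client's local cache on its way to the leader, and that cache is updated when the write returns, so the writer's next reads through the same tier see the change while other tiers catch up asynchronously.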
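
The per-request strongly consistent read can be sketched as below. With boto3, get_item on a DynamoDB Table resource accepts a ConsistentRead flag; the helper name read_own_write, the user_id key, and the FakeTable stand-in are assumptions made so the example runs without AWS credentials. Against a real table, the object would come from something like boto3.resource("dynamodb").Table("profiles"), where "profiles" is a hypothetical table name.

```python
from typing import Any, Optional


def read_own_write(table: Any, key: dict) -> Optional[dict]:
    """Fetch an item with a strongly consistent read.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    """
    response = table.get_item(Key=key, ConsistentRead=True)
    return response.get("Item")  # key is absent when no item matches


class FakeTable:
    """In-memory stand-in (hypothetical) mirroring the two calls used here."""

    def __init__(self) -> None:
        self.items: dict = {}
        self.calls: list = []  # records the ConsistentRead flag of each read

    def put_item(self, Item: dict) -> None:
        self.items[Item["user_id"]] = Item

    def get_item(self, Key: dict, ConsistentRead: bool = False) -> dict:
        self.calls.append(ConsistentRead)
        item = self.items.get(Key["user_id"])
        return {"Item": item} if item is not None else {}


table = FakeTable()
table.put_item(Item={"user_id": "u1", "photo": "new.jpg"})
print(read_own_write(table, {"user_id": "u1"}))  # {'user_id': 'u1', 'photo': 'new.jpg'}
```

Because the flag is set per call, an application can use it only for the reads that immediately follow a user's own write and leave everything else on the cheaper default.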

One trade-off / gotcha
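Pinning post-write reads to the primary reintroduces the hotspot that read replicas were meant to remove. A user who writes frequently stays pinned almost continuously, and a popular post that draws thousands of comments pins every commenter at once, so the primary absorbs their reads on top of their writes. Time-bounding the pin (e.g., routing to the primary for only 500 ms after a write) limits the damage, but the guarantee then holds only while replication lag stays below the window: a replica that is 2 s behind will still serve a stale read once a 500 ms pin expires. LSN-based replica selection avoids both problems: reads can go to any replica that has caught up, which spreads the load, and they fall back to the primary only when no replica has.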
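
A time-bounded pin can be sketched as follows. The PinRouter class, its method names, and the "primary"/"replica" return values are hypothetical; the clock is injectable only so the behavior can be shown deterministically.

```python
import time
from typing import Callable, Dict


class PinRouter:
    """Route a user's reads to the primary for a fixed window after each write."""

    def __init__(self, pin_seconds: float = 0.5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.pin_seconds = pin_seconds
        self.clock = clock
        self._pinned_until: Dict[str, float] = {}  # user_id -> pin expiry time

    def record_write(self, user_id: str) -> None:
        # Each write restarts the window for that user.
        self._pinned_until[user_id] = self.clock() + self.pin_seconds

    def target_for_read(self, user_id: str) -> str:
        expiry = self._pinned_until.get(user_id)
        if expiry is not None and self.clock() < expiry:
            return "primary"
        # Window elapsed (or the user never wrote): drop the entry, use a replica.
        self._pinned_until.pop(user_id, None)
        return "replica"


now = [100.0]  # fake clock, advanced by hand
router = PinRouter(pin_seconds=0.5, clock=lambda: now[0])
router.record_write("alice")
print(router.target_for_read("alice"))  # primary
now[0] += 0.6
print(router.target_for_read("alice"))  # replica
print(router.target_for_read("bob"))    # replica
```

The in-process dictionary works only if all of a user's requests reach the same router instance. In a multi-server deployment the expiry would have to travel with the client (for example in a cookie) or live in a shared store.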

An interview-style question to ponder
A user updates their profile photo and immediately refreshes their profile page — but sees the old photo. Your system uses async replication to 5 read replicas. Walk through two different architectural approaches to guarantee RYW consistency here, and explain what each costs in terms of latency, throughput, and operational complexity.

Hint

Remember that the guarantee only covers the writing client seeing their own write, not everyone globally. So narrow your focus to that one client's very next read: how do you make sure it lands on a node that already has the change? Two levers exist: where you route the read, and whether you wait until a chosen replica has caught up.

Answer

Guarantee RYW with one of two routing strategies: briefly pin the user's reads to the primary, or gate each read on a version token — the choice trades operational complexity against read scalability.

Approach 1: pin post-write reads to the primary. For a short window after the update, route that user's reads to the primary so they always see their own write. It is simple to build and adds no waiting on replication, but it re-concentrates load on the primary, and the window is a compromise: too long and a write-heavy user overloads the primary, too short (e.g. 500 ms against replicas that lag by seconds) and the stale read comes back.
Approach 2: LSN/version-token tracking. The write returns a log sequence number, and the next read goes only to a replica whose applied LSN has reached that token; if none has, the read falls back to the primary or waits briefly.
Comparing the cost: Approach 1 sacrifices throughput for low complexity; Approach 2 keeps reads spread across all caught-up replicas but adds real machinery — per-replica lag monitoring, a no-replica-caught-up fallback, and occasional tail latency while waiting.
But why not just make every read strongly consistent? That gives the guarantee globally yet is far more expensive — RYW only needs the writing client to see their own write, so paying for global linearizable reads wastes throughput for everyone else.
Watch out: both approaches push load back onto the primary on the pin/fallback path, so keep the pin window as short as replication lag allows and watch the fallback rate, or a burst of writers after a popular post can melt the primary.
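
The "briefly wait, else fall back" path of Approach 2 can be sketched as a bounded poll. All names here (read_with_bounded_wait, LaggingReplica, the callables) are hypothetical, and the timings are illustrative; a real system would query the replica's replay position instead of calling an in-memory object.

```python
import time
from typing import Callable


def read_with_bounded_wait(
    required_lsn: int,
    replica_applied_lsn: Callable[[], int],
    read_replica: Callable[[], str],
    read_primary: Callable[[], str],
    max_wait_s: float = 0.05,
    poll_interval_s: float = 0.005,
) -> str:
    """Serve from the replica once it has applied required_lsn, else fall back."""
    deadline = time.monotonic() + max_wait_s
    while True:
        if replica_applied_lsn() >= required_lsn:
            return read_replica()
        if time.monotonic() >= deadline:
            return read_primary()  # replica too far behind: primary has the write
        time.sleep(poll_interval_s)  # this wait is the source of tail latency


class LaggingReplica:
    """Test double that reports a scripted sequence of applied LSNs."""

    def __init__(self, lsns) -> None:
        self._lsns = iter(lsns)
        self._last = 0

    def applied_lsn(self) -> int:
        # Once the script is exhausted, keep reporting the last position.
        self._last = next(self._lsns, self._last)
        return self._last


catching_up = LaggingReplica([40, 41, 42])
print(read_with_bounded_wait(42, catching_up.applied_lsn,
                             lambda: "replica", lambda: "primary",
                             max_wait_s=1.0))   # replica
stuck = LaggingReplica([40])
print(read_with_bounded_wait(42, stuck.applied_lsn,
                             lambda: "replica", lambda: "primary",
                             max_wait_s=0.02))  # primary
```

The max_wait_s value caps the extra latency a read can pay. Setting it too high hurts tail latency; setting it near zero turns every lagging read into primary load.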