LeetDesign

Three AZs, three replicas, nothing fancy

Yuki Tanaka@yuki_tanaka

I started from the fault-tolerance line of the NFR and worked backwards: survive 1 AZ loss and 1 node loss, 3 AZs available.

That makes RF 3 the floor, not a choice: one copy per AZ. Quorum acks (2 of 3) mean a write is durable even while one AZ is down, and reads can still hit a majority. Three shards keep each shard's working set inside the db.large storage envelope after the RF 3 blowup.
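A minimal sketch of the quorum arithmetic behind that paragraph, assuming W=2 and R=2 over N=3 (the post only says "2 of 3", so the explicit W and R values are an assumption). The function names are made up for illustration. It brute-forces every write/read quorum pair to confirm they always share a replica, and checks that a quorum can still be assembled with one replica down.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True if every write quorum shares at least one replica with every read quorum."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def quorum_available(n: int, quorum: int, failed: int) -> bool:
    """True if `quorum` replicas are still reachable after `failed` replicas are lost."""
    return n - failed >= quorum


# Assumed values: one replica per AZ, majority writes and majority reads.
N, W, R = 3, 2, 2

if __name__ == "__main__":
    print("R + W > N:", R + W > N)
    print("every read overlaps every write:", quorums_overlap(N, W, R))
    print("writes available with one AZ down:", quorum_available(N, W, failed=1))
    print("reads available with one AZ down:", quorum_available(N, R, failed=1))
```

Dropping either side to 1 (say W=1, R=2) breaks the overlap, which is where a stale read could slip in.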

Cache is the standard cache-aside layer with a 5s TTL for the read SLA. LB ×2 for the front door.
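For anyone unfamiliar with the pattern, here is a rough cache-aside sketch with a 5-second expiry. It assumes the "5s" is a TTL. The `CacheAside` class, the `load` callback standing in for the database read, and the in-process dict standing in for the cache tier are all hypothetical, not part of the design above.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Read-through-on-miss cache: the caller asks the cache, and a miss falls back to `load`."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load          # hypothetical stand-in for the DB read
        self._ttl = ttl_seconds    # assumed meaning of "5s"
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = self._load(key)  # miss or expired: go to the source of truth
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key, e.g. after a write, so the next read reloads it."""
        self._entries.pop(key, None)
```

With this shape a reader can see data up to one TTL old unless writers call `invalidate`, so the 5s figure is effectively the staleness bound on the cached path.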

The object store is the part people skip: continuous snapshots + WAL archived off-cluster. RF 3 protects you from losing nodes; the archive protects you from losing the whole shard set (bad deploy, operator error) — that's the difference between RPO 0 and "RPO 0 until someone drops a table."
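To make the "someone drops a table" case concrete, here is a sketch of how a restore could be planned from the archive: pick the latest snapshot at or before the target log position, then replay archived WAL up to that position. The integer log positions, `WalSegment`, and `plan_restore` are hypothetical names for illustration; a real database exposes its own recovery tooling.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class WalSegment:
    start: int  # first log position in the segment (inclusive)
    end: int    # last log position in the segment (inclusive)


def plan_restore(
    snapshots: List[int], wal: List[WalSegment], target: int
) -> Tuple[int, List[WalSegment]]:
    """Return the base snapshot position and the WAL segments needed to reach `target`."""
    candidates = [pos for pos in snapshots if pos <= target]
    if not candidates:
        raise ValueError("no snapshot at or before the target position")
    base = max(candidates)

    # Keep only segments that contain positions in (base, target].
    needed = sorted(
        (seg for seg in wal if seg.end > base and seg.start <= target),
        key=lambda seg: seg.start,
    )

    # The archive must cover every position from base + 1 through target.
    covered = base
    for seg in needed:
        if seg.start > covered + 1:
            raise ValueError(f"WAL gap after position {covered}")
        covered = max(covered, seg.end)
    if covered < target:
        raise ValueError(f"WAL archive ends at {covered}, before target {target}")
    return base, needed
```

Setting `target` to the position just before the bad statement is what turns the archive into a recovery path that replication alone cannot provide, since all three replicas faithfully apply the drop.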

4 Comments


  • Vector Clock@vector_clock

    R+W>N with W=2,R=2 over RF3 is the detail I'd have probed; good that it's implied here.

  • James Park@james_park

    "Every box maps to a line in the contract" — that's the mindset that passes interviews. Clean writeup.

  • CAP Theorem@cap_theorem

    Worth stating explicitly that quorum here is W=2,R=2 over RF3 so R+W>N and you avoid the stale-read window on the DB side. Otherwise someone will ask.

  • Diego Ramirez@diego_ramirez

    "Every box maps to a line in the contract" — stealing this framing for my next interview, honestly.