Sharded cache up front, 3-way replicated store behind

Same skeleton as the canonical answer but provisioned for the evening peak, not the daytime average, with an extra edge layer for the very hottest keys.

cdn (edge) — the handful of keys hot enough to be worth it get served straight from the edge with a short TTL, so they never touch origin. Everything else falls through.
cache ×3 — one per AZ, so losing an AZ still leaves two warm cache nodes instead of sending every read to the shards as a miss.
app ×6 — generous, but app nodes are cheap relative to a latency-SLA miss at p99.
db: 3 shards, RF 3, quorum — keeps per-shard write load comfortably under the db.large write ceiling while holding RPO 0.
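
The quorum arithmetic behind the db line can be sketched as below. This is a minimal illustration, assuming majority quorums and keys that hash evenly across shards; the function names and the 3,000 writes/s figure are placeholders of mine, not numbers from the problem.

```python
def quorum_size(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while a majority is still reachable."""
    return replication_factor - quorum_size(replication_factor)


def per_shard_write_qps(total_write_qps: float, shards: int) -> float:
    """Write load per shard, assuming keys hash evenly across shards."""
    return total_write_qps / shards


RF = 3      # replication factor from the design
SHARDS = 3  # shard count from the design

print(quorum_size(RF))          # 2
print(tolerated_failures(RF))   # 1

# Placeholder total write rate, only to show the division.
print(per_shard_write_qps(3000, SHARDS))  # 1000.0
```

With RF 3 a write is acknowledged once two replicas have it, so losing any single replica (or the AZ it sits in) loses no acknowledged write, provided the quorum write is synchronous. That is what the RPO 0 claim rests on.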

The trade-off is cost. If this were my own bill I'd probably drop app to 4; for an interview I'd rather show that I sized to the peak.
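
A rough way to sanity-check an app node count against the peak is sketched here. Everything in it is an assumption for illustration: the helper name, the 70% utilization target, the one-spare-node rule, and the QPS figures in the usage line are placeholders, not values from the problem statement.

```python
import math


def nodes_needed(peak_qps: float,
                 per_node_qps: float,
                 target_utilization: float = 0.7,
                 spare_nodes: int = 1) -> int:
    """Nodes required to serve peak_qps while keeping each node at or
    below target_utilization, plus spare_nodes to absorb a node or AZ loss."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if per_node_qps <= 0:
        raise ValueError("per_node_qps must be positive")
    usable_per_node = per_node_qps * target_utilization
    return math.ceil(peak_qps / usable_per_node) + spare_nodes


# Placeholder inputs: plug in the measured peak and per-node capacity.
print(nodes_needed(peak_qps=5000, per_node_qps=2500))
```

Feeding in the real peak and per-node capacity shows how much of the count is headroom and how much is the spare for an AZ loss, which is the part I would be reluctant to cut.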

5 Comments

Gossip Gil (@gossip_gil), Jul 1, 2026

app ×6 feels generous but I'd rather over-provision compute than miss the p99 during peak. Reasonable call.

TTL Tina (@ttl_tina), Jul 1, 2026

Per-AZ cache placement is a nice touch. One nit: cache.small ×3 vs one bigger node is a real cost/HA tradeoff worth a sentence.

Elena Petrova (@elena_petrova), Jul 1, 2026

Sizing to the 5× peak instead of the average is the right instinct — under-provisioning is the classic evening-peak outage.

Quorum Queen (@quorum_queen), Jun 30, 2026

Per-AZ cache placement is underrated. Most people draw one cache box and call it HA.

Tom Schneider (@tom_schneider), Jun 30, 2026

app.xlarge ×6 is ~16k QPS of compute for 2.5k of surviving writes + cache misses. I think you could halve it and spend the savings on a second cache tier region.