First pass — the smallest thing that stores and returns a value durably, before adding anything for latency.
client → 1 LB → app ×4 → sharded db (RF 3, quorum). The write path is correct: acked writes are on a majority, so RPO 0 holds. No cache yet — every read hits the shards.
Two things I know are missing and would add next, in order:
Leaving it stripped-down on purpose — easier to argue about what each addition buys.
Sign in to join the discussion.
Yep, second LB is v2, cache is v3. Wanted the durable baseline on record first.
Honest baseline. Everything behind the LB survives an AZ loss and then the LB doesn't — classic.
Single LB kills the 99.95% SLA on its own — good that you flagged it as the v2 fix instead of hiding it.
Yep, fixing the LB is v2. Wanted the baseline on record first.
Everything *behind* the LB survives an AZ loss and then the LB doesn't — classic. Good that you called it out instead of hiding it.
The one LB is the whole ballgame for the availability SLA — 99.95% with a single front door is basically impossible, one reboot blows your error budget. Bump it to ×2 and this is solid.