Database Replication | System Design Library

Why Replicate?

High Availability: If the primary database fails, a replica promotes and takes over
Read Scalability: Distribute read queries across multiple replicas
Geographic Distribution: Keep data close to users in different regions
Backup: Replicas can serve as live backups
Analytics: Run heavy analytical queries on a replica without impacting production

Leader-Follower Replication (Primary-Replica)

The most common replication pattern. One node (the leader/primary) accepts all writes. Changes are replicated to one or more follower/replica nodes. Reads can go to any replica.

Writer → [Primary DB]
                ↓ replication
           [Replica 1]  ← Readers
           [Replica 2]  ← Readers
           [Replica 3]  ← Readers

Replication Lag: Replicas may be slightly behind the primary. This lag is usually milliseconds but can grow under heavy write load.

Synchronous Replication

The primary waits for at least one replica to acknowledge each write before confirming to the client.

Pros: No data loss on primary failure (replica has all data) Cons: Higher write latency; if a replica is slow, the primary slows down

Asynchronous Replication

The primary writes locally and acknowledges to the client immediately. Replication happens in the background.

Pros: Lower write latency; replicas can lag without affecting write performance Cons: If primary fails before replication, the replica is missing recent writes (data loss)

Most systems use asynchronous replication and accept a small risk of data loss for better performance.

Multi-Leader Replication

Multiple nodes accept writes. Each node replicates to the others.

Use cases:

Multi-datacenter deployments (one leader per datacenter)
Offline-first applications (client is a leader while offline)

Key challenge: Write conflicts — two leaders can accept conflicting writes to the same record simultaneously. Requires a conflict resolution strategy:

Last-write-wins: Risky (can lose data)
Application-defined merge: Complex but flexible
CRDTs (Conflict-free Replicated Data Types): Automatic merging for specific data structures

Leaderless Replication

All nodes accept writes. Used by Cassandra, DynamoDB, and Riak.

Writes go to multiple nodes (quorum write: W nodes must acknowledge). Reads query multiple nodes (quorum read: R nodes must respond). With W + R > N (total nodes), you're guaranteed at least one node has the latest data.

Typical settings with N=3:

Strong consistency: W=2, R=2
High availability: W=1, R=1

Failover

When the primary fails, a replica must be promoted to primary. This is called failover.

Manual failover: An operator manually promotes a replica. Safer but slower. Automatic failover: The system detects failure and promotes automatically. Faster but can cause split-brain.

Split-brain: Two nodes both think they're the primary. Writes go to both, creating a conflict. Use an odd number of nodes with leader election (Raft, Paxos, etcd) to prevent this.

Read-Your-Writes Consistency

A common consistency challenge: you write data, then immediately read it — but get the old value because you hit a replica that hasn't caught up.

Solutions:

Read from the primary for a short window after a write
Track the replication offset and wait for the replica to catch up
Route reads to the primary for data the user just modified

Interview Tips

Leader-follower replication is the answer for scaling reads — mention it whenever your design has a read-heavy workload
Always mention replication lag and how you'd handle stale reads in user-facing scenarios
Automatic failover is important for high availability SLAs (99.9%+) — mention tools like MySQL Group Replication, PostgreSQL Patroni, or cloud-managed databases that handle this
For multi-region architectures, discuss where the primary lives and the latency implications of cross-region writes

Database DesignDatabase Replication