Caching Strategies

Caching stores copies of frequently accessed data in fast storage (memory) so future requests are served faster. Getting caching right is often the difference between a system that scales and one that doesn't.

Why Cache?

Database reads are slow (milliseconds). Memory reads are fast (microseconds). Caching exploits this 1,000x difference by storing hot data in RAM.

Benefits:

  • Reduced latency: Serve responses in <1ms instead of 50ms+
  • Reduced database load: Protect your database from read storms
  • Better throughput: Handle more requests with the same infrastructure

Cache-Aside (Lazy Loading)

The application is responsible for populating the cache.

1. Request comes in
2. Check cache → Cache HIT → Return cached value
3. Cache MISS → Read from DB → Write to cache → Return value
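The three steps above can be sketched in a few lines. This is a minimal illustration using plain dicts as stand-ins for the cache and database; `get_user`, `cache`, and `db` are hypothetical names, not a specific library's API.

```python
cache = {}                              # stand-in for e.g. Redis
db = {"user:1": {"name": "Ada"}}        # stand-in for the database

def get_user(key):
    # 1-2. Check the cache first: a HIT returns immediately.
    if key in cache:
        return cache[key]
    # 3. MISS: read from the database...
    value = db.get(key)
    # ...then populate the cache so the next read is fast.
    if value is not None:
        cache[key] = value
    return value
```

Note that the application owns all three steps; the cache itself is a dumb key-value store.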

Pros:

  • Only caches data that's actually requested
  • Cache failures don't break the system (falls back to DB)
  • Flexible: application controls what gets cached

Cons:

  • Cache miss penalty: three round trips (cache read + DB read + cache write)
  • Data can become stale if DB is updated directly
  • Cache stampede: many requests hit the DB simultaneously on a cold start

Best for: General-purpose read caching. The default choice.

Write-Through

The application writes to the cache and the database simultaneously (synchronously).

1. Write to cache
2. Write to database
3. Return success
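A sketch of the synchronous write path, again using dicts as hypothetical stand-ins for the two stores. A real implementation would also decide what to do if one of the two writes fails (e.g. roll back or invalidate the cache entry).

```python
cache = {}
db = {}

def write_through(key, value):
    # Both writes happen on the request path, in the same synchronous
    # call; success is only reported once both stores hold the value.
    cache[key] = value
    db[key] = value
    return True
```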

Pros:

  • Cache is always in sync with the database
  • Read performance is excellent — cache is always warm

Cons:

  • Write latency is higher (must write to both)
  • Writes for data that's never read waste cache space

Best for: Systems that can't tolerate stale data, with frequent reads of recently written data.

Write-Behind (Write-Back)

The application writes to the cache immediately and asynchronously flushes to the database.

1. Write to cache
2. Return success immediately
3. [async] Flush to database in background
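The pattern can be sketched with a pending-write queue. For determinism this example exposes `flush()` as an explicit call; in a real system it would run on a timer or background worker, which is exactly where the data-loss risk below comes from. All names here are hypothetical.

```python
from collections import deque

cache = {}
db = {}
pending = deque()   # writes acknowledged but not yet flushed to the DB

def write_behind(key, value):
    # Acknowledge as soon as the cache is updated; the DB write is deferred.
    cache[key] = value
    pending.append((key, value))

def flush():
    # Drain the queue in one pass; batching like this is why
    # write-behind can be very efficient for write-heavy workloads.
    while pending:
        key, value = pending.popleft()
        db[key] = value
```

Anything still sitting in `pending` when the cache node dies is lost, which is the core trade-off of this strategy.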

Pros:

  • Very low write latency
  • Can batch DB writes for efficiency

Cons:

  • Risk of data loss: If the cache node crashes before the flush, data is lost
  • More complex to implement correctly

Best for: Write-heavy workloads where some data loss is acceptable (analytics, metrics).

Read-Through

The cache sits between the application and database. On a miss, the cache itself fetches from the DB.

Pros:

  • Simpler application code (the cache handles DB reads)

Cons:

  • First request for any key always incurs the miss penalty
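The contrast with cache-aside is clearest in code: here the loader lives inside the cache, so the application never talks to the database directly. A minimal sketch with a hypothetical `ReadThroughCache` class:

```python
class ReadThroughCache:
    """The cache owns the loader; callers only ever call get()."""

    def __init__(self, loader):
        self._store = {}
        self._loader = loader   # e.g. a function that queries the DB

    def get(self, key):
        if key not in self._store:
            # MISS: the cache itself fetches from the backing store.
            self._store[key] = self._loader(key)
        return self._store[key]
```

This is the model used by caching libraries and some managed caches; with cache-aside, the same loading logic would be scattered through application code instead.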

Cache Eviction Policies

| Policy | How It Works | Best For |
|---|---|---|
| LRU (Least Recently Used) | Evict the item not used for the longest time | Most general-purpose workloads |
| LFU (Least Frequently Used) | Evict the item used the fewest times | Workloads with stable popularity |
| FIFO | Evict the oldest item added | Simple cases |
| TTL | Evict items after a fixed time-to-live | Time-sensitive data |

Redis's default `maxmemory-policy` is actually `noeviction` (writes fail when memory is full); LRU variants such as `allkeys-lru` are the common configuration for cache workloads, and Redis implements them as approximated LRU by sampling keys.
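LRU is also a common interview coding question. One idiomatic sketch uses an ordered dict, which keeps keys in insertion/access order so the least recently used entry is always at the front:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)       # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used
```

Both `get` and `put` are O(1), which is why real caches can afford to apply this policy on every access.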

Cache Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

Strategies for keeping the cache in sync:

  • TTL-based expiration: Set a max age. Accept that data may be slightly stale.
  • Event-driven invalidation: Invalidate cache keys when the underlying data changes.
  • Write-through: Keep cache always in sync by writing to both.
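TTL-based expiration, the first strategy above, is simple enough to sketch directly. This version stores an expiry timestamp alongside each value and lazily expires entries on read (the approach Redis combines with periodic background sampling); the names are illustrative.

```python
import time

cache = {}   # key -> (value, expires_at)

def set_with_ttl(key, value, ttl=60.0):
    cache[key] = (value, time.monotonic() + ttl)

def get_fresh(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.monotonic() >= expires_at:
        del cache[key]   # lazily expire the stale entry on read
        return None
    return value
```

The TTL is exactly the staleness bound you accept: with a 60-second TTL, readers may see data up to 60 seconds old.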

Common Pitfalls

Cache Stampede (Thundering Herd)

When a popular cache entry expires, many requests simultaneously hit the database. Solutions:

  • Probabilistic early expiration: Refresh cache slightly before it expires
  • Mutex/locking: Only one request fetches from DB; others wait
  • Background refresh: A background job refreshes cache before expiry

Hot Keys

A single cache key that gets millions of reads per second can overwhelm even a cache cluster. Solutions:

  • Replicate hot keys across multiple cache nodes
  • Local in-process caching for extremely hot data

Interview Tips

  • Always discuss what you're caching, TTL, and invalidation strategy — not just "add a cache"
  • Redis is the industry standard cache. Know it supports strings, hashes, sorted sets, pub/sub
  • Mention the risk of serving stale data and how you'd handle it for your specific use case
  • For user-facing data, a 60-second TTL is often a reasonable balance between freshness and performance