Architecture · 11 min read · April 2026

Caching Strategies for Distributed Systems

Cache invalidation patterns, TTL tuning, and the consistency trade-offs every architect faces.

The Two Hard Problems

As Phil Karlton's old quip has it, the two hard problems in computer science are cache invalidation and naming things. Caching improves performance but introduces consistency challenges. The question is not whether to cache, but what to cache, for how long, and how to invalidate.

A cache hit takes microseconds; a database query takes milliseconds. For read-heavy workloads, caching can reduce database load by 90% and response times by 99%.
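
Where do those numbers come from? Average latency is a hit-rate-weighted blend of cache and database latency, as the back-of-the-envelope sketch below shows (the latency figures are illustrative, not benchmarks):

```python
# Illustrative numbers, not a benchmark: expected per-request latency at a given hit rate.
CACHE_HIT_MS = 0.1   # ~100 microseconds for an in-memory cache hit
DB_QUERY_MS = 10.0   # ~10 milliseconds for a database query

def effective_latency_ms(hit_rate: float) -> float:
    """Hits are served from cache, misses fall through to the database."""
    return hit_rate * CACHE_HIT_MS + (1 - hit_rate) * DB_QUERY_MS

for hit_rate in (0.0, 0.9, 0.99):
    print(f"hit rate {hit_rate:.0%}: {effective_latency_ms(hit_rate):.2f} ms")
# hit rate 0%: 10.00 ms
# hit rate 90%: 1.09 ms
# hit rate 99%: 0.20 ms  -- roughly a 98% reduction versus no cache
```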

We approach caching with a decision framework. First, identify the access pattern: read-heavy vs write-heavy, uniform vs skewed, small vs large objects. Second, define the consistency requirement: eventual vs strong, tolerated staleness vs zero tolerance. Third, choose the pattern: cache-aside, read-through, write-through, or write-behind. Fourth, implement and measure: hit rate, latency, and consistency.

Cache-Aside vs Read-Through

Cache-aside leaves cache management to the application; read-through delegates it to the cache itself. Cache-aside offers more control at the cost of more application code. Read-through simplifies that code but couples you to the cache provider. We prefer cache-aside for most applications because it preserves flexibility.

Cache-Aside Pattern: The application checks the cache first. If present (cache hit), return immediately. If not (cache miss), fetch from the database, write to the cache, and return. The application controls what to cache, how long to cache, and when to invalidate.
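
A minimal cache-aside sketch, assuming the redis-py client; the key scheme, TTL, and fetch_user_from_db helper are illustrative stand-ins for your own data layer:

```python
import json
import redis  # assumes the redis-py client; any key-value store works the same way

cache = redis.Redis()          # defaults to localhost:6379
USER_TTL_SECONDS = 300         # illustrative; see TTL Tuning below

def fetch_user_from_db(user_id: int) -> dict:
    ...                        # stand-in for your database query

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"                 # hypothetical key scheme
    cached = cache.get(key)
    if cached is not None:                  # cache hit: return immediately
        return json.loads(cached)
    user = fetch_user_from_db(user_id)      # cache miss: the application fetches...
    cache.set(key, json.dumps(user), ex=USER_TTL_SECONDS)  # ...and writes to the cache
    return user
```
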
Read-Through Coupling: The application always reads from the cache. The cache handles misses transparently. The trade-off is coupling: the cache provider must support read-through, and you cannot easily switch providers.
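
For contrast, a hand-rolled read-through wrapper makes the coupling visible: the loader is registered with the cache once, and every read goes through it. Real read-through support lives in the cache provider itself; this sketch only illustrates the shape:

```python
import json
from typing import Callable
import redis

class ReadThroughCache:
    """The cache object, not the caller, owns the loader -- that is the coupling."""

    def __init__(self, client: redis.Redis, loader: Callable[[str], dict], ttl: int):
        self.client = client
        self.loader = loader   # registered once; every read goes through the cache
        self.ttl = ttl

    def get(self, key: str) -> dict:
        cached = self.client.get(key)
        if cached is not None:
            return json.loads(cached)
        value = self.loader(key)             # the cache handles the miss transparently
        self.client.set(key, json.dumps(value), ex=self.ttl)
        return value
```

The application now asks the cache and never touches the database directly, which is precisely why switching providers later is hard.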

TTL Tuning

TTL (time-to-live) determines how long data stays in cache. Too short: cache misses dominate, performance suffers. Too long: stale data persists, users see inconsistencies. We tune TTL by measuring hit rate and freshness requirements. Financial data: seconds. Product catalogues: hours. User profiles: minutes.

TTL by Data Type: User sessions: 30 minutes. Product prices: 5 minutes. Product descriptions: 24 hours. Static assets: 1 year. The key is data classification: categorise each data type by change frequency and consistency requirement, then assign TTL accordingly.
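
In code, that classification works best as a single policy table, so TTLs live in one place rather than scattered across call sites. A sketch with illustrative names and the values above:

```python
from datetime import timedelta

# Illustrative TTL policy table mirroring the classification above.
TTL_POLICY = {
    "user_session":        timedelta(minutes=30),
    "product_price":       timedelta(minutes=5),
    "product_description": timedelta(hours=24),
    "static_asset":        timedelta(days=365),
}

def ttl_seconds(data_type: str) -> int:
    """Look up the TTL for a data type; fail loudly on unclassified data."""
    return int(TTL_POLICY[data_type].total_seconds())
```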

Cache warming is part of TTL strategy. When a cache entry expires, the next request suffers a cache miss and database query. For predictable access patterns (e.g., top products every morning), pre-warm the cache before peak hours. For unpredictable patterns, use stale-while-revalidate: serve the expired entry while fetching the fresh one in the background.
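
A minimal in-process sketch of stale-while-revalidate: expired entries are served immediately while a single background thread fetches the fresh value. CDNs and HTTP caches offer this natively; the sketch just shows the mechanics:

```python
import threading
import time
from typing import Any, Callable

class StaleWhileRevalidate:
    """Serve the expired entry while one background thread refreshes it."""

    def __init__(self, loader: Callable[[str], Any], ttl: float):
        self.loader = loader
        self.ttl = ttl
        self._store: dict[str, tuple[Any, float]] = {}  # key -> (value, expires_at)
        self._refreshing: set[str] = set()
        self._lock = threading.Lock()

    def get(self, key: str) -> Any:
        with self._lock:
            entry = self._store.get(key)
            if entry is not None:
                value, expires_at = entry
                if time.monotonic() < expires_at:
                    return value                        # fresh hit
                if key not in self._refreshing:         # stale: refresh once, in background
                    self._refreshing.add(key)
                    threading.Thread(target=self._refresh, args=(key,), daemon=True).start()
                return value                            # serve stale immediately
        value = self.loader(key)                        # cold miss: must load synchronously
        with self._lock:
            self._store[key] = (value, time.monotonic() + self.ttl)
        return value

    def _refresh(self, key: str) -> None:
        try:
            value = self.loader(key)
            with self._lock:
                self._store[key] = (value, time.monotonic() + self.ttl)
        finally:
            with self._lock:
                self._refreshing.discard(key)
```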

Invalidation Patterns

  • Time-based. Data expires after TTL. Simple but can serve stale data briefly. Best for data with predictable change patterns and tolerated staleness.
  • Event-based. Write operations publish invalidation events. Immediate but requires message infrastructure. Best for data with strict consistency requirements (see the sketch after this list).
  • Write-through. Updates go to cache and database simultaneously. Consistent but slower writes. Best for workloads where reads must immediately reflect writes and slower writes are acceptable.
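
A sketch of event-based invalidation using Redis pub/sub as the message infrastructure; the channel name, key scheme, and write_to_db helper are illustrative. The writer drops the shared cache entry and broadcasts the key so every node can evict its in-process copy:

```python
import redis

r = redis.Redis()
CHANNEL = "cache-invalidation"                   # hypothetical channel name

def write_to_db(product_id: int, fields: dict) -> None:
    ...                                          # stand-in for your database update

def update_product(product_id: int, fields: dict) -> None:
    write_to_db(product_id, fields)
    r.delete(f"product:{product_id}")            # drop the shared cache entry
    r.publish(CHANNEL, f"product:{product_id}")  # tell other nodes to invalidate

def run_invalidation_listener(local_cache: dict) -> None:
    """Each node runs this to evict its in-process (near) cache on events."""
    pubsub = r.pubsub()
    pubsub.subscribe(CHANNEL)
    for message in pubsub.listen():
        if message["type"] == "message":
            local_cache.pop(message["data"].decode(), None)
```
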
Cache Stampede (Thundering Herd): When a popular cache entry expires, thousands of requests simultaneously hit the database. Solutions: probabilistic early expiration, request coalescing, and circuit breakers.
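
Of those mitigations, request coalescing is the simplest to sketch: on a miss, one caller loads from the database while the rest wait on a per-key lock and reuse the result. This is a single-process sketch; across nodes you would need a distributed lock or an equivalent cache primitive:

```python
import threading
from typing import Any, Callable

_cache: dict[str, Any] = {}
_key_locks: dict[str, threading.Lock] = {}
_key_locks_guard = threading.Lock()

def coalesced_get(key: str, loader: Callable[[], Any]) -> Any:
    """Request coalescing: on a miss, only one caller per key hits the database."""
    value = _cache.get(key)          # assumes None is never a legitimate cached value
    if value is not None:
        return value
    with _key_locks_guard:
        lock = _key_locks.setdefault(key, threading.Lock())
    with lock:                       # followers block here instead of stampeding the DB
        value = _cache.get(key)      # re-check: the leader may have filled it already
        if value is None:
            value = loader()
            _cache[key] = value
    return value
```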

The choice depends on consistency requirements and infrastructure. Time-based is simplest but least consistent. Event-based is more consistent but requires message infrastructure. Write-through is most consistent but slowest for writes.

Our Recommendation

Start with time-based invalidation and conservative TTLs. Add event-based invalidation only for data with strict consistency requirements. Measure hit rates and tune continuously.

Caching is not a silver bullet. It solves read performance but introduces complexity: consistency, invalidation, and operational overhead. Use it where it matters: read-heavy workloads with predictable access patterns.

Voodoo AI Engineering Team

We have optimised caching for systems handling 100k+ requests per second.
