8. Caching strategy

Caching is what makes 500K reads/s feasible without burning a fortune in DynamoDB read capacity. Three caches matter.

1. Home timeline cache

Per-user precomputed list of recent tweet IDs.

Key:   home_timeline:{user_id}
Type:  Redis LIST (or sorted set)
Value: [tweet_id_1, tweet_id_2, ...]   most recent first
Size:  ~800 entries × 8 bytes = 6.4 KB per user

For 300M users that’s ~2 TB total — fits in a Redis cluster with hundreds of nodes. Sharded by user_id.

The fanout worker writes here on tweet creation: LPUSH home_timeline:{follower} {tweet_id} then LTRIM 0 799 to cap the list.

Reads: LRANGE home_timeline:{me} {cursor_offset} {cursor_offset+19}. One round-trip.
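The write and read paths above can be sketched together. A plain dict stands in for the sharded Redis cluster, with LPUSH/LTRIM/LRANGE semantics modeled directly; all names are illustrative.

```python
# Sketch of the fanout write and the paginated timeline read.
# `store` stands in for the Redis cluster; lists are most-recent-first.

TIMELINE_CAP = 800  # keep ~800 recent tweet IDs per user
PAGE_SIZE = 20

store: dict[str, list[int]] = {}

def fanout_write(follower_ids: list[int], tweet_id: int) -> None:
    """What the fanout worker does on tweet creation."""
    for follower in follower_ids:
        key = f"home_timeline:{follower}"
        timeline = store.setdefault(key, [])
        timeline.insert(0, tweet_id)   # LPUSH home_timeline:{follower} {tweet_id}
        del timeline[TIMELINE_CAP:]    # LTRIM home_timeline:{follower} 0 799

def read_page(user_id: int, cursor_offset: int) -> list[int]:
    """One round-trip page read (LRANGE, inclusive bounds in real Redis)."""
    key = f"home_timeline:{user_id}"
    return store.get(key, [])[cursor_offset:cursor_offset + PAGE_SIZE]
```

Note that the fanout write is O(followers) per tweet, which is why this path runs in a background worker rather than on the posting request.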

2. Tweet cache

Tweet IDs from the home timeline have to be hydrated into full tweet objects (text, author, counts, media URLs). Hitting DynamoDB for every tweet on every timeline read would defeat the point — you’d pay an RCU per item per read across hundreds of thousands of reads per second.

Key:   tweet:{tweet_id}
Type:  Redis HASH or serialized blob
TTL:   24h, refreshed on access

Read-through pattern: Timeline Service does MGET tweet:{id1} tweet:{id2} .... Misses fall back to the DB and populate the cache. Hit rate >95% in steady state.
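A minimal sketch of that read-through pattern, with dicts standing in for Redis and DynamoDB (the `db` contents and names are illustrative):

```python
# Read-through hydration: one MGET for all IDs, fall back to the database
# only for misses, and backfill the cache so the next read hits.

cache: dict[str, dict] = {}  # stands in for Redis (24h TTL omitted here)
db = {1: {"id": 1, "text": "hello"}, 2: {"id": 2, "text": "world"}}

def hydrate(tweet_ids: list[int]) -> list[dict]:
    keys = [f"tweet:{tid}" for tid in tweet_ids]
    cached = [cache.get(k) for k in keys]      # MGET tweet:{id1} tweet:{id2} ...
    result = []
    for tid, hit in zip(tweet_ids, cached):
        if hit is None:                        # miss: fetch from DynamoDB
            hit = db[tid]
            cache[f"tweet:{tid}"] = hit        # populate for the next reader
        result.append(hit)
    return result
```

With a >95% hit rate, the database sees only the long tail of cold or recently evicted tweets.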

3. User cache

Author info (display name, avatar URL, verified flag) is needed to render every tweet. Same shape:

Key:   user:{user_id}
TTL:   1h

Invalidated on profile edit by publishing an event the cache layer subscribes to.
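The edit-event flow can be sketched with a toy publish/subscribe loop; the in-process `publish`/`subscribe` helpers stand in for whatever event bus carries profile edits, and all names are illustrative.

```python
# Event-driven invalidation for the user cache: a profile edit publishes an
# event, and a subscriber in the cache layer deletes the stale entry so the
# next read falls through to the database and repopulates.

user_cache: dict[str, dict] = {"user:7": {"name": "Old Name"}}
subscribers = []

def subscribe(handler):
    subscribers.append(handler)

def publish(event: dict):
    for handler in subscribers:
        handler(event)

# The cache layer subscribes once and evicts on every profile edit.
subscribe(lambda e: user_cache.pop(f"user:{e['user_id']}", None))

def edit_profile(user_id: int, new_name: str):
    # ... write new_name to the users table first ...
    publish({"type": "profile_edited", "user_id": user_id})
```

Deleting rather than updating the cached entry keeps the subscriber trivial: the read-through path is the only code that writes user objects into the cache.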

Cold cache problem

When a Redis node restarts, every key on it is gone. Next read for any of those users falls through to DynamoDB, which gets stampeded.

Mitigations (standard stampede defenses; none is Twitter-specific):

- Run each shard with a replica and fail over instead of restarting cold, so the keyspace survives a node loss.
- Coalesce concurrent misses for the same key (request coalescing, a.k.a. single-flight) so the database sees one fetch per key, not one per waiting request.
- Rate-limit cache-fill traffic to DynamoDB so a cold node degrades read latency instead of toppling the database.
- Add jitter to TTLs so entries filled during warm-up don't all expire in lockstep later.

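One widely used defense against the stampede is request coalescing: concurrent misses for the same key share a single database fetch. A minimal single-process sketch with per-key locks (names are illustrative):

```python
# Single-flight cache read: only one caller per key loads from the database;
# everyone else blocks briefly and then reads the freshly filled cache entry.

import threading

_locks: dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()
cache: dict[str, dict] = {}

def get_with_singleflight(key: str, load_from_db):
    value = cache.get(key)
    if value is not None:
        return value                 # fast path: cache hit
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    with lock:                       # one loader per key at a time
        value = cache.get(key)       # re-check: a peer may have filled it
        if value is None:
            value = load_from_db()   # exactly one DynamoDB read per cold key
            cache[key] = value
    return value
```

In production this lives in the service layer (or a sidecar), so a cold Redis node turns into one database read per key rather than one per request.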
Cache invalidation

The two hard problems in computer science. For Twitter, mostly solved by structure:

- Tweets are immutable after creation, so tweet:{tweet_id} never goes stale; the 24h TTL only bounds memory.
- Home timelines are owned by the fanout path, which appends new entries rather than invalidating old ones.
- User profiles change rarely, so explicit event-driven invalidation on edit stays cheap.
- Counts are eventually consistent by design; readers tolerate slightly stale numbers.

If you find yourself designing complex invalidation, you’ve probably picked the wrong cache shape.

Cache layout summary

home_timeline:{user_id} → [tweet_ids]      (precomputed, fanout writes)
tweet:{tweet_id}        → tweet object     (read-through, 24h TTL)
user:{user_id}          → user object      (read-through, 1h TTL, invalidated on edit)
counts:{tweet_id}       → {likes, rts}     (stream-processed, eventually consistent)

Four key patterns, one Redis cluster (sharded), one purpose each. Don’t add a fifth unless a measured access pattern demands it.
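The counts entry in the summary is the only pattern not walked through above; its fill path is a stream consumer applying engagement events as per-tweet increments (HINCRBY in Redis). A minimal in-memory sketch, with illustrative event names:

```python
# Stream-processed, eventually consistent counters: like/retweet events flow
# through a consumer that increments hash fields on counts:{tweet_id}.

from collections import defaultdict

counts: dict[str, dict] = defaultdict(lambda: {"likes": 0, "rts": 0})

def consume(event: dict) -> None:
    """One consumer step: apply a single engagement event (HINCRBY)."""
    field = {"like": "likes", "retweet": "rts"}[event["type"]]
    counts[f"counts:{event['tweet_id']}"][field] += 1
```

Because increments are applied off the read path, a timeline render never waits on count updates; it just reads whatever the hash holds at that moment.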