The read path is the high-volume side of the system: 500K reads/s against 5K writes/s, a 100:1 ratio. Every decision here is about keeping the hot path off DynamoDB.
Architecture
flowchart TD
Client(["Client"])
GW["API Gateway<br/>(AWS API Gateway)"]
TLS["Timeline Service<br/>(Fargate)"]
RT[("Redis<br/>home_timeline:{user_id}<br/>(Fanout Worker writes here)")]
RC[("Redis<br/>tweet:{tweet_id}<br/>(hydration cache)")]
RU[("Redis<br/>user:{user_id}<br/>(author cache)")]
DT[("DynamoDB<br/>(tweets)")]
DU[("DynamoDB<br/>(users)")]
Client -->|"GET /v1/timeline/home"| GW
GW --> TLS
TLS -->|"read range (cursor)"| RT
TLS -->|"batch get (hydrate)"| RC
TLS -->|"batch get (author info)"| RU
RC -->|"cache miss"| DT
RU -->|"cache miss"| DU
The endpoint
GET /v1/timeline/home?cursor=eyJpZCI6...&limit=20
→ {
"tweets": [...],
"next_cursor": "eyJpZCI6..."
}
Cursor pagination, not offset
Offset means “skip N, take the next batch” — ?page=2&size=2 → skip 2, return items 3–4. That works on a static list, but a timeline isn’t static. Say the tweets are [A, B, C, D, E] (newest first) and you fetch page 1 → [A, B]. Before you fetch page 2, a new tweet Z arrives, making the list [Z, A, B, C, D, E]. Page 2 (skip 2) now returns [B, C] — you see B twice. A deletion causes the opposite: you skip a tweet.
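The drift is easy to reproduce with a plain list. The sketch below is only an illustration: `offset_page` is a hypothetical helper, not part of the service, and the letters stand in for tweets.

```python
def offset_page(items, page, size):
    """Return page `page` (1-based) of `items` using skip/take."""
    skip = (page - 1) * size
    return items[skip:skip + size]


# Timeline, newest first.
timeline = ["A", "B", "C", "D", "E"]
first = offset_page(timeline, page=1, size=2)

# A new tweet arrives between the two requests.
timeline.insert(0, "Z")
second = offset_page(timeline, page=2, size=2)

# Same experiment with a deletion instead of an insertion.
shrunk = ["B", "C", "D", "E"]  # "A" was deleted after page 1 was served
after_delete = offset_page(shrunk, page=2, size=2)

print(first, second, after_delete)
```

Page 2 repeats B after the insertion, and after the deletion it jumps straight to D, so C is never shown.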
A cursor pins the page to a specific point in the data, not a shifting index. The server hands back next_cursor with each page, encoding the last tweet_id — “older than tweet B.” Page 2 asks for tweets older than B and gets [C, D] regardless of what arrived at the top. Snowflake IDs (chapter 3) sort by time, so “older than B” is just “id less than B” — no separate timestamp needed.
The cursor is base64-opaque so clients don’t parse it — the server can change what’s inside without breaking anyone.
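A minimal sketch of such a cursor, assuming the payload is a small JSON object with an `id` field (the example cursor `eyJpZCI6...` decodes to the start of `{"id":`, but the exact payload is a guess). The function names are hypothetical, and an in-memory list stands in for the timeline store.

```python
import base64
import json


def encode_cursor(last_tweet_id):
    """Wrap the last tweet ID of a page in an opaque base64 token."""
    payload = json.dumps({"id": last_tweet_id}).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")


def decode_cursor(cursor):
    """Recover the tweet ID from a token produced by encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))["id"]


def cursor_page(ids_newest_first, cursor=None, limit=20):
    """Return (page, next_cursor) for IDs strictly older than the cursor."""
    if cursor is None:
        candidates = list(ids_newest_first)
    else:
        before = decode_cursor(cursor)
        # IDs sort by time, so "older than" is just "less than".
        candidates = [i for i in ids_newest_first if i < before]
    page = candidates[:limit]
    next_cursor = encode_cursor(page[-1]) if page else None
    return page, next_cursor


ids = [50, 40, 30, 20, 10]        # newest first
page1, cur = cursor_page(ids, limit=2)
ids.insert(0, 60)                 # a new tweet lands at the top
page2, _ = cursor_page(ids, cursor=cur, limit=2)
print(page1, page2)
```

The second page is unaffected by the new tweet because the filter is anchored to an ID, not to a position in the list.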
The three caches that make this fast
The read path is Redis, then DynamoDB by key on a cache miss, then the response. No joins, no scans. That keeps p99 under 200 ms because three separate caches sit on the path, each doing one job.
1. Home timeline cache
Per-user precomputed list of recent tweet IDs — written by the fanout worker (chapter 6).
Key: home_timeline:{user_id}
Type: Redis LIST
Value: [tweet_id_1, tweet_id_2, ...] most recent first
Size: ~800 entries × 8 bytes = 6.4 KB per user
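For 300M users that’s ~2 TB of raw IDs in total (300M × 6.4 KB ≈ 1.9 TB), which fits in a Redis cluster sharded by user_id. Reads pull a 20-entry slice from home_timeline:{me}, starting just after the tweet the cursor names. The cursor carries a tweet ID rather than a list index, so the service has to locate that ID in the list first; at 800 entries it can read the whole list and slice in-process, which is still one round-trip.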
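The sizing works out as follows. This counts raw ID bytes only; Redis per-key and per-entry overhead is not modelled here and would push the real figure higher.

```python
ENTRIES_PER_USER = 800
BYTES_PER_ID = 8            # one 64-bit Snowflake ID
USERS = 300_000_000

per_user_bytes = ENTRIES_PER_USER * BYTES_PER_ID
total_bytes = per_user_bytes * USERS

print(per_user_bytes / 1e3, "KB per user")
print(total_bytes / 1e12, "TB total")
```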
The fanout worker maintains it and caps it at 800 entries — see chapter 6 for why 800.
Past the 800-entry cap
Cursor pagination eventually points “older than” a tweet that’s been trimmed. The simplest answer is to stop: return an empty next_cursor once the user has scrolled through all 800. Twitter historically did something close to this — there’s a hard ceiling on how far back the home timeline goes.
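A sketch of the "just stop" behaviour, using an in-memory list in place of the Redis list and a raw tweet ID in place of the encoded cursor. `read_page` is a hypothetical name. Filtering by ID comparison rather than looking the cursor tweet up means a cursor that names an already-trimmed tweet still behaves sensibly.

```python
CAP = 800


def read_page(timeline_ids, before_id=None, limit=20):
    """timeline_ids is newest first and holds at most CAP entries.

    Returns (page, next_before_id); next_before_id is None at the end.
    """
    if before_id is None:
        older = timeline_ids
    else:
        older = [t for t in timeline_ids if t < before_id]
    page = older[:limit]
    has_more = len(older) > limit
    return page, (page[-1] if has_more else None)


timeline = list(range(CAP, 0, -1))   # IDs 800 down to 1
seen, pages, before = [], 0, None
while True:
    page, before = read_page(timeline, before_id=before)
    seen.extend(page)
    pages += 1
    if before is None:
        break

# A cursor older than everything still in the list yields an empty page.
past_cap = read_page(timeline, before_id=1)
print(pages, len(seen), past_cap)
```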
The alternative is to fall back to fanout-on-read for the tail: query each followee’s tweets older than the cursor and merge. It works but it’s slow and reintroduces the read amplification push was meant to avoid.
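For comparison, the fallback amounts to a k-way merge over per-author lists. This is a sketch under stated assumptions: `tail_page` is a hypothetical helper and the dict stands in for one query per followee, which is exactly where the read amplification comes from.

```python
import heapq
from itertools import islice


def tail_page(followee_timelines, before_id, limit=20):
    """Merge each followee's tweets older than before_id, newest first.

    followee_timelines maps author -> tweet IDs sorted newest first.
    In a real system each value would cost a separate query.
    """
    older_streams = [
        (tid for tid in ids if tid < before_id)
        for ids in followee_timelines.values()
    ]
    # reverse=True merges inputs that are sorted largest to smallest.
    merged = heapq.merge(*older_streams, reverse=True)
    return list(islice(merged, limit))


followees = {
    "alice": [90, 70, 30],
    "bob": [80, 60, 10],
    "carol": [50, 20],
}
tail = tail_page(followees, before_id=75, limit=4)
print(tail)
```

The merge itself is cheap; the cost is that a user following N accounts triggers N lookups per page.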
2. Tweet hydration cache
The home timeline cache only stores tweet IDs. Those have to be expanded into full tweet objects (text, author, media URLs) for the response. Hitting DynamoDB per tweet on every timeline read would defeat the point: each item costs up to one RCU (half that for an eventually consistent read), multiplied across hundreds of thousands of reads per second.
Key: tweet:{tweet_id}
Type: Redis HASH or serialized blob
TTL: 24h, refreshed on access
Read-through: Timeline Service batch-fetches the tweets for every ID on the page in one call. Misses fall back to DynamoDB and populate the cache. Hit rate >95% in steady state because timelines are dominated by recent tweets.
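The read-through step can be sketched like this. A dict stands in for Redis and `load_from_store` is a hypothetical callable standing in for the batched DynamoDB read; TTL handling is left out.

```python
def hydrate(tweet_ids, cache, load_from_store):
    """Return tweet objects for tweet_ids, in order, filling the cache on misses.

    cache: dict-like stand-in for the hydration cache.
    load_from_store: takes a list of IDs, returns {id: tweet} for those found.
    """
    found = {tid: cache[tid] for tid in tweet_ids if tid in cache}
    missing = [tid for tid in tweet_ids if tid not in found]
    if missing:
        loaded = load_from_store(missing)   # one batched call for all misses
        cache.update(loaded)                # populate for the next reader
        found.update(loaded)
    # IDs that exist nowhere (for example deleted tweets) are dropped.
    return [found[tid] for tid in tweet_ids if tid in found]


store = {1: {"text": "one"}, 2: {"text": "two"}, 3: {"text": "three"}}
cache = {1: store[1]}
store_calls = []


def load_from_store(ids):
    store_calls.append(list(ids))
    return {i: store[i] for i in ids if i in store}


first_read = hydrate([3, 1, 2, 99], cache, load_from_store)
second_read = hydrate([3, 1, 2], cache, load_from_store)
print(first_read, store_calls)
```

The first read goes to the store once for the three misses; the second is served entirely from the cache.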
3. User cache
Author info (display name, avatar URL, verified flag) is needed to render every tweet. Same shape:
Key: user:{user_id}
TTL: 1h
Invalidated on profile edit by publishing an event the cache layer subscribes to. Profile edits are rare; the invalidation cost is negligible.
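A small in-process sketch of the two expiry paths, TTL and event-driven invalidation. The class, the `on_profile_edited` handler, and the event shape (`{"user_id": ...}`) are assumptions for illustration; the real cache is Redis and the event bus is not specified here.

```python
import time


class UserCache:
    """Toy author cache: entries expire after a TTL or on a profile-edit event."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # user_id -> (expires_at, profile)

    def get(self, user_id):
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        expires_at, profile = entry
        if self._clock() >= expires_at:
            del self._entries[user_id]   # TTL elapsed
            return None
        return profile

    def put(self, user_id, profile):
        self._entries[user_id] = (self._clock() + self._ttl, profile)

    def on_profile_edited(self, event):
        """Subscriber callback: drop the entry so the next read refetches."""
        self._entries.pop(event["user_id"], None)


now = [0.0]                               # fake clock for the demo
users = UserCache(clock=lambda: now[0])
users.put(7, {"display_name": "Ada"})
hit = users.get(7)
users.on_profile_edited({"user_id": 7})
after_edit = users.get(7)
users.put(7, {"display_name": "Ada L."})
now[0] = 3600.0                           # one hour later
after_ttl = users.get(7)
print(hit, after_edit, after_ttl)
```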
Why a separate Timeline Service
Tweet writes (chapter 3) and timeline reads have:
- Different load profiles. 5K writes/s vs 500K reads/s. They scale on different curves. Co-locating them means scaling for the worst of both.
- Different failure tolerances. A timeline read failure can fall back to “show last cached page.” A tweet write failure must propagate — the user needs to know their tweet didn’t post. Different services let you set different SLOs and circuit breakers.
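The difference in failure handling can be sketched in a few lines. Everything here is hypothetical (exception names, function names, the `stale` flag); it only illustrates that the read path may degrade while the write path must surface the error.

```python
class TimelineUnavailable(Exception):
    """Raised by the (hypothetical) fresh-read path when a dependency is down."""


class TweetWriteFailed(Exception):
    """Raised by the (hypothetical) write path when the tweet was not stored."""


def read_timeline(fetch_fresh, last_cached_page):
    """Reads degrade: serve the last cached page and mark it stale."""
    try:
        return {"tweets": fetch_fresh(), "stale": False}
    except TimelineUnavailable:
        return {"tweets": last_cached_page, "stale": True}


def post_tweet(write_tweet, text):
    """Writes do not degrade: no try/except, so the caller sees the failure."""
    return write_tweet(text)


def broken_fetch():
    raise TimelineUnavailable("cache cluster unreachable")


def broken_write(text):
    raise TweetWriteFailed("store rejected the write")


degraded = read_timeline(broken_fetch, last_cached_page=[50, 40])
try:
    post_tweet(broken_write, "hello")
    write_error = None
except TweetWriteFailed as exc:
    write_error = str(exc)
print(degraded, write_error)
```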