10. Reading the home timeline with celebrities

Chapter 7’s home-timeline read assumes every tweet you should see has already been fanned out to your home_timeline:{user_id} list. Chapter 9 broke that assumption: celebrities don’t fan out. Their tweets live only in the per-author user:{user_id}:tweets cache that the fanout worker writes to. So the home-timeline read has to find out which of the accounts you follow are celebrities, pull their recent tweets at read time, and merge them with the pushed list.

Architecture

flowchart TD
Client(["Client"])
TLS["Timeline Service<br/>(Fargate)"]
MC[("Redis<br/>my_celebs:{user_id}")]
RPUSH[("Redis<br/>home_timeline:{user_id}<br/>(from chapter 7)")]
RCEL[("Redis<br/>user:{user_id}:tweets<br/>(per celeb)")]

Client -->|"GET /home_timeline"| TLS
TLS -->|"read set"| MC
TLS -->|"read range (pushed)"| RPUSH
TLS -->|"read newest per celeb"| RCEL
TLS -->|"merge + paginate"| Client

The my_celebs set

Every home-timeline read has to answer one question before it can merge: which of the accounts I follow are celebrities? Walking the full follow list and checking is_celebrity on each account is too expensive: a user following 2,000 accounts would pay for 2,000 cache lookups per timeline read.

Maintain it as a per-user Redis set:

Key:   my_celebs:{user_id}
Type:  Redis set
Value: user_ids of followed accounts where is_celebrity = true

Three writers keep it in sync, all driven off events the system already emits:

At read time, a single read of my_celebs:{me} returns every ID the merge needs.
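The cost difference between the two approaches can be sketched with in-memory stand-ins for the caches. This is a minimal illustration, not the service's code: the dictionaries `is_celebrity`, `follows`, and `my_celebs`, and both function names, are hypothetical, and in the real system each lookup would be a Redis round trip.

```python
from typing import Dict, Set, Tuple

# Hypothetical in-memory stand-ins for the caches (assumption: real reads hit Redis).
is_celebrity: Dict[int, bool] = {1: False, 2: True, 3: False, 4: True}
follows: Dict[int, Set[int]] = {100: {1, 2, 3, 4}}
my_celebs: Dict[str, Set[int]] = {"my_celebs:100": {2, 4}}


def celebs_by_walking(me: int) -> Tuple[Set[int], int]:
    """Naive approach: one is_celebrity lookup per followed account."""
    lookups = 0
    found: Set[int] = set()
    for followee in follows[me]:
        lookups += 1  # each check would be a separate cache lookup
        if is_celebrity[followee]:
            found.add(followee)
    return found, lookups


def celebs_from_set(me: int) -> Tuple[Set[int], int]:
    """Maintained-set approach: a single read returns the same IDs."""
    return set(my_celebs.get(f"my_celebs:{me}", set())), 1
```

Both functions return the same set of IDs. Only the number of lookups differs, and in the naive version it grows with the size of the follow list.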

The merge

my_celebs   = read_set("my_celebs:" + me)
home_timeline = merge(
  read_range("home_timeline:" + me, 0, 19),     // pushed
  union(celebrity.recent(c) for c in my_celebs) // pulled
)

celebrity.recent(c) reads the newest N entries from the per-celeb sorted set user:{c}:tweets, the cache the fanout worker populates in chapter 9. In steady state the read almost always hits Redis, so the hot DynamoDB partition rarely sees traffic.
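A rough sketch of that read path, under stated assumptions: the `cache` dictionary stands in for Redis, the `load_from_store` callback stands in for the DynamoDB read, and repopulating the cache on a miss is an assumption (the text only says the fanout worker populates it). Against a real Redis sorted set, "newest N" would correspond to a reverse range over the top N scores, for example ZREVRANGE key 0 N-1.

```python
from typing import Callable, Dict, List

# Stand-in for Redis: key -> {tweet_id: score}.
# Assumption: the score is the tweet ID itself, since the IDs are time-sortable.
cache: Dict[str, Dict[int, int]] = {}


def recent(
    celeb_id: int,
    n: int,
    load_from_store: Callable[[int, int], List[int]],
) -> List[int]:
    """Return the newest n tweet IDs for one celebrity, newest first."""
    key = f"user:{celeb_id}:tweets"
    entries = cache.get(key)
    if entries is None:
        # Cache miss: fall back to the durable store (hypothetical callback)
        # and repopulate, so later reads stay off the hot partition.
        ids = load_from_store(celeb_id, n)
        cache[key] = {tweet_id: tweet_id for tweet_id in ids}
        entries = cache[key]
    # Highest score first, truncated to n entries.
    return sorted(entries, key=entries.get, reverse=True)[:n]
```

Only the first read for a given celebrity reaches the fallback. Subsequent reads are served from the cache, which is the steady-state behaviour the paragraph above describes.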

Both sides are time-sortable (Snowflake IDs encode the timestamp), so cursor pagination across the merge is straightforward: the merged page’s next_cursor is just the oldest tweet ID returned, regardless of which side it came from. The next page request re-runs the merge for tweets older than that cursor, so there is no stitched per-side cursor state to carry between requests.
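The merge and its cursor can be sketched as follows. This is an illustration under assumptions, not the service's implementation: `merge_page` and its parameters are hypothetical names, each input list is assumed to be already sorted newest first, and tweet IDs are plain integers that sort by time. A real service would likely pass the cursor down into each Redis read instead of filtering after the merge.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Optional, Tuple


def merge_page(
    pushed: List[int],
    pulled: Iterable[List[int]],
    cursor: Optional[int] = None,
    page_size: int = 20,
) -> Tuple[List[int], Optional[int]]:
    """Merge newest-first ID lists into one page plus the next cursor."""
    sources = [pushed, *pulled]
    # heapq.merge(reverse=True) expects every input sorted largest first
    # and yields a single largest-first stream without sorting everything.
    merged = heapq.merge(*sources, reverse=True)
    # Keep only tweets strictly older than the cursor (smaller ID = older).
    older = (t for t in merged if cursor is None or t < cursor)
    page = list(islice(older, page_size))
    # The cursor is the oldest ID on the page, whichever side supplied it.
    next_cursor = page[-1] if page else None
    return page, next_cursor
```

A client pages by feeding each returned cursor into the next call:

```python
pushed = [90, 70, 50, 30]
pulled = [[80, 60], [40]]
page1, c1 = merge_page(pushed, pulled, None, 3)  # ([90, 80, 70], 70)
page2, c2 = merge_page(pushed, pulled, c1, 3)    # ([60, 50, 40], 40)
```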