1. Clarify the requirements

Functional requirements

Pick a small core. A safe set:

Post a tweet (text + optional image/video)
Follow / unfollow a user
Home timeline — tweets from people you follow, reverse-chronological

Three flows are enough to force every interesting tradeoff: read/write ratio, fanout vs. pull, the celebrity problem, and the data model. Explicitly defer the rest unless the interviewer pushes: likes, user timelines, search, trending, DMs, retweets/quotes, replies, hashtags, notifications, ads, analytics. Don’t pretend you’ll build all of it.

Non-functional requirements

These shape the architecture more than the features do.

Read-heavy: timeline reads far outnumber tweet writes, so the read path deserves the most attention.
Latency: home timeline loads in under 200 ms at p99.
Availability over consistency: stale timelines are fine; outages are not. Eventual consistency for the timeline is acceptable.
Durability: tweets must never be lost once acknowledged.
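
To make the latency target concrete, here is a small sketch of checking a p99 budget against measured timeline load times. The nearest-rank percentile method, the function names, and the sample values are all assumptions chosen for illustration; in practice these numbers usually come from a metrics system rather than a hand-rolled calculation.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def meets_latency_target(samples_ms, budget_ms=200, pct=99):
    """True if the pct-th percentile latency is strictly under the budget."""
    return percentile(samples_ms, pct) < budget_ms


# Hypothetical measurements: 98 fast loads and 2 slow ones out of 100.
latencies_ms = [50] * 98 + [450, 500]
p99_ms = percentile(latencies_ms, 99)
print(p99_ms, meets_latency_target(latencies_ms))  # 450 False
```

Two slow loads in a hundred are enough to miss the target here, even though the mean of these samples is only 58.5 ms. That is why a p99 goal is much stricter than an average-latency goal.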

Core entities

Before sizing or APIs, name the objects you’re modeling. Fields and storage come later (chapter 4) — for now, just the nouns:

User — the account: id, handle, profile.
Tweet — a post authored by a user, optionally with media.
Follow — a directed edge from follower to followee.
Home Timeline — a derived, ordered list of tweets from people a user follows.

Timeline isn’t a stored entity in the same sense as the others — it’s computed. Calling it out now is what makes the fanout discussion (chapter 6) meaningful later.
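
As a rough illustration of what "derived" means, the sketch below computes a home timeline on demand from the other three entities. The field names (`author_id`, `created_at`, and so on) and the `home_timeline` function are assumptions made for this example, not the data model from chapter 4, and the in-memory lists stand in for real storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    handle: str


@dataclass(frozen=True)
class Tweet:
    id: int
    author_id: int
    text: str
    created_at: int  # hypothetical field: seconds since epoch


@dataclass(frozen=True)
class Follow:
    follower_id: int  # the user who follows
    followee_id: int  # the user being followed


def home_timeline(user_id, follows, tweets, limit=20):
    """Derive a home timeline: newest-first tweets from the users user_id follows."""
    followees = {f.followee_id for f in follows if f.follower_id == user_id}
    visible = [t for t in tweets if t.author_id in followees]
    # Reverse-chronological; tweet id breaks ties between equal timestamps.
    visible.sort(key=lambda t: (t.created_at, t.id), reverse=True)
    return visible[:limit]


alice, bob, carol = User(1, "alice"), User(2, "bob"), User(3, "carol")
follows = [Follow(alice.id, bob.id), Follow(alice.id, carol.id)]
tweets = [
    Tweet(10, bob.id, "first", 100),
    Tweet(11, carol.id, "second", 200),
    Tweet(12, alice.id, "my own tweet", 300),
]
timeline_ids = [t.id for t in home_timeline(alice.id, follows, tweets)]
print(timeline_ids)  # [11, 10]
```

Nothing about the timeline is stored here: it is rebuilt from Follow and Tweet on every read. Whether to keep doing that or to precompute the list at write time is the fanout question.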

Core entities at a glance

flowchart TD
User(["User"])
Tweet(["Tweet"])
Follow(["Follow"])
HT(["Home Timeline<br/>(derived)"])

User -->|"posts"| Tweet
User -->|"creates"| Follow
Tweet -->|"feeds"| HT
Follow -->|"determines"| HT