Anchor numbers
Start with one concrete user count. Pick something defensible and derive everything from it:
- DAU: 300M
- Tweets per user per day: 0.5 (most people lurk)
- Timeline reads per user per day: 50
Everything else falls out.
Write QPS
300M users × 0.5 tweets/day = 150M tweets/day
150M / 86,400s ≈ 1,700 tweets/s average
Peak ≈ 3× average ≈ 5,000 tweets/s
Five thousand writes per second is small by modern datastore standards: a managed DynamoDB table, or a single well-tuned Postgres instance doing small-row inserts, can sustain it. The interesting problems are reads and storage, not writes.
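The write-side arithmetic can be reproduced in a few lines of Python. This is a minimal sketch using the anchor numbers above; the 3× peak factor is this article's rule of thumb, and the names (`write_qps`, `PEAK_FACTOR`) are illustrative, not part of any library.

```python
# Back-of-envelope write QPS from the anchor numbers.
SECONDS_PER_DAY = 86_400

DAU = 300_000_000                # daily active users
TWEETS_PER_USER_PER_DAY = 0.5    # most people lurk
PEAK_FACTOR = 3                  # assumption: peak is about 3x average


def write_qps(dau, tweets_per_user, peak_factor=PEAK_FACTOR):
    """Return (tweets per day, average tweets/s, peak tweets/s)."""
    per_day = dau * tweets_per_user
    avg = per_day / SECONDS_PER_DAY
    return per_day, avg, avg * peak_factor


per_day, avg, peak = write_qps(DAU, TWEETS_PER_USER_PER_DAY)
print(f"{per_day:,.0f} tweets/day, {avg:,.0f}/s average, {peak:,.0f}/s peak")
```

The unrounded results are about 1,736 tweets/s average and 5,208 tweets/s peak, which round to the 1,700 and 5,000 used here.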
Read QPS
300M × 50 reads/day = 15B reads/day
15B / 86,400 ≈ 175,000 reads/s average
Peak ≈ 500,000 reads/s
Half a million reads per second is the number that drives everything: caching, fanout, read replicas.
Read:write ≈ 100:1. Keep this in your head.
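The same calculation for reads, as a short Python sketch. The constants are the anchor numbers from this article; the 3× peak factor is the same rule-of-thumb assumption used for writes, and the variable names are illustrative.

```python
# Back-of-envelope read QPS and read:write ratio.
SECONDS_PER_DAY = 86_400

DAU = 300_000_000
READS_PER_USER_PER_DAY = 50
TWEETS_PER_USER_PER_DAY = 0.5
PEAK_FACTOR = 3                  # assumption: peak is about 3x average

reads_per_day = DAU * READS_PER_USER_PER_DAY
avg_read_qps = reads_per_day / SECONDS_PER_DAY
peak_read_qps = avg_read_qps * PEAK_FACTOR

# The ratio depends only on per-user behavior, so DAU cancels out.
read_write_ratio = READS_PER_USER_PER_DAY / TWEETS_PER_USER_PER_DAY

print(f"{reads_per_day:,} reads/day")
print(f"{avg_read_qps:,.0f} reads/s average, {peak_read_qps:,.0f} reads/s peak")
print(f"read:write = {read_write_ratio:.0f}:1")
```

Because the ratio is just 50 / 0.5, it holds at any user count: changing the DAU estimate scales both sides equally.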
Storage
Per tweet:
- Text: up to 280 chars ≈ 280 bytes (assuming mostly single-byte characters)
- Metadata (id, user_id, timestamps, view counter): ~300 bytes
- Total: ~600 bytes, round to 1 KB
150M tweets/day × 1 KB = 150 GB/day of tweet records (text plus metadata)
× 365 = ~55 TB/year
Manageable. But media dominates:
~10% of tweets have media, average 200 KB
15M media tweets/day × 200 KB = 3 TB/day
× 365 = ~1 PB/year
Media dwarfs text by ~20×. It needs a different storage path than tweet records.
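The storage estimate can be checked with a short script. This is a sketch under the article's assumptions (1 KB per tweet record, 10% of tweets carrying about 200 KB of media) and it uses decimal units (1 KB = 1,000 bytes), which is the convention the round numbers above imply. The variable names are illustrative.

```python
# Back-of-envelope storage growth, decimal units (1 KB = 1,000 bytes).
KB, GB, TB, PB = 1_000, 1_000**3, 1_000**4, 1_000**5

TWEETS_PER_DAY = 150_000_000
RECORD_BYTES = 1 * KB            # ~600 bytes of text + metadata, rounded up
MEDIA_FRACTION = 0.10            # assumption: ~10% of tweets have media
MEDIA_BYTES = 200 * KB           # assumption: average media size

text_per_day = TWEETS_PER_DAY * RECORD_BYTES
text_per_year = text_per_day * 365

media_per_day = TWEETS_PER_DAY * MEDIA_FRACTION * MEDIA_BYTES
media_per_year = media_per_day * 365

print(f"text:  {text_per_day / GB:.0f} GB/day, {text_per_year / TB:.1f} TB/year")
print(f"media: {media_per_day / TB:.1f} TB/day, {media_per_year / PB:.2f} PB/year")
print(f"media / text = {media_per_day / text_per_day:.0f}x")
```

Keeping the two fractions as named constants makes it easy to see how sensitive the media figure is: doubling either the media share or the average media size doubles the yearly total.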
Bandwidth
Outbound timeline traffic:
500K reads/s × 20 tweets/page × 1 KB ≈ 10 GB/s of text
Media served by the CDN adds far more (roughly 20× by the per-tweet numbers above, if every media item on a page is fetched), but that’s the CDN’s problem, not your origin’s.
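A quick sketch of the outbound text bandwidth, with a conversion to bits per second since network links are usually rated that way. The inputs are the article's peak read rate, page size, and 1 KB record size; the names and the Gbit/s conversion are added for illustration.

```python
# Back-of-envelope outbound bandwidth for timeline text at peak.
PEAK_READ_QPS = 500_000          # peak timeline reads per second
TWEETS_PER_PAGE = 20
RECORD_BYTES = 1_000             # 1 KB per tweet record (decimal units)

bytes_per_second = PEAK_READ_QPS * TWEETS_PER_PAGE * RECORD_BYTES
gbyte_per_second = bytes_per_second / 1_000**3
gbit_per_second = bytes_per_second * 8 / 1_000**3

print(f"{gbyte_per_second:.0f} GB/s = {gbit_per_second:.0f} Gbit/s of text")
```

This counts uncompressed payload only; response compression and protocol overhead would move the real figure in opposite directions, and neither is modeled here.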
Numbers at a glance
flowchart TD
    DAU["300M DAU"]
    W["~5K writes/s tweet creation"]
    R["~500K reads/s home timeline"]
    S1["~55 TB/yr text storage"]
    S2["~1 PB/yr media storage"]
    BW["~10 GB/s outbound text"]
    DAU --> W & R
    W --> S1 & S2
    R --> BW