3. Posting a tweet

Architecture

flowchart TD
Client(["Client"])
GW["API Gateway<br/>(AWS API Gateway)<br/>rate limiting"]
TS["Tweet Service<br/>(Fargate)<br/>mints Snowflake ID"]
DB[("DynamoDB<br/>tweets")]

Client -->|"POST /v1/tweets"| GW
GW --> TS
TS -->|"PutItem"| DB
TS -->|"201 + tweet_id"| Client

The endpoint

POST /v1/tweets
Idempotency-Key: 7f3a...
{
  "text": "hello",
  "media_ids": ["m_abc", "m_def"]
}
201 { "id": "t_123", "created_at": "..." }

A few things to call out:
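
The Idempotency-Key header is worth a closer look. The text does not say how the service uses it; a common approach is to remember the response for each (user, key) pair and replay it when the same key arrives again, so a client that times out and resends does not create a duplicate tweet. The sketch below illustrates that idea only. `IdempotencyStore` and `run_once` are hypothetical names, and the in-memory dict stands in for whatever shared store with expiry a real deployment would need.

```python
class IdempotencyStore:
    """In-memory stand-in for a shared idempotency store (an assumption)."""

    def __init__(self):
        # Maps (user_id, idempotency_key) -> the response already returned.
        self._responses = {}

    def run_once(self, user_id, key, handler):
        """Run handler() for a new key; replay the stored response on a retry."""
        cache_key = (user_id, key)
        if cache_key in self._responses:
            return self._responses[cache_key]
        response = handler()
        self._responses[cache_key] = response
        return response
```

A retry with the same key gets the original response back without invoking the handler a second time. This version is not safe across processes or under concurrent retries; that would need an atomic conditional write in the shared store.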

Generating the tweet ID

The Tweet Service mints a Snowflake ID. A Snowflake is a 64-bit integer: [timestamp | machine_id | sequence].

Three properties matter:

  1. Globally unique without coordination: each Tweet Service instance generates its own IDs, kept distinct by its machine_id.
  2. Roughly time-sortable: the timestamp occupies the high bits, so ORDER BY id DESC ≈ ORDER BY created_at DESC. Cursor pagination becomes trivial (covered in the home timeline chapter).
  3. Compact: 8 bytes vs 16 for a UUID, which matters when timeline rows reference millions of them.

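Don’t expose internal IDs as auto-increment integers. Put Snowflakes on the wire as opaque strings (the "t_123" in the response above), which also avoids precision loss in JSON clients that parse numbers as 64-bit floats.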
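
To make the [timestamp | machine_id | sequence] layout concrete, here is a minimal generator sketch. The text gives only the field order, so the bit widths (41/10/12) and the epoch are assumptions borrowed from the original Twitter Snowflake design, and `SnowflakeGenerator`, `to_wire`, and `timestamp_ms` are hypothetical names. The "t_" prefix follows the "t_123" example in the response above.

```python
import threading
import time

# Assumed layout: 41-bit timestamp | 10-bit machine_id | 12-bit sequence.
MACHINE_BITS = 10
SEQUENCE_BITS = 12
EPOCH_MS = 1_288_834_974_657  # custom epoch in ms; an assumption

MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, machine_id, clock_ms=None):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id out of range")
        self._machine_id = machine_id
        # Injectable clock so the generator can be tested deterministically.
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self._machine_id << SEQUENCE_BITS)
                | self._sequence
            )


def to_wire(snowflake):
    """Opaque string form for API responses."""
    return f"t_{snowflake}"


def timestamp_ms(snowflake):
    """Recover the creation time (ms since the Unix epoch) from an ID."""
    return (snowflake >> (MACHINE_BITS + SEQUENCE_BITS)) + EPOCH_MS
```

Because the timestamp sits in the high bits, comparing two IDs as integers compares their creation times first, with machine_id and sequence only breaking ties within the same millisecond. That is why the ordering is "roughly" rather than strictly chronological across machines.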

Where the tweet lands

DynamoDB. One table:

tweets
  partition key: user_id
  sort key:      tweet_id        -- Snowflake, time-sortable
  attributes:    text, media_ids, created_at

The only read pattern in scope is “get a user’s recent tweets” — needed if the home timeline pulls from followees at read time (chapter 6). Partitioning by user_id with tweet_id as the sort key makes that a single partition read in reverse sort order.
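
As a sketch of that read, the function below builds the parameters for a DynamoDB Query against the table above. It only constructs the request; the dict has the shape the low-level Query API expects, for example when passed as keyword arguments to a boto3 client's `query` call. The function name is hypothetical, and storing user_id as a String and tweet_id as a Number (so the sort key orders numerically) is an assumption the text does not spell out.

```python
def recent_tweets_query(user_id, limit=20, before_tweet_id=None):
    """Build Query parameters for a user's newest tweets, newest first."""
    params = {
        "TableName": "tweets",
        # Pin the query to a single partition.
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
        # False = descending sort-key order, i.e. highest tweet_id first.
        "ScanIndexForward": False,
        "Limit": limit,
    }
    if before_tweet_id is not None:
        # Cursor: only tweets older than the last one the client has seen.
        params["KeyConditionExpression"] += " AND tweet_id < :cursor"
        params["ExpressionAttributeValues"][":cursor"] = {"N": str(before_tweet_id)}
    return params
```

Passing the last tweet_id of one page as `before_tweet_id` yields the next page, which is the cursor pagination the Snowflake ordering makes possible.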

Why DynamoDB over sharded Postgres

55 TB/year of text means you’re sharding from day one either way, so the question is who runs the shards. Sharded Postgres is defensible: you keep ordinary per-shard secondary indexes and ad hoc queries rather than designing each new access pattern around a GSI, and you get per-shard transactions for things like “insert tweet + bump users.tweet_count” when the user’s row lives on the same shard.

The catch is operational. Running sharded Postgres yourself means you own the shard map, the routing layer, resharding (consistent hashing or virtual shards), per-shard primary/replica failover, backups per shard, schema migrations applied N times, and hot-shard rebalancing when a celebrity blows up one node. DynamoDB hands you all of that as a managed service: partitions split automatically, replication is built in (each write is replicated across three AZs and acknowledged once a quorum has persisted it), and there’s no failover for you to run. Unless you have a strong reason to run your own database, that ops bill alone usually decides it.

What the user actually sees

The synchronous path is short:

  1. API gateway → Tweet Service.
  2. Tweet Service mints Snowflake ID, writes to DynamoDB.
  3. Returns 201 with the new tweet.

That’s it. The client already has the full tweet — it came back in the 201 response — so it prepends it to the local view optimistically. No re-read against DynamoDB on the post path, which sidesteps the eventual-consistency window on the tweets table. Subsequent profile loads read by user_id from the tweets table directly. Followers seeing it in their home timeline is a separate problem — fanout, two chapters from now.
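
The synchronous path above can be sketched as a single handler. This is an illustration under assumptions, not the service's actual code: `post_tweet` and `build_put_item` are hypothetical names, the ID minter and the DynamoDB write are injected as plain callables so the example runs without AWS, and the attribute types (String for user_id, Number for tweet_id) are assumed.

```python
from datetime import datetime, timezone


def build_put_item(user_id, tweet_id, text, media_ids, created_at):
    """Build PutItem parameters in DynamoDB's attribute-value format."""
    item = {
        "user_id": {"S": user_id},          # partition key
        "tweet_id": {"N": str(tweet_id)},   # sort key (Snowflake)
        "text": {"S": text},
        "created_at": {"S": created_at},
    }
    if media_ids:
        item["media_ids"] = {"L": [{"S": m} for m in media_ids]}
    return {"TableName": "tweets", "Item": item}


def post_tweet(user_id, body, mint_id, put_item, now=None):
    """Mint an ID, write the tweet once, and return (status, response body)."""
    now = now or datetime.now(timezone.utc)
    tweet_id = mint_id()
    created_at = now.isoformat()
    media_ids = body.get("media_ids", [])

    # The only storage call on the post path: one write, no read-back.
    put_item(build_put_item(user_id, tweet_id, body["text"], media_ids, created_at))

    # Return the full tweet so the client can render it without re-reading.
    return 201, {
        "id": f"t_{tweet_id}",
        "created_at": created_at,
        "text": body["text"],
        "media_ids": media_ids,
    }
```

The response is assembled from values the handler already holds, so nothing on this path depends on how quickly the write becomes visible to readers.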