7. Presence: online, last seen, and typing

Presence is everything that lives for seconds: whether a user is online right now, when they were last seen, and whether they are currently typing in a chat. None of it belongs in the message store. None of it should fan out to a user’s entire contact list on every state change. Build it as its own thin system.

Online and last seen

Architecture

flowchart TD
GW["Edge Gateway<br/>(EC2, WebSocket)"]
PS["Presence Service<br/>(Fargate)"]
Cache[("Presence Store<br/>(Redis)<br/>hash per user, TTL")]
Subs[("Subscriptions<br/>(Redis)<br/>set per target, TTL")]
GW2["Other Gateways<br/>(EC2, WebSocket)"]

GW -->|"① connect / heartbeat"| PS
PS -->|"② set online, TTL"| Cache
PS -->|"③ lookup subscribers"| Subs
PS -->|"④ push to viewers"| GW2

When a client connects to a gateway, the gateway tells the Presence Service that the user is online. The Presence Service writes user_id → { state: online, ts: now } to Redis as a hash per user (presence:{user_id}) with a short TTL, say 60 seconds. The gateway sends a heartbeat every ~30 seconds to refresh the TTL, so a single late or lost heartbeat does not flip the user offline. If the heartbeats stop (the socket dies, the network drops), the key expires and the user is implicitly offline; last_seen is the last heartbeat timestamp.
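The heartbeat-and-expiry behaviour above can be sketched with a small in-memory stand-in for the Redis hash. This is a minimal illustration, not the service's actual code: the class and method names are hypothetical, the clock is injectable only so the expiry can be exercised without waiting, and keeping last_seen readable after the key expires is outside the sketch.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional

PRESENCE_TTL_S = 60.0        # "say 60 seconds" in the text
HEARTBEAT_INTERVAL_S = 30.0  # "~30 seconds" in the text


@dataclass
class PresenceEntry:
    state: str
    ts: float          # time of the most recent heartbeat
    expires_at: float  # emulates the TTL on the Redis key


class PresenceStore:
    """Hypothetical in-memory stand-in for the hash presence:{user_id}."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._entries: Dict[str, PresenceEntry] = {}

    def heartbeat(self, user_id: str) -> None:
        # Connect and heartbeat are the same write: set the fields, reset the TTL.
        # Against Redis this would be roughly an HSET followed by an EXPIRE.
        now = self._clock()
        self._entries[user_id] = PresenceEntry("online", now, now + PRESENCE_TTL_S)

    def get(self, user_id: str) -> Optional[PresenceEntry]:
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        if self._clock() >= entry.expires_at:
            # The key has expired: the user is implicitly offline.
            del self._entries[user_id]
            return None
        return entry

    def is_online(self, user_id: str) -> bool:
        return self.get(user_id) is not None
```

Nothing in the sketch ever writes "offline". Going offline is the absence of a refresh, which is why a crashed gateway or a dead socket needs no cleanup path.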

No durable write happens for going online or offline. Presence is ephemeral by definition — if the data centre cold-starts, every user is briefly “offline” until their next heartbeat, and that’s correct.

The subscription model

Naive implementation: when user A goes online, push that fact to every contact who has A in their address book. For a heavy user with thousands of contacts, every connect/disconnect storms the network. Don’t do this.

The right model is pull-on-view, then subscribe. When user B opens A’s chat, B’s client tells the Presence Service “subscribe me to A.” The service:

Returns A’s current presence on the spot.
Records B → A in Redis as a set per target (subs:{A} → {B, …}) with a TTL refreshed while B has the chat open.
Whenever A’s state changes, the service reads the set and pushes STATUS frames only to those members.

When B leaves A’s chat, the client sends an unsubscribe and the entry is removed. If the unsubscribe never arrives, the TTL removes the entry anyway. A user’s contact list is potentially huge; the set of people actively staring at their chat is small, usually one or two people.
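A rough sketch of the subscribe, push, and unsubscribe flow follows. It is an in-memory model under stated assumptions: SUB_TTL_S is an invented value (the text only says the TTL is refreshed while the chat is open), expiry is tracked per viewer for simplicity, and the push callback is a hypothetical hook standing in for the gateway that holds the viewer's socket.

```python
import time
from typing import Callable, Dict, List

SUB_TTL_S = 60.0  # assumed value, not taken from the text

Frame = Dict[str, str]


class SubscriptionRegistry:
    """Hypothetical in-memory stand-in for the sets subs:{target}."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._subs: Dict[str, Dict[str, float]] = {}  # target -> {viewer: expires_at}

    def subscribe(self, viewer: str, target: str) -> None:
        # Also serves as the refresh while the chat stays open.
        self._subs.setdefault(target, {})[viewer] = self._clock() + SUB_TTL_S

    def unsubscribe(self, viewer: str, target: str) -> None:
        self._subs.get(target, {}).pop(viewer, None)

    def viewers_of(self, target: str) -> List[str]:
        now = self._clock()
        live = {v: exp for v, exp in self._subs.get(target, {}).items() if exp > now}
        if live:
            self._subs[target] = live      # forget expired viewers
        else:
            self._subs.pop(target, None)
        return sorted(live)


class PresenceService:
    def __init__(self, registry: SubscriptionRegistry,
                 push: Callable[[str, Frame], None]) -> None:
        self._registry = registry
        self._push = push                  # hypothetical gateway hook
        self._state: Dict[str, str] = {}

    def subscribe(self, viewer: str, target: str) -> str:
        self._registry.subscribe(viewer, target)
        return self._state.get(target, "offline")  # answer on the spot

    def unsubscribe(self, viewer: str, target: str) -> None:
        self._registry.unsubscribe(viewer, target)

    def set_state(self, user: str, state: str) -> int:
        """Record a state change and push STATUS only to current viewers."""
        if self._state.get(user) == state:
            return 0                       # no change, nothing to push
        self._state[user] = state
        viewers = self._registry.viewers_of(user)
        for viewer in viewers:
            self._push(viewer, {"type": "STATUS", "user": user, "state": state})
        return len(viewers)
```

The cost of a state change is proportional to the number of open chat views, not to the size of the contact list, which is the point of the model.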

This is the same logic that makes typing indicators tractable.

Typing

Architecture

flowchart TD
Client_A(["Client A"])
GW_A["Edge Gateway A<br/>(EC2, WebSocket)"]
PS["Presence Service<br/>(Fargate)"]
Subs[("Subscriptions<br/>(Redis)")]
GW_B["Edge Gateway B<br/>(EC2, WebSocket)"]
Client_B(["Client B"])

Client_A -->|"① TYPING<br/>(every ~5s)"| GW_A
GW_A --> PS
PS -->|"② lookup subs"| Subs
PS -->|"③ push"| GW_B
GW_B --> Client_B

The client sends a TYPING frame when the user starts typing in a chat, and re-sends it about every 5 seconds while typing continues. The receiver shows “typing…” as long as the most recent TYPING frame arrived less than the timeout (~6s) ago.
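On the receiving client, the rule reduces to remembering one timestamp per (chat, user) pair and comparing it with the timeout. The sketch below assumes hypothetical names (TypingTracker, on_typing_frame, is_typing) and an injectable clock so the timeout can be checked without sleeping.

```python
import time
from typing import Callable, Dict, Tuple

TYPING_TIMEOUT_S = 6.0  # "~6s" in the text


class TypingTracker:
    """Receiver-side state: the arrival time of the latest TYPING frame."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_frame: Dict[Tuple[str, str], float] = {}

    def on_typing_frame(self, chat_id: str, user_id: str) -> None:
        # Every frame, first or refresh, just overwrites the timestamp.
        self._last_frame[(chat_id, user_id)] = self._clock()

    def is_typing(self, chat_id: str, user_id: str) -> bool:
        last = self._last_frame.get((chat_id, user_id))
        if last is None:
            return False
        # The indicator turns itself off once the latest frame is too old.
        return self._clock() - last < TYPING_TIMEOUT_S
```

The UI would poll is_typing on its render tick or schedule a timer for the expiry; either way no explicit "stop" signal is required to clear the indicator.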

There is no STOPPED_TYPING event. Two reasons:

Networks lose packets. A stop event that doesn’t arrive leaves the indicator stuck on. A timeout based on the last seen TYPING event self-heals.
Less wire traffic. No paired start/stop per keystroke burst.

Typing events are not stored anywhere. They go through the Presence Service, fan out to subscribers, and are forgotten.

Why this is its own system

Three things separate presence from messaging:

Ephemerality. Nothing is durable. The store is in-memory with TTLs. No write-ahead log.
Fanout shape. Messages go to a fixed recipient set (one person or a group). Presence goes to a dynamic, narrow set of current viewers.
Loss tolerance. A dropped TYPING frame is invisible. A dropped message is a bug. Different SLAs allow a much cheaper transport — best-effort, no pending queue, no retries.
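The difference in transport cost can be made concrete with two toy senders. Both classes are illustrative assumptions: the message path is not specified in this section, so AtLeastOnceSender only stands for the general shape of a path that cannot lose frames (pending state, acknowledgements, retries), and send is a hypothetical callback that returns False when the socket write fails.

```python
from typing import Callable, Dict, Tuple

Frame = Dict[str, str]
Send = Callable[[str, Frame], bool]  # hypothetical: False means the write failed


class BestEffortSender:
    """Presence path: one attempt, nothing remembered."""

    def __init__(self, send: Send) -> None:
        self._send = send

    def push(self, user_id: str, frame: Frame) -> bool:
        # A failed write is simply dropped; the next heartbeat or TYPING
        # frame carries fresher state anyway.
        return self._send(user_id, frame)


class AtLeastOnceSender:
    """Assumed shape of a message path: keep frames until acknowledged."""

    def __init__(self, send: Send) -> None:
        self._send = send
        self.pending: Dict[str, Tuple[str, Frame]] = {}

    def push(self, msg_id: str, user_id: str, frame: Frame) -> None:
        self.pending[msg_id] = (user_id, frame)  # state that must be kept
        self._send(user_id, frame)

    def ack(self, msg_id: str) -> None:
        self.pending.pop(msg_id, None)

    def retry_pending(self) -> int:
        # Re-send everything that has not been acknowledged yet.
        for user_id, frame in list(self.pending.values()):
            self._send(user_id, frame)
        return len(self.pending)
```

The best-effort sender holds no per-frame state at all, so its memory use and failure handling stay constant regardless of how many frames are lost.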

Trying to share infrastructure with the message path drags presence into a durability and consistency model it doesn’t need, and inflates the cost of a feature that should be cheap.