API Design

Design Case Studies · Lesson 13

Design: Social Feed API

Every tweet from a 50-million-follower celebrity has to land in 50 million inboxes. How you route that write — and when you give up on pre-computing it — is the central architectural decision in the social feed problem.

⏱ ~22 min advanced Prereq: cursor pagination, caching

By the end you'll be able to

1 — Requirements

Twitter-style systems look deceptively simple from the outside: post a message, follow people, read a feed. The constraints underneath are what make this a canonical distributed systems case study.

Functional requirements

Non-functional requirements

2 — Design decisions

The fan-out question

Think of a newspaper printing plant. One strategy: at press time, print a personalised edition for every subscriber and drop it in their mailbox — so when they wake up, it's already there, no wait. That is fan-out on write. The other strategy: print one master edition, and at the newsstand, cut custom pages for each reader on demand. That is fan-out on read.

Fan-out on write is fast to read — the timeline cache is pre-populated, so a GET is a single cache lookup. But writing is expensive: every post for an account with N followers triggers N cache prepend operations. For a celebrity with 50 million followers, that is 50 million writes per tweet. The cache tier cannot absorb that synchronously without cascading failures.

Fan-out on read is cheap to write — the tweet lands in one place (the author's post list). But reading is expensive: to build a timeline for a user who follows 500 accounts, you fetch 500 post lists, merge them by timestamp, and paginate. That merge cost grows with the social graph, and p99 latency for users who follow many accounts can blow past the 100 ms budget.

Hybrid routing: the practical answer

Neither pure strategy survives contact with a real social graph. The production approach is a hybrid keyed on follower count:

This caps write amplification at O(followers) for accounts below the threshold — and at O(1) for accounts above it — while keeping read latency bounded by a small, predictable number of celebrity fetches per read.

Cursor pagination (as-03)

Timelines are live: new tweets arrive between page requests. Offset pagination breaks on live data — offset=20 on your second request skips 20 rows counting from the current head, which has shifted since your first request, producing gaps or duplicates. A cursor encodes an opaque position (typically a tweet ID or timestamp) that is stable regardless of insertions ahead of it. The server uses the cursor to resume from exactly where the client left off. Never expose offsets on a feed endpoint.

Timeline caching (rel-07)

Each user's precomputed timeline lives as a Redis list — a sorted sequence of tweet IDs. On fan-out write, the worker prepends to the list. The list is capped at the last N entries (e.g. 800) to bound memory per user. TTL is ~24 hours; a cold read (cache miss) reconstructs from the database, which is expensive and should be rare. A cache hit is the overwhelmingly common path and takes roughly 2 ms.

Eventual consistency is the right trade-off

Fan-out is async: a post enters a queue, and workers process followers in batches over the next few seconds. This means a follower's timeline may lag 1–2 seconds behind a real-time post. That lag is acceptable — a home feed is not a financial ledger, and no safety or legal consequence follows from seeing a tweet a second late. The trade-off buys dramatically lower write latency and vastly simpler fan-out infrastructure.

Fan-out on Write pre-populate caches at write time Write fan-out worker User A cache User B cache User C cache Read → single cache fetch ~2 ms cache hit Writes: O(followers) Reads: O(1) Risk: celebrity write storms Fan-out on Read merge post lists at read time A posts B posts C posts merge + sort timeline Write → one post list insert ~5 ms write Writes: O(1) Reads: O(followed accounts) Risk: slow reads on dense graphs
Left: fan-out on write pre-populates every follower's cache at write time — reads are instant but writes are expensive. Right: fan-out on read merges post lists at query time — writes are cheap but reads slow down as the social graph grows.
write arrives check follower count ≤ 10 000 fan out to followers' caches > 10 000 store in celebrity post list only at read time: merge precomputed + celebrity lists
The hybrid strategy routes writes by follower count. Normal users get immediate cache fan-out; celebrity posts skip the fan-out entirely and are merged into the timeline at query time. Both paths converge at the reader.
⚠️ The write-storm trap

A 50-million-follower account posting once would trigger 50 million cache prepend operations if you apply fan-out on write without the celebrity exemption. That is not a spike — it is a crash. The hybrid cutoff (~10k followers) is the gate that keeps the cache tier alive. The exact threshold is tunable; the principle is: set it where synchronous fan-out becomes unsafe for your infrastructure.

3 — The API model

Post a tweet

POST /v1/tweets
Authorization: Bearer <token>
Content-Type: application/json

{
  "text": "Cursor pagination prevents ghost items on live feeds.",
  "reply_to": "tweet_xxx"    // optional — omit for top-level post
}

HTTP/1.1 201 Created
{
  "id":         "tweet_abc123",
  "author_id":  "usr_42",
  "text":       "Cursor pagination prevents ghost items on live feeds.",
  "reply_to":   null,
  "created_at": "2025-11-14T09:04:00Z"
}

The 201 confirms the tweet is durably stored. Fan-out to followers happens asynchronously — the response does not wait for it.

Read the home timeline

GET /v1/timeline?cursor=&limit=20
Authorization: Bearer <token>

HTTP/1.1 200 OK
{
  "tweets": [
    {
      "id":         "tweet_abc123",
      "author_id":  "usr_42",
      "text":       "Cursor pagination prevents ghost items on live feeds.",
      "created_at": "2025-11-14T09:04:00Z"
    }
    // …up to 20 items
  ],
  "next_cursor": "eyJ0d2VldF9pZCI6InR3ZWV0X2FiYzEyMyJ9",
  "has_more":   true
}

# Next page — pass the opaque cursor, not an offset
GET /v1/timeline?cursor=eyJ0d2VldF9pZCI6InR3ZWV0X2FiYzEyMyJ9&limit=20

Follow a user

POST /v1/follows
Authorization: Bearer <token>
Content-Type: application/json

{
  "target_user_id": "usr_99"
}

HTTP/1.1 201 Created
{
  "follower_id":  "usr_42",
  "followee_id":  "usr_99",
  "followed_at":  "2025-11-14T09:10:00Z"
}

Unfollow a user

DELETE /v1/follows/usr_99
Authorization: Bearer <token>

HTTP/1.1 204 No Content
✅ Why DELETE on the followee ID, not a follow ID

A follow relationship is naturally identified by the (follower, followee) pair. There is no semantic ambiguity in DELETE /v1/follows/usr_99 — it means "I stop following usr_99." You could use a follow object with its own ID, but that forces the client to retain a follow ID they never asked for. The resource-oriented path is simpler and unambiguous.

4 — Evaluation & latency budget

Write amplification analysis

Scenario Strategy applied Cache writes triggered
User with 500 followers posts Fan-out on write 500 prepend operations
User with 8 000 followers posts Fan-out on write 8 000 prepend operations
Celebrity with 50M followers posts Celebrity exemption — no fan-out 1 post list insert

Without the celebrity exemption, a single post from the largest account would require 50 million cache writes. Even at 10 µs per write, that is 500 seconds of cumulative cache work — serialised. The exemption collapses this to a single write, deferring the merge cost to read time, where it is bounded and parallelisable.

Read latency budget — target p99 < 100 ms

Step Estimated latency Notes
Network round trip ~5 ms Intra-region; cached by CDN edge for public timelines
Cache lookup (precomputed timeline list) ~2 ms Redis LRANGE on a warm key
Serialisation + hydration ~5 ms Tweet IDs → full tweet objects from object store
Celebrity merge (if any) ~20 ms One GET per followed celebrity — parallelised
Fast path total ~32 ms Well under the 100 ms budget
Cache miss fallback (DB reconstruction) 1–3 s Cold key only; should be <1% of reads

The real write-path bottleneck is fan-out queue lag. If the async worker falls behind, readers fall back to DB reconstruction for newer posts, spiking p99. Monitoring queue depth and consumer throughput is as important as monitoring cache hit rate.

🎯 Interview angle

Interviewers asking about Twitter architecture are testing whether you know to flip strategies at a threshold — not whether you can recite one pure approach. The three-part answer is: (1) name both strategies and their trade-offs; (2) state why neither survives a real social graph alone; (3) describe the hybrid cutoff and what happens at read time. Bonus: explain that the cutoff is tunable and mention the cache list length cap (e.g. 800 entries) as a memory bound. The cap matters — without it, a follower who never reads accumulates an unbounded list.

⚠️ Offset pagination on a live feed

GET /v1/timeline?offset=20 on a feed where new posts arrive constantly is a reliability hole. Between your first and second requests, 5 new posts arrive. Your offset=20 now skips items 16–20 from the original head — and you see duplicates from the tail that has now shifted. Cursor pagination encodes a stable position (the last tweet ID seen) that is unaffected by new inserts at the head. Always use cursors on timelines; reserve offsets for static, append-only, low-volume datasets.

✍️ Exercise: diagnose a slow timeline

Scenario: your timeline endpoint's p99 is 420 ms — well over the 100 ms target. Describe two distinct investigations you would run before blaming the database.

Model answer:

Rubric: a good answer names at least one cache metric and one write-path metric. Jumping to "the DB is slow" without checking cache hit rate first is the failure mode this exercise tests for.

Under the hood: the core mechanism

The timeline cache is a Redis list — an ordered sequence of tweet IDs, prepended on every fan-out write. The list is capped at a fixed length (e.g. 800 entries) by trimming older entries after each insert. Reads call LRANGE key 0 N-1 to get tweet IDs, then hydrate them from a separate tweet object store in one batch fetch.

Timeline cache structure

# Redis list per user, key: timeline:{user_id}
# Most-recent tweet ID is always at index 0 (head of list)
LPUSH timeline:usr_99  tweet_new           # prepend on fan-out
LTRIM timeline:usr_99  0 799               # cap at 800 entries

LRANGE timeline:usr_99 0 19                # read page 1 (first 20 IDs)

# Structure after several fan-out writes:
# index 0 → tweet_new  (most recent)
# index 1 → tweet_abc
# index 2 → tweet_xyz
# ...       (up to index 799, then trimmed)

Write-amplification worked example

A celebrity with 1 million followers posts a tweet. Under pure fan-out on write, here is exactly what happens — and why it cannot be done synchronously:

StepOperationCountTime @ 10 µs/write
1Insert tweet into author's post list1 write0.01 ms
2Fan-out worker reads follower list in pages of 10001 000 reads~10 ms
3LPUSH + LTRIM per follower timeline cache1 000 000 writes10 000 ms = 10 s
4Total cache writes for one celebrity post1 000 001≫ 5 s wall-clock (parallelised but still catastrophic)

Even distributing fan-out across 100 worker threads, each handling 10 000 followers, that is still 100 000 Redis writes per thread — a sustained write storm that monopolises the cache tier. The celebrity exemption converts this from 1 M writes to 1 write (into the celebrity's own post list), deferring the merge to read time.

Hybrid read-time merge — concrete trace

User usr_99 follows 300 normal accounts and 4 celebrities (each > 10 000 followers). A GET /v1/timeline request executes these steps in parallel:

# Step 1: fetch precomputed timeline (normal fan-out accounts only)
LRANGE timeline:usr_99 0 49          # returns up to 50 tweet IDs, ~2 ms

# Step 2: fetch recent posts from each celebrity (parallel, not sequential)
LRANGE posts:celeb_a   0 49          # celebrity A's own post list
LRANGE posts:celeb_b   0 49
LRANGE posts:celeb_c   0 49
LRANGE posts:celeb_d   0 49          # total ~20 ms for 4 parallel fetches

# Step 3: hydrate tweet IDs → full tweet objects (batch GET from object store)
# ~5 ms

# Step 4: merge + sort by created_at descending, return top 20
# merge of ≤250 items in memory: <0.1 ms

# Total: ~27 ms → well within 100 ms p99 budget
Precomputed cache LRANGE timeline:usr_99 celeb_a celeb_b celeb_c celeb_d celebrity post lists (parallel) Timeline Service (merge) Sorted response top 20 by created_at ~2 ms ~20 ms (parallel) total ~27 ms
At read time, the timeline service fetches the precomputed cache (normal accounts) and celebrity post lists in parallel, then merges and sorts. The precomputed path is ~2 ms; celebrity merges add ~20 ms — well inside the 100 ms p99 budget.

Operating & debugging it

In production, two metrics tell you everything about the health of the timeline system: cache hit rate and fan-out queue lag. A cache hit rate drop below 99 % means readers are falling through to expensive DB reconstruction. Queue lag above ~5 seconds means recent posts aren't appearing in timelines — users notice this as "tweets missing." Both problems look like slow timelines but have completely different root causes.

$ redis-cli INFO keyspace db0:keys=47382910,expires=44231888,avg_ttl=72400000 # Large gap between keys and expires? Many timelines have no TTL → unbounded memory growth. $ redis-cli DEBUG OBJECT timeline:usr_99 Value at: 0x7f... refcount:1 encoding:listpack serializedlength:412 lru:... type:list # encoding:listpack → list has ≤128 entries. encoding:quicklist → larger, expected for active users. $ redis-cli LLEN timeline:usr_99 800 # 800 = at the cap. LTRIM is running correctly. $ redis-cli LLEN timeline:usr_cold 0 # 0 = cold key. Next read will trigger DB reconstruction (expensive cold-read path).

To inspect the fan-out queue lag (assuming a Kafka-based fan-out worker):

$ kafka-consumer-groups.sh --describe --group fanout-workers --bootstrap-server kafka:9092 TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG tweet-events 0 8201934 8201940 6 tweet-events 1 7940211 7940218 7 # lag of 6–7 events = healthy. If lag grows to thousands, fan-out workers are falling behind.
SymptomLikely causeFix
Timeline p99 > 500 ms, cache hit rate dropsCache eviction under memory pressure; TTLs expiring; deployment flushed warm keysInspect Redis memory usage; increase memory or add replicas; warm caches on deploy
Recent tweets not appearing (lag 5–60 s)Fan-out worker queue depth growing; consumer group falling behindCheck queue consumer lag; scale fan-out workers; look for a hot partition blocking processing
Celebrity tweets never appearCelebrity's account not classified above threshold; post list key missing or emptyConfirm celebrity classification; verify LRANGE posts:celeb_X 0 9 returns entries; check timeline service is fetching celebrity lists at read time
Duplicate tweets in timelineFan-out worker processed same event twice (non-idempotent consumer)Make fan-out idempotent using tweet ID deduplication before LPUSH; check consumer group offset commits
Timeline always returns 0 items for new userNo precomputed cache exists yet; celebrity list fetch returns no resultsVerify follow graph was written; trigger a backfill; ensure timeline service falls back to DB reconstruction on empty key
Cursor pagination returns repeated or skipped tweetsCursor decodes to a tweet ID that was evicted from the list (trimmed beyond 800)Return empty results with has_more: false when the cursor position falls outside the list; document that deep pagination is not supported beyond the cache window
  1. Check cache hit rate first — a rate below 99 % almost always explains slow timelines before you investigate anything else.
  2. Check fan-out queue lag — a growing lag means fresh tweets are not reaching follower caches.
  3. Verify celebrity classification — use LLEN posts:celeb_X to confirm the post list exists and the timeline service is querying it.
  4. For memory growth, audit key TTLs: every timeline key should have a TTL (24 h); keys without TTLs from inactive users accumulate indefinitely.
  5. For duplicate tweets, trace a tweet ID through the fan-out worker log to confirm the idempotency check is running.

🧠 Quick check

1. A user with 80 000 followers posts a tweet. Under the hybrid strategy, what happens to their followers' precomputed timeline caches?

The hybrid cutoff (~10k followers) routes high-follower accounts away from write fan-out to avoid cache write storms. The tweet is stored once, in the celebrity's own post list, and merged into followers' timelines at read time.

2. Why does cursor pagination outperform offset pagination for a live timeline?

offset=5000 forces the DB to skip 5 000 rows each time. If a new tweet is inserted at the head between pages, all subsequent offsets shift — items appear twice or are skipped entirely. A cursor encodes a stable, insertion-independent position in the feed.

3. Eventual consistency is acceptable for home timeline reads because:

Unlike a payment ledger, a timeline lagging by a few seconds creates no financial or safety risk. The trade-off buys dramatically cheaper writes: async fan-out keeps write latency low and the fan-out worker decoupled from the request path.

4. What is the main risk if you pre-populate all followers' caches synchronously on every write, without a celebrity exemption?

Synchronous fan-out for a 50M-follower celebrity = 50M cache operations per tweet, in the critical path of the write request. This causes severe latency spikes and cache tier overload. The celebrity exemption converts this from a synchronous write storm to a deferred merge at read time.

Key takeaways

Sources & further reading