Design Case Studies · Lesson 04
Design: Pub/Sub API
A publish/subscribe system lets producers fire events without knowing who's listening — but the moment you promise delivery guarantees, fan-out, and ordered processing at scale, every easy decision becomes a trade-off.
By the end you'll be able to
- Define the topic/subscription model and explain fan-out semantics.
- Choose between webhook push and streaming pull delivery for a given subscriber profile.
- Explain at-least-once delivery, why it forces idempotent consumers, and how per-key ordering and dead-letter queues complete the picture.
1 · Requirements
An engineering team comes to you with this brief: "We need a messaging layer so services can broadcast events to whoever cares, without the sender knowing the recipients." Before touching a single endpoint, translate that into concrete requirements.
| Requirement | What it implies |
|---|---|
| Named topics | Logical channels — "order.placed", "payment.failed" |
| Subscriptions | Each subscriber registers interest in one topic |
| Fan-out | One message → N independent subscribers, each gets their own copy |
| Delivery guarantees | At-least-once (vs at-most-once vs exactly-once) |
| Scale | High-throughput publish, many parallel subscribers |
| Ordering | Per-key ordering (e.g. all events for the same order arrive in sequence) |
| Failure handling | Retries + dead-letter queue for unprocessable messages |
Notice what is not a requirement: exactly-once delivery. That is extremely hard to guarantee at the infrastructure layer without sacrificing throughput. Practical systems instead guarantee at-least-once and ask consumers to be idempotent — the far cheaper trade-off.
In an interview, state requirements out loud and get agreement before naming endpoints. A candidate who asks "do we need global ordering or per-key ordering?" shows they understand the cost difference — global ordering requires a single partition and kills throughput.
2 · Design decisions
2a. Topic and subscription model
Think of a topic as a radio station — it just broadcasts. A subscription is a satellite dish pointed at that station. The broadcaster (producer) never knows how many dishes are listening, and each dish receives an independent copy of every broadcast.
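The radio-station analogy can be made concrete with a minimal in-memory sketch. It is illustrative only and not how a production broker stores data: the `Topic` class, its method names, and the "analytics" subscription are assumptions made up for this example. It shows the producer publishing without naming recipients while each subscription receives its own copy.

```python
from collections import deque


class Topic:
    """A named channel that copies each published message to every subscription."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscriptions: dict[str, deque] = {}

    def subscribe(self, subscription_name: str) -> deque:
        # Each subscription owns its queue, so delivery state is independent.
        queue: deque = deque()
        self._subscriptions[subscription_name] = queue
        return queue

    def publish(self, message: dict) -> int:
        # The producer never names recipients; the topic does the fan-out.
        for queue in self._subscriptions.values():
            queue.append(dict(message))  # one shallow copy per subscription
        return len(self._subscriptions)


topic = Topic("order.placed")
billing = topic.subscribe("billing-handler")
analytics = topic.subscribe("analytics")  # hypothetical second subscriber

delivered_to = topic.publish({"orderId": "ord_123", "total": 49.95})

billing.popleft()  # billing consumes its copy
# analytics still holds its own untouched copy of the same message
```

Because the queues are separate objects, draining `billing` has no effect on `analytics`, which is the property the rest of the design relies on.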
2b. Delivery mode: webhook push vs streaming pull
How does the system get messages to subscribers? Two broad strategies:
| Strategy | How it works | Best for | Risk |
|---|---|---|---|
| Webhook push | System POSTs each message to the subscriber's HTTPS endpoint | Low-volume, real-time reactions (billing, notifications) | Subscriber downtime causes retry storms; requires an endpoint the broker can reach |
| Streaming pull | Subscriber opens a long-lived connection (SSE/gRPC stream) and messages are streamed down | High-throughput consumers inside a datacenter | Connection management complexity; back-pressure must be handled |
This design supports both modes. When creating a subscription the caller chooses. See Debugging Webhooks (dbg-04) for the full webhook failure taxonomy.
2c. At-least-once delivery and idempotent consumers
Imagine you're mailing a parcel and the post office promises: "We'll keep re-sending until you confirm receipt." That's at-least-once delivery — the consumer might get the same parcel twice if the first acknowledgement was lost in transit. The parcel itself is fine; the consumer just has to be smart enough not to unbox it twice.
At-least-once is the practical default because it needs only acknowledgements and retries, not distributed transactions. The alternative, exactly-once, demands two-phase-commit-style coordination across producer, broker, and consumer, which adds significant latency and complex failure modes. See Idempotency (rel-02) for how to build idempotent consumers, and Event-driven & Pub/Sub (rel-10) for broker-level patterns.
2d. Per-key ordering
Consider a shopping cart: the sequence "add item → remove item → checkout" must be processed in that order. Per-key ordering (routing all events for the same cart ID to the same partition) achieves this while still allowing massive parallelism — all other cart IDs proceed independently. Global ordering forces a single partition and is almost never worth the throughput cost.
2e. Retries and dead-letter queue
When a subscriber's webhook returns a 5xx or times out, the message must be retried — but naively retrying immediately will hammer a struggling service. The right pattern is exponential back-off with jitter. After N failed attempts, the message is moved to a dead-letter queue (DLQ): a separate topic that operators can inspect, replay, or discard without losing the original message. See Event-driven & Pub/Sub (rel-10) for detailed DLQ design.
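As a rough sketch of exponential back-off with jitter, the function below computes the wait before each retry. The function name, the `base` and `jitter` parameters, and the ±20% jitter width are assumptions for illustration; only `max_attempts` and `initial_interval_ms` mirror fields used in this lesson's retry policy.

```python
import random
from typing import Optional


def retry_delays(max_attempts: int, initial_interval_ms: int,
                 base: float = 2.0, jitter: float = 0.2,
                 rng: Optional[random.Random] = None) -> list:
    """Return the delay in ms before each retry (attempts 2..max_attempts)."""
    rng = rng or random.Random()
    delays = []
    for retry_index in range(max_attempts - 1):
        nominal = initial_interval_ms * (base ** retry_index)
        # Scale by a random factor in [1 - jitter, 1 + jitter] so that many
        # failed deliveries do not all retry at the same instant.
        factor = 1.0 + rng.uniform(-jitter, jitter)
        delays.append(nominal * factor)
    return delays


# Without jitter the schedule is the plain doubling sequence.
no_jitter = retry_delays(max_attempts=5, initial_interval_ms=1000, jitter=0.0)

# With jitter each delay lands near, but not exactly on, its nominal value.
with_jitter = retry_delays(max_attempts=5, initial_interval_ms=1000,
                           rng=random.Random(42))
```

With `max_attempts=5` there are four retries, so both lists have four entries. After the last one fails, the message would go to the dead-letter queue.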
3 · The API model
Core endpoints
# 1. Create a topic
POST /v1/topics
# Request
{
"name": "order.placed",
"retention_hours": 168,
"ordering_key": "orderId"
}
# Response 201
{
"id": "top_8xKq",
"name": "order.placed",
"created_at": "2026-06-20T09:00:00Z"
}
# 2. Publish a message
POST /v1/topics/top_8xKq/messages
# Request
{
"data": {
"orderId": "ord_123",
"total": 49.95,
"currency": "USD"
},
"ordering_key": "ord_123",
"idempotency_key": "pub_abc_20260620T090000"
}
# Response 202 — accepted for fan-out
{
"message_id": "msg_7tNv",
"status": "accepted"
}
# 3. Create a webhook subscription
POST /v1/subscriptions
# Request
{
"topic_id": "top_8xKq",
"name": "billing-handler",
"delivery": {
"type": "webhook",
"url": "https://billing.internal/hooks/orders",
"signing_secret": "whsec_..."
},
"retry_policy": {
"max_attempts": 5,
"backoff": "exponential",
"initial_interval_ms": 1000
},
"dead_letter_topic_id": "top_dlq_billing"
}
# Response 201
{
"id": "sub_4pLm",
"status": "active"
}
# 4. Signed webhook delivery (system → subscriber)
POST https://billing.internal/hooks/orders # called by the pub/sub system
X-PubSub-Signature: sha256=<hmac>
X-PubSub-Message-Id: msg_7tNv
X-PubSub-Attempt: 1
Content-Type: application/json
{
"topic": "order.placed",
"message_id": "msg_7tNv",
"ordering_key": "ord_123",
"published_at": "2026-06-20T09:00:01Z",
"data": { "orderId": "ord_123", "total": 49.95 }
}
# Subscriber must respond 200/204 to acknowledge.
# Any other response or timeout triggers retry with back-off.
# 5. Streaming pull subscription (SSE)
GET /v1/subscriptions/sub_4pLm/messages?max_messages=10
Accept: text/event-stream
# Server streams events:
data: {"message_id":"msg_7tNv","data":{...},"ack_token":"ack_zZp9"}
# Consumer must acknowledge:
POST /v1/subscriptions/sub_4pLm/acknowledge
{ "ack_tokens": ["ack_zZp9"] }
# → 204 No Content
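On the subscriber side, the signed delivery in step 4 implies a verification step before the payload is trusted. The sketch below assumes the signature is an HMAC-SHA256 of the raw request body, hex-encoded and prefixed with `sha256=`. The lesson only shows `sha256=<hmac>`, so the exact scheme, the function names, and the placeholder secret are all assumptions.

```python
import hashlib
import hmac


def sign(body: bytes, signing_secret: str) -> str:
    """Compute the header value the broker is assumed to send."""
    digest = hmac.new(signing_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_signature(body: bytes, header_value: str, signing_secret: str) -> bool:
    """Return True only if the header matches the HMAC of the raw body."""
    expected = sign(body, signing_secret)
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected, header_value)


secret = "whsec_example"  # placeholder, not a real secret
body = b'{"message_id":"msg_7tNv","data":{"orderId":"ord_123"}}'
header = sign(body, secret)  # stands in for the X-PubSub-Signature header

accepted = verify_signature(body, header, secret)
tampered = verify_signature(body + b" ", header, secret)
```

Verification should run against the raw bytes received, before any JSON parsing, since re-serialising the payload can change whitespace or key order and invalidate the signature.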
4 · Evaluation & latency budget
Delivery guarantees
At-least-once is the correct default. The table below shows why exactly-once is expensive:
| Guarantee | Mechanism | Typical overhead |
|---|---|---|
| At-most-once | Fire and forget — no retry on failure | Lowest; risk of message loss |
| At-least-once | Retry until ack; consumer must be idempotent | Low; ~10–50 ms broker overhead |
| Exactly-once | 2PC or transactional outbox; idempotent broker | High; 2–10× overhead; complex failure modes |
Consumer idempotency
The message_id field is the natural idempotency key. Consumers should record "I have processed msg_7tNv" in their own datastore (a deduplicated events table works well) and skip re-processing. See Idempotency (rel-02) for the full implementation pattern.
Webhook latency budget
| Segment | Budget | Note |
|---|---|---|
| Publish to broker | < 5 ms p99 | Async enqueue; durably written before 202 returned |
| Broker fan-out to subscription queue | < 10 ms p99 | In-memory copy per subscription |
| Webhook HTTP call (happy path) | < 200 ms | Subscriber SLA; system times out at 30 s |
| Total e2e (producer → subscriber ack) | < 300 ms p95 | Excludes retry attempts |
Interviewers love the question: "how do you guarantee exactly-once?" The right answer is "I don't — I guarantee at-least-once delivery and make consumers idempotent, because the cost of exactly-once across a distributed system is rarely worth it." Then explain the message_id deduplication pattern. That answer distinguishes you from candidates who say "just use Kafka exactly-once semantics" without understanding the overhead.
Common mistake: forgetting that fan-out means independent delivery state per subscription. A slow subscriber should not block a fast one. If you model this as a single queue that all subscribers read from, you've lost fan-out semantics: one stuck reader holds the cursor and starves the others. Each subscription needs its own cursor or queue.
✍️ Exercise: design the DLQ replay endpoint
An operator wants to replay all messages from the dead-letter queue for subscription sub_4pLm that failed between 2026-06-01 and 2026-06-20. Design the endpoint — HTTP method, path, request body, response, and the semantic questions you need to answer before implementing.
Model answer:
POST /v1/subscriptions/sub_4pLm/dlq/replay
{
"from": "2026-06-01T00:00:00Z",
"to": "2026-06-20T23:59:59Z",
"destination_subscription_id": "sub_4pLm" # replay to same or different sub
}
# Response 202 — async, potentially thousands of messages
{
"replay_id": "rpl_9Xk2",
"message_count": 47,
"status": "in_progress"
}
Semantic questions to answer before implementing: (1) Should replayed messages get a new message_id or preserve the original? (Original is better for idempotency.) (2) Should replay re-use the existing retry policy, or succeed/fail silently? (3) Does replaying clear the DLQ entry on success?
Rubric: ✓ async 202 (never replay synchronously — too slow) ✓ preserves original message_id for idempotency ✓ time-range filter ✓ raises semantic questions about DLQ clearing and retry policy.
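One way to picture the selection step behind such a replay endpoint is sketched below. The DLQ entry shape (`failed_at`, `partition`, `offset`) and the function name are assumptions for illustration; the model answer does not prescribe a storage format. The sketch filters by the time range, keeps the original `message_id`, and returns entries in offset order so per-key ordering survives the replay.

```python
from datetime import datetime, timezone


def select_for_replay(dlq_entries: list, start: datetime, end: datetime) -> list:
    """Pick DLQ entries that failed within [start, end], in partition/offset order."""
    in_range = [e for e in dlq_entries if start <= e["failed_at"] <= end]
    # Sorting by offset keeps per-key ordering intact during replay.
    return sorted(in_range, key=lambda e: (e["partition"], e["offset"]))


def june(day: int) -> datetime:
    return datetime(2026, 6, day, tzinfo=timezone.utc)


# Hypothetical DLQ contents, deliberately out of order.
dlq = [
    {"message_id": "msg_c", "partition": 2, "offset": 9, "failed_at": june(15)},
    {"message_id": "msg_a", "partition": 2, "offset": 7, "failed_at": june(3)},
    {"message_id": "msg_old", "partition": 2, "offset": 1,
     "failed_at": datetime(2026, 5, 28, tzinfo=timezone.utc)},
]

window_end = datetime(2026, 6, 20, 23, 59, 59, tzinfo=timezone.utc)
replay = select_for_replay(dlq, june(1), window_end)
replayed_ids = [entry["message_id"] for entry in replay]
```

The entry that failed in May falls outside the window and is left in the DLQ. The other two are replayed under their original identifiers.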
Under the hood: the core mechanism
A pub/sub broker is, at its core, a partitioned append-only log. Understanding what this means at the data structure level explains every delivery guarantee and ordering property in the system.
The topic as a partitioned log
A topic is divided into one or more partitions. Each partition is a durable append-only sequence of messages numbered from 0. Publishing a message appends it to the tail of one partition — the one selected by hashing the message's ordering_key. The sequence number within a partition is called the offset.
Partition assignment by ordering key
When a producer publishes with "ordering_key": "ord_123", the broker computes partition = hash(ordering_key) mod num_partitions. All messages for ord_123 land in the same partition, so they are appended and delivered in order. Messages for ord_456 may land in a different partition and proceed independently — that is how per-key ordering coexists with parallel throughput.
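The partition formula can be sketched in a few lines. The choice of SHA-256 as the hash is an assumption (the lesson does not name a hash function), so the partition numbers this produces will not necessarily match the ones used in the worked trace in this lesson.

```python
import hashlib


def partition_for(ordering_key: str, num_partitions: int) -> int:
    """Map an ordering key to a partition index in [0, num_partitions)."""
    # Python's built-in hash() is salted per process for strings, so a
    # stable digest is used to keep the mapping identical across restarts.
    digest = hashlib.sha256(ordering_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


keys = ["ord_123", "ord_456", "ord_123", "ord_789", "ord_123"]
assignments = [(key, partition_for(key, 4)) for key in keys]

# Every occurrence of the same key maps to the same partition.
partitions_for_ord_123 = {p for key, p in assignments if key == "ord_123"}
```

One consequence worth noting: changing `num_partitions` changes the mapping for most keys, so repartitioning a live topic needs care if ordering must be preserved across the change.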
Worked offset / redelivery trace
Follow one message through the complete lifecycle — publish, fan-out, ack, and at-least-once redelivery:
# T=0: Producer publishes
POST /v1/topics/top_8xKq/messages
{
"data": { "orderId": "ord_123" },
"ordering_key": "ord_123"
}
broker receives → hash("ord_123") % 4 = partition_2
append to partition_2: offset 7 # now the durable tail
→ 202 { "message_id": "msg_7tNv" }
# T=1ms: Broker fans out to each subscription's cursor queue
# Sub A (billing): P2 offset was 7 → enqueue delivery job for offset 7
# Sub B (analytics): P2 offset was 4 → Sub B is behind; msg_7tNv will be
# delivered when Sub B reaches offset 7
# T=10ms: Worker delivers to Sub A via webhook
POST https://billing.internal/hooks/orders
X-PubSub-Message-Id: msg_7tNv
{ "data": { "orderId": "ord_123" } }
# Case A — subscriber returns 200 OK:
broker receives ack → advance Sub A cursor: P2 offset 7 → 8
# msg_7tNv is never redelivered to Sub A
# Case B — subscriber returns 500 / times out:
broker does NOT advance cursor (offset stays at 7)
retry attempt 2 after 1 s (initial_interval_ms=1000)
retry attempt 3 after 2 s (exponential back-off, base=2)
retry attempt 4 after 4 s
retry attempt 5 after 8 s
# attempt 5 still fails → message_id msg_7tNv moved to DLQ topic
# Sub A cursor advances past offset 7 (message quarantined, not lost)
# Sub A continues delivering offset 8, 9, … unblocked
# Sub B is independent: still at offset 4, will eventually reach 7
# and make its own delivery attempt for msg_7tNv
The key insight: the broker never deletes the message until the retention window expires. An offset is just a pointer. Redelivery means re-reading the same log position. The subscriber's acknowledgement is what advances the pointer — not the delivery itself.
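The "offset is just a pointer" idea can be illustrated with a small in-memory model. It is a sketch only: the `PartitionLog` class and its method names are assumptions for this example, and durability, retention, and retry limits are left out.

```python
class PartitionLog:
    """Append-only list of messages; the list index is the offset."""

    def __init__(self) -> None:
        self.messages: list = []
        self.cursors: dict = {}  # subscription name -> next offset to deliver

    def append(self, message: dict) -> int:
        self.messages.append(message)
        return len(self.messages) - 1

    def next_for(self, subscription: str):
        offset = self.cursors.get(subscription, 0)
        if offset >= len(self.messages):
            return None  # subscription is caught up
        # Reading does not move the cursor, so an unacked message is read again.
        return offset, self.messages[offset]

    def ack(self, subscription: str, offset: int) -> None:
        # Only an ack for the current position advances the pointer.
        if offset == self.cursors.get(subscription, 0):
            self.cursors[subscription] = offset + 1


log = PartitionLog()
log.append({"message_id": "msg_a"})
log.append({"message_id": "msg_b"})

first_read = log.next_for("billing")
redelivered = log.next_for("billing")        # no ack yet: same offset again
log.ack("billing", first_read[0])
after_ack = log.next_for("billing")          # cursor moved to offset 1
analytics_view = log.next_for("analytics")   # independent cursor, still at 0
```

Nothing is ever removed from `messages` here. Acknowledging only moves one subscription's cursor, which is why a second subscription can still read offset 0.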
If a consumer reads a message, processes it (e.g. inserts a row in the database), and then the process crashes before it sends the ack, the broker re-delivers the same message. The consumer sees it again and processes it twice. The fix is idempotent processing: use message_id as a deduplication key and record "I have processed msg_7tNv" atomically alongside the business mutation — not in a separate step afterward. See Idempotency (rel-02).
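A minimal sketch of that atomic pattern, using SQLite from the Python standard library, might look like the following. The table names (`processed_messages`, `invoices`) and the `handle_message` function are assumptions for illustration. The point is that the dedup record and the business write share one transaction.

```python
import sqlite3


def handle_message(conn: sqlite3.Connection, message: dict) -> bool:
    """Apply the message once; return False if it was already processed."""
    try:
        with conn:  # one transaction: both inserts commit, or neither does
            conn.execute(
                "INSERT INTO processed_messages (message_id) VALUES (?)",
                (message["message_id"],),
            )
            conn.execute(
                "INSERT INTO invoices (order_id, total) VALUES (?, ?)",
                (message["data"]["orderId"], message["data"]["total"]),
            )
    except sqlite3.IntegrityError:
        # Primary-key conflict means a duplicate delivery: skip the side effect.
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE invoices (order_id TEXT, total REAL)")

msg = {"message_id": "msg_7tNv", "data": {"orderId": "ord_123", "total": 49.95}}
first = handle_message(conn, msg)
second = handle_message(conn, msg)  # simulated redelivery after a lost ack
invoice_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
```

In both cases the consumer can safely acknowledge the message afterwards: either the work was done just now, or it had already been done by an earlier delivery.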
Operating & debugging it
Most pub/sub problems are visible in two signals: subscription lag (the gap between the latest offset and the subscription's committed offset) and DLQ depth (messages that exhausted all retries). Both should be monitored as gauges with alerts.
Inspect a subscription's health
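As a hedged illustration of what a health check computes, the sketch below derives per-partition lag from two offset maps. The function name, the input shapes, and the sample numbers are assumptions. Lag is taken here as the log-end offset (the next offset to be written) minus the subscription's committed offset (the next offset to deliver).

```python
def subscription_lag(log_end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: messages appended but not yet acknowledged."""
    return {
        partition: log_end - committed_offsets.get(partition, 0)
        for partition, log_end in log_end_offsets.items()
    }


# Hypothetical snapshot for one subscription on a four-partition topic.
log_end = {0: 120, 1: 98, 2: 8, 3: 40}
committed = {0: 120, 1: 98, 2: 4, 3: 40}

lag = subscription_lag(log_end, committed)
total_lag = sum(lag.values())
lagging = [partition for partition, behind in lag.items() if behind > 0]
```

In this snapshot only partition 2 is behind, which would point at a problem with one group of ordering keys and not at the subscriber as a whole.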
Symptom → cause → fix
| Symptom | Likely cause | Fix |
|---|---|---|
| Subscription lag grows monotonically | Webhook endpoint is down, returning 5xx, or timing out consistently | Fix the subscriber endpoint; check /dlq for the last error; the lag will drain once the endpoint recovers (messages are still in the log) |
| DLQ depth keeps growing | Consumer has a persistent bug (parse error, missing field) that never succeeds regardless of retries | Inspect a DLQ message's payload and last_error; fix the consumer; replay the DLQ after fix |
| Consumer processes the same message twice | Consumer crashed between processing and ack, or webhook returned 200 but the broker didn't receive it (network drop) | Make the consumer idempotent using message_id as a dedup key |
| Messages for one ordering key arrive out of order | Consumer processes messages from a partition concurrently instead of in offset order, related messages were published with different ordering keys, or a DLQ replay bypassed ordering | Confirm related messages share one ordering key so they land on the same partition; process each partition in offset order; replay DLQ in offset order |
| Fast subscriber starved — other slow subscriber appears to block fan-out | Architecture error: subscriptions share a single queue/cursor instead of independent per-subscription state | Each subscription must maintain its own offset; a slow Sub B cannot hold back Sub A's offset |
| Webhook delivery rate collapses after restart | Retry storm: a backlog of retries from the offline period all fire simultaneously at restart | Use exponential back-off with jitter on retry scheduling; add a max-concurrency limit per subscription on the delivery workers |
- Check subscription lag per partition: a lag spike on a single partition points to an ordering-key group, not the whole subscription.
- Pull the most recent entry from the DLQ and read last_error. Connection refused vs 5xx vs 200-but-no-ack are three different problems.
- Verify the webhook endpoint manually: curl -X POST <webhook_url> with a sample payload and the signing header.
- Check that your consumer code commits the ack after processing completes, not before.
- After fixing the subscriber, use the DLQ replay endpoint to reprocess quarantined messages; preserve the original message_id for idempotent consumers.
🧠 Quick check
1. A subscriber processes the same event twice. Given pub/sub is typically at-least-once delivery, the correct fix is:
At-least-once means duplicates are expected. You can't make the network perfect, so the consumer must be idempotent — dedupe on the event id. (See the idempotency lesson.)
2. Why deliver events via webhook with retries and a dead-letter queue, instead of assuming a single delivery succeeds?
Subscribers go down and networks drop requests. Retries with backoff recover transient failures; a dead-letter queue captures what still can't be delivered for later inspection.
3. How does a pub/sub topic differ from a point-to-point queue?
A queue delivers each message to one consumer; a topic fans the same message out to all subscribers — the core of decoupled, one-to-many event delivery.
Key takeaways
- Topic = channel, subscription = independent cursor. Fan-out means each subscription gets its own copy; one slow subscriber cannot block another.
- Choose at-least-once delivery and design consumers to be idempotent — cheaper than exactly-once and easier to reason about.
- Support both webhook push (real-time, needs a broker-reachable endpoint) and streaming pull (high-throughput, internal); let callers choose at subscription creation.
- Per-key ordering gives ordered delivery for related events without sacrificing overall throughput to a single partition.
- Exponential back-off + dead-letter queue is the industry-standard failure handling pair — never retry immediately, never silently discard.