Production at Scale · Simulator 04

Queue & backpressure

This simulator models a bounded queue sitting between producers and a worker pool. Drag the producer rate above workers × per-worker throughput and watch the queue fill, consumer lag balloon, and backpressure start dropping messages. The same model underlies Kafka consumer lag, SQS queue depth, and any async pipeline that can be overwhelmed.

Line = queue depth over time, normalized to queue max. Red = queue at capacity (messages being dropped). Green = draining.

What's happening — the math

The queue grows whenever producers outpace consumers, and shrinks whenever consumers catch up. The key relationships:

# Consumer throughput (total msgs/s the worker pool can handle)
throughput   = workers × perWorker

# Net fill rate: positive means queue is growing
net          = producer − throughput

# Queue depth evolves each tick (clamped to [0, qmax])
depth(t+dt)  = clamp(depth(t) + net × dt, 0, qmax)

# Consumer lag: how far behind in time the consumers are
lag          = depth / max(1, throughput)   # seconds

# You need this many workers to keep up
min_workers  = ceil(producer / perWorker)

The fundamental rule: you need workers ≥ producer / perWorker to stop the queue from growing. At exactly that threshold the depth holds steady, but any backlog already in the queue never drains. Below it, the queue fills to capacity and backpressure (dropping or blocking) becomes unavoidable. Consumer lag is the leading indicator: it rises while the queue is still filling, before you start dropping.
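The relationships above can be sketched as a small tick-based simulation. This is a minimal illustration, not the simulator's actual code: the names `QueueState`, `step`, and `simulate` are hypothetical, and the choice to count overflow as dropped messages (rather than blocking the producer) is an assumption made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class QueueState:
    depth: float = 0.0    # messages currently queued
    dropped: float = 0.0  # cumulative messages rejected at capacity


def step(state, producer, workers, per_worker, qmax, dt):
    """Advance the model by dt seconds; returns (new_state, lag_seconds)."""
    throughput = workers * per_worker
    net = producer - throughput            # positive means the queue grows
    unclamped = state.depth + net * dt
    depth = min(max(unclamped, 0.0), qmax)
    # Assumed policy: anything that would exceed qmax is dropped.
    dropped = state.dropped + max(0.0, unclamped - qmax)
    lag = depth / max(1, throughput)       # seconds behind
    return QueueState(depth, dropped), lag


def min_workers(producer, per_worker):
    """Smallest worker count whose total throughput matches the producer."""
    return math.ceil(producer / per_worker)


def simulate(producer, workers, per_worker, qmax, dt=0.1, ticks=50):
    """Run the model from an empty queue; returns (depth, lag, dropped) per tick."""
    state = QueueState()
    history = []
    for _ in range(ticks):
        state, lag = step(state, producer, workers, per_worker, qmax, dt)
        history.append((state.depth, lag, state.dropped))
    return history


if __name__ == "__main__":
    # 30 k/s producer against 10 workers at 2 k/s each, qmax chosen arbitrarily.
    for i, (depth, lag, dropped) in enumerate(simulate(30_000, 10, 2_000, 20_000)):
        if i % 5 == 0:
            print(f"tick {i:2d}  depth={depth:8.0f}  lag={lag:.3f}s  dropped={dropped:8.0f}")
```

Running it with an overloaded pool shows lag climbing for several ticks while `dropped` is still zero, which is the sense in which lag leads drops in this model.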

✅ Try this

1. Set producer to 10 k/s, 10 workers at 2 k/s each → throughput = 20 k/s → queue drains.
2. Raise producer to 30 k/s → net = +10 k/s → watch the queue fill and lag climb.
3. Add workers until throughput exceeds producer again → the queue drains.
4. Set qmax very small (e.g. 1 k) with a fast producer → backpressure starts almost immediately.
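How quickly backpressure starts follows from the depth equation: with a constant positive net rate, an empty queue reaches capacity after roughly qmax / net seconds. The helper below is a hypothetical sketch of that arithmetic; `seconds_until_full` is not part of the simulator, and the 100 k value for qmax in the first call is an assumed figure, not one taken from the page.

```python
def seconds_until_full(producer, workers, per_worker, qmax, depth=0.0):
    """Seconds until the queue reaches qmax, or None if it is not growing."""
    net = producer - workers * per_worker
    if net <= 0:
        return None  # throughput keeps up, so the queue never fills
    return (qmax - depth) / net


if __name__ == "__main__":
    # Step 2 with an assumed qmax of 100 k: net = +10 k/s
    print(seconds_until_full(30_000, 10, 2_000, 100_000))
    # Step 4: same overload, qmax shrunk to 1 k
    print(seconds_until_full(30_000, 10, 2_000, 1_000))
    # Step 1: throughput exceeds producer
    print(seconds_until_full(10_000, 10, 2_000, 1_000))
```

With the same +10 k/s overload, shrinking qmax from 100 k to 1 k cuts the time to first drop by a factor of 100, which is why step 4 triggers backpressure almost at once.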

⚠️ Modeled, not measured

This is a first-principles model of a bounded queue, not a capture of any real message broker's behaviour. Real systems (Kafka, SQS, RabbitMQ) have rebalancing, partition assignment, fetch batching, and flow-control mechanisms that affect observed lag. The model shows the structural dynamics — treat numbers as illustrative.
