API Design

Reliability & Scale · Lesson 02

Idempotency

Networks drop packets; clients time out; retries happen. Idempotency is the property that makes retrying safe — so the second (and third, and tenth) attempt produces the same result as the first, with no extra side effects piling up.

⏱ 13 min Difficulty: core Prereq: HTTP methods, REST basics

What idempotent means, exactly

The word comes from Latin: idem (same) + potens (power). In mathematics, a function is idempotent if applying it multiple times has the same effect as applying it once: f(f(x)) = f(x). In HTTP, an operation is idempotent if the server-side state ends up identical whether you sent the request once or a hundred times.

The classic light-switch analogy breaks down here because "toggle" is not idempotent — each press changes state. A better analogy is a thermostat set-point: telling your thermostat "set temperature to 22°C" ten times in a row ends in exactly the same state as telling it once. The action is about declaring a desired state, not incrementing a counter.
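The thermostat analogy can be made concrete with a small sketch. The `Thermostat` class and its method names are hypothetical, invented here only to contrast a "declare desired state" call with a "toggle" call.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """Hypothetical device used only to illustrate the analogy."""
    target_c: float = 18.0
    heating_on: bool = False

    def set_target(self, value_c: float) -> None:
        # Declares a desired state: repeating the call changes nothing further.
        self.target_c = value_c

    def toggle_heating(self) -> None:
        # Flips the current state: every call has a new effect.
        self.heating_on = not self.heating_on


# Idempotent: one call and ten calls end in the same state.
once = Thermostat()
once.set_target(22.0)

many = Thermostat()
for _ in range(10):
    many.set_target(22.0)

print(once == many)  # True

# Not idempotent: one toggle and two toggles end in different states.
toggled_once = Thermostat()
toggled_once.toggle_heating()

toggled_twice = Thermostat()
toggled_twice.toggle_heating()
toggled_twice.toggle_heating()

print(toggled_once == toggled_twice)  # False
```

The same test applies to any operation: compare the state after one call with the state after several.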

Which HTTP methods are idempotent?

The HTTP specification (RFC 9110) is explicit about this:

| Method | Idempotent? | Safe (read-only)? | Reasoning |
|---|---|---|---|
| GET | Yes | Yes | Read-only; same response every time (assuming no server state changed between calls) |
| HEAD | Yes | Yes | Like GET, no body, no mutation |
| OPTIONS | Yes | Yes | Describes capabilities; no mutation |
| PUT | Yes | No | Replaces a resource entirely; sending the same body twice yields the same resource |
| DELETE | Yes | No | First call deletes; subsequent calls find nothing to delete (404) but the state is identical: the resource is gone |
| POST | No | No | Creates a new resource or triggers a side effect; sending twice creates two resources |
| PATCH | Not by default | No | Can be designed idempotently (set fields) or not (add 5 to a counter); intent-dependent |

The non-idempotency of POST is not a design flaw — it is the point. POST /orders is supposed to create a new order each time. The problem arises when a network failure leaves the client uncertain whether the request arrived: retrying may create a duplicate. That's the gap idempotency keys fill.
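As a rough illustration of the PUT, DELETE, and POST rows above, here is an in-memory sketch. `OrderStore` and its methods are hypothetical stand-ins for real HTTP handlers; only the effect on stored state is modelled.

```python
import itertools


class OrderStore:
    """Hypothetical in-memory resource collection (not a real framework)."""

    def __init__(self) -> None:
        self.orders = {}
        self._ids = itertools.count(1)

    def put(self, order_id: str, body: dict) -> None:
        # PUT replaces the resource at a client-chosen URL.
        self.orders[order_id] = dict(body)

    def delete(self, order_id: str) -> int:
        # DELETE returns 204 the first time and 404 afterwards,
        # but the resulting state is the same: the resource is absent.
        if order_id in self.orders:
            del self.orders[order_id]
            return 204
        return 404

    def post(self, body: dict) -> str:
        # POST lets the server mint a new ID on every call.
        order_id = f"order-{next(self._ids)}"
        self.orders[order_id] = dict(body)
        return order_id


store = OrderStore()

store.put("order-a", {"item": "book"})
store.put("order-a", {"item": "book"})
print(len(store.orders))  # 1: the repeated PUT changed nothing

first_status = store.delete("order-a")
second_status = store.delete("order-a")
print(first_status, second_status, len(store.orders))  # 204 404 0

store.post({"item": "book"})
store.post({"item": "book"})
print(len(store.orders))  # 2: the repeated POST created a duplicate
```

Note that the DELETE status code differs between calls while the state does not; idempotency is judged on the state.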

Why retries demand idempotency

From the server's side, a request has exactly two outcomes: it was processed, or it wasn't. But the client often cannot tell which. A timeout can fire after the server committed the transaction but before the response bytes arrived. The client sees only silence. Its two choices are:

  1. Give up — safe, but now the user's payment or order may be silently lost.
  2. Retry — recovers from transient failures, but risks a duplicate if the first attempt succeeded.

Neither choice is acceptable on its own. Idempotency keys resolve the dilemma: retry freely, and let the server figure out whether this is genuinely new work or a duplicate it already handled.
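A client-side sketch of "retry freely" might look like the following. `FlakyServer` is a hypothetical stand-in for a remote API that processes the first request but loses its response; `charge_with_retries` is likewise an illustrative name. The point it tries to show is that the key is generated once, outside the retry loop.

```python
import uuid


class FlakyServer:
    """Hypothetical remote API: processes the first call, then drops the response."""

    def __init__(self) -> None:
        self.results = {}   # idempotency key -> stored result
        self.charges = []   # side effects actually performed
        self._drop_next_response = True

    def charge(self, key: str, amount: int) -> dict:
        if key not in self.results:
            self.charges.append(amount)
            self.results[key] = {"id": f"ch_{len(self.charges)}", "amount": amount}
        if self._drop_next_response:
            self._drop_next_response = False
            raise TimeoutError("response lost in transit")
        return self.results[key]


def charge_with_retries(server: FlakyServer, amount: int, max_attempts: int = 3) -> dict:
    # One key per logical operation, generated before the first attempt
    # and reused unchanged on every retry.
    key = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return server.charge(key, amount)
        except TimeoutError as exc:
            last_error = exc
    raise RuntimeError("gave up after retries") from last_error


server = FlakyServer()
result = charge_with_retries(server, 4999)
print(result["amount"], len(server.charges))  # 4999 1
```

Generating the key inside the loop would defeat the mechanism, because every attempt would look like new work to the server.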

⚠️ Common trap

The double-charge. A user clicks "Pay" — the server processes the payment and charges the card, but the response is lost in transit. The client shows an error spinner. The user clicks "Pay" again. Without idempotency: the card is charged twice. This is not a theoretical edge case — it happens in production under normal network jitter, especially on mobile clients. Any endpoint that moves money, sends an email, or creates a real-world side effect needs to be idempotent.

Idempotency keys: how they work

The pattern is straightforward:

  1. The client generates a unique key (a UUID v4 is typical) for this logical operation and attaches it to every attempt of the same request.
  2. The server stores the key in a deduplication store (Redis, a DB table) the first time it sees it, along with the operation's final response.
  3. On a retry, the server looks up the key, finds the stored response, and returns it directly — no second execution.
  4. The key expires after a reasonable window (24 hours for payments is common), after which a new request with the same key is treated as a new operation.

# First attempt: client generates a stable UUID for this charge
POST /v1/charges HTTP/1.1
Host: api.payments.example
Authorization: Bearer sk_live_abc123
Idempotency-Key: a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c
Content-Type: application/json

{
  "amount": 4999,
  "currency": "usd",
  "source": "tok_visa",
  "description": "Order #8812"
}

# Server response — charge created, stored against the key
HTTP/1.1 201 Created
Content-Type: application/json

{
  "id": "ch_3Pz7kL2eZvKYlo2C1j8mR4nQ",
  "amount": 4999,
  "status": "succeeded"
}

Now suppose the response never reached the client. On retry:

POST /v1/charges
# same body, same Idempotency-Key: a3f8c2d1-…

→ Server: key found in dedup store

HTTP/1.1 200 OK
# note: 200 not 201; this is a replay, not a new creation

{"id":"ch_3Pz7kL2eZvKYlo2C1j8mR4nQ","amount":4999,"status":"succeeded"}

→ Card charged: once. Client sees: success. State: correct.
Figure (Client → Dedup Store → Handler): Attempt 1 is processed and its response is stored under the key. Attempt 2 (the retry) finds the key in the dedup store and receives the stored response; the handler and card processor are never called again.
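The server side of this exchange can be sketched in a few lines. `ChargeEndpoint` is a hypothetical handler with an in-memory dict standing in for the dedup store; it ignores expiry and concurrency, and the charge ID format is made up for the example.

```python
import uuid


class ChargeEndpoint:
    """Hypothetical handler: stores the response under the idempotency key."""

    def __init__(self) -> None:
        self._responses = {}    # idempotency key -> stored response body
        self.cards_charged = 0  # counts real executions of the business logic

    def post(self, idempotency_key: str, body: dict) -> tuple:
        stored = self._responses.get(idempotency_key)
        if stored is not None:
            # Replay: return the stored body, skip the business logic.
            return 200, stored

        # First time this key is seen: execute once and remember the result.
        self.cards_charged += 1
        response = {
            "id": f"ch_{uuid.uuid4().hex[:8]}",  # illustrative ID format
            "amount": body["amount"],
            "status": "succeeded",
        }
        self._responses[idempotency_key] = response
        return 201, response


endpoint = ChargeEndpoint()
key = str(uuid.uuid4())

first = endpoint.post(key, {"amount": 4999})
retry = endpoint.post(key, {"amount": 4999})

print(first[0], retry[0])          # 201 200
print(first[1] == retry[1])        # True: same charge, same body
print(endpoint.cards_charged)      # 1
```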

At-least-once delivery and why it pairs with idempotency

Distributed messaging systems (queues, webhooks, event buses) typically guarantee at-least-once delivery: a message will eventually reach a consumer, possibly more than once. True exactly-once delivery is hard to guarantee end-to-end across failures, because a sender that gets no acknowledgement cannot tell a lost message from a lost ack and has to resend. The practical solution is to design consumers to be idempotent, so that receiving a duplicate message changes nothing. Together: at-least-once delivery + idempotent consumers = effectively-once semantics from the perspective of state.
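An idempotent consumer could be sketched as below, assuming every message carries a unique `id` field (an assumption of this example, not something every broker provides). `BalanceConsumer` is a hypothetical name, and a real consumer would persist the seen IDs together with the state change.

```python
class BalanceConsumer:
    """Hypothetical consumer that applies each message ID at most once."""

    def __init__(self) -> None:
        self.balance = 0
        self._seen_ids = set()

    def handle(self, message: dict) -> bool:
        # Returns True if the message was applied, False if it was a duplicate.
        if message["id"] in self._seen_ids:
            return False
        self.balance += message["amount"]
        self._seen_ids.add(message["id"])
        return True


deliveries = [
    {"id": "evt-1", "amount": 50},
    {"id": "evt-2", "amount": 25},
    {"id": "evt-1", "amount": 50},  # redelivered by the broker
]

consumer = BalanceConsumer()
applied = [consumer.handle(m) for m in deliveries]

print(applied)           # [True, True, False]
print(consumer.balance)  # 75
```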

✅ Do this, not that

Do use a UUID v4 as the idempotency key generated on the client side, scoped to a single logical operation ("this specific checkout attempt"). Don't reuse the same key for different operations just to save code — if the body differs but the key matches, the server will replay the old result and silently ignore the new request. Some servers validate that the key maps to an identical payload and return a conflict error if it doesn't; build that check in.
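One way to build the payload check is to store a fingerprint of the request body next to the key. The sketch below assumes a JSON body and uses a SHA-256 hash of its canonical form; `KeyRegistry` and the three return values are hypothetical names chosen for the example.

```python
import hashlib
import json


def fingerprint(body: dict) -> str:
    # Canonical JSON (sorted keys, no extra whitespace) so that key order
    # in the request does not change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class KeyRegistry:
    """Hypothetical registry mapping idempotency key -> payload fingerprint."""

    def __init__(self) -> None:
        self._fingerprints = {}

    def check(self, key: str, body: dict) -> str:
        fp = fingerprint(body)
        if key not in self._fingerprints:
            self._fingerprints[key] = fp
            return "new"
        if self._fingerprints[key] == fp:
            return "replay"
        return "conflict"  # same key, different payload: surface an error


registry = KeyRegistry()
print(registry.check("key-1", {"amount": 4999, "currency": "usd"}))  # new
print(registry.check("key-1", {"currency": "usd", "amount": 4999}))  # replay
print(registry.check("key-1", {"amount": 5999, "currency": "usd"}))  # conflict
```

A server would map "conflict" to whatever error status it has chosen for key misuse.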

🎯 Interview angle

"Design an idempotent create endpoint for a payments API." Walk through: (1) client generates a stable UUID and attaches it as Idempotency-Key; (2) server writes the key + pending status to a dedup table inside the same DB transaction as the payment; (3) on retry, the server reads the stored result and returns it, 200 not 201; (4) the key expires after 24 hours. Interviewers listen for "same transaction" or "atomic write" — that's the detail that prevents a race condition where two concurrent retries both pass the key-not-found check simultaneously. Mentioning the at-least-once delivery context elevates the answer to distributed-systems level.

Under the hood: how an idempotency-key store actually works

The phrase "store the key and return the response on retry" hides several important implementation details. Here is the exact mechanism, the state the store carries, and the race conditions you must handle.

The atomic insert-if-absent

The critical primitive is: insert the key only if it does not already exist, atomically. In Redis this is SET key value NX (NX = "only if Not eXists"). In a SQL database it is an INSERT … ON CONFLICT DO NOTHING (PostgreSQL syntax) against a unique constraint or primary key on the key column; the constraint is what turns the second insert into a no-op instead of a duplicate row.

# Redis: atomic insert-if-absent
# Returns OK if the key was new, nil if it already existed
#   NX       = only set if the key does not exist
#   EX 86400 = TTL of 24 h, in seconds
SET ik:a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c "pending" NX EX 86400

If two concurrent retries both reach Redis at the same instant, exactly one will receive OK and the other will receive nil — Redis is single-threaded for command execution, so there is no ambiguity. The winner proceeds to execute the business logic; the loser waits (see below).
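The "exactly one winner" property can be simulated without a Redis server. The sketch below uses a lock-protected dict as a stand-in for SET … NX; `InMemoryKeyStore` and `set_if_absent` are hypothetical names, and the lock plays the role that Redis's serialized command execution plays in the real system.

```python
import threading


class InMemoryKeyStore:
    """Hypothetical stand-in for a store with atomic insert-if-absent."""

    def __init__(self) -> None:
        self._data = {}
        self._lock = threading.Lock()

    def set_if_absent(self, key: str, value: str) -> bool:
        # The check and the write happen under one lock, so no two callers
        # can both observe "absent" for the same key.
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


store = InMemoryKeyStore()
winners = []


def attempt(name: str) -> None:
    if store.set_if_absent("ik:demo-key", "pending"):
        winners.append(name)  # only the winner would run the business logic


threads = [threading.Thread(target=attempt, args=(f"retry-{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(winners))  # 1
```

A separate "check, then write" without the lock would reintroduce the race, which is why the primitive has to be a single atomic operation.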

State table: what the store holds

| Key state | Meaning | Action on new request |
|---|---|---|
| Absent | First time this key has been seen (or TTL expired) | SET NX → "pending"; proceed to execute business logic |
| "pending" | A concurrent request is already executing under this key | Return 409 Conflict or spin-wait briefly, then retry the lookup; the in-flight request will overwrite "pending" with the final response |
| {response_json} | Business logic completed; stored response is the canonical result | Return the stored response directly, with no execution |
| Absent (expired TTL) | Key is past the retention window | Treat as a brand-new first attempt |
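The state table maps naturally onto a small dispatch function. In this sketch the store is assumed to return `None` for an absent or expired key, the literal string "pending" for an in-flight key, and a JSON string otherwise; `decide` and the action names are hypothetical.

```python
import json

PENDING = "pending"


def decide(stored_value):
    """Map the value read from the dedup store to an action.

    Returns a (action, payload) tuple where action is one of
    "execute", "conflict", or "replay".
    """
    if stored_value is None:
        # Absent or expired: treat as a first attempt.
        return ("execute", None)
    if stored_value == PENDING:
        # Another request holds the key: 409 or wait and re-check.
        return ("conflict", None)
    # Completed: hand back the stored response without executing anything.
    return ("replay", json.loads(stored_value))


print(decide(None))
print(decide("pending"))
print(decide('{"status": 201, "body": {"id": "ch_demo"}}'))
```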

Concrete flow traced end-to-end

── Client generates key ────────────────────────────────────────────
key = "a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c"

── Attempt 1 ───────────────────────────────────────────────────────
POST /v1/charges
Idempotency-Key: a3f8c2d1-…
  → Redis SET ik:a3f8c2d1-… "pending" NX EX 86400 → OK (new key)
  → Execute: charge card, write order row to DB
  → Build response: {"id":"ch_3Pz7…","status":"succeeded"}
  → Redis SET ik:a3f8c2d1-… '{"status":201,"body":{...}}' EX 86400
  ← 201 Created {"id":"ch_3Pz7…","status":"succeeded"}
  [response lost in transit; client sees timeout]

── Attempt 2 (retry, same key) ─────────────────────────────────────
POST /v1/charges
Idempotency-Key: a3f8c2d1-…
  → Redis GET ik:a3f8c2d1-… → '{"status":201,"body":{...}}'
  → Key found and complete; skip execution
  ← 200 OK {"id":"ch_3Pz7…","status":"succeeded"}   # 200 not 201: replay

Card charged: once. State: correct.

Handling a concurrent in-flight duplicate

The tricky case: two retries arrive simultaneously, neither has stored a final response yet, and the key holds "pending". Options:

  1. Return 409 immediately — "a request with this key is currently in-flight; wait and retry." Simple, but requires the client to handle 409.
  2. Short poll — wait 50–200 ms, re-check the key; if it transitions from "pending" to a final response within a short window, return it. Best for latency-sensitive clients.
  3. Use a distributed lock — acquire a lock on the key before executing; the second concurrent arrival blocks until the lock is released. More complex but avoids the 409 round-trip.
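Option 2, the short poll, might be sketched as follows. `poll_for_result` is a hypothetical helper; `get_value` is any callable that reads the dedup store and returns `None`, "pending", or the stored response. The fake store below flips to a final value on the third read purely for demonstration.

```python
import time

PENDING = "pending"


def poll_for_result(get_value, key, timeout_s=0.2, interval_s=0.05,
                    sleep=time.sleep, now=time.monotonic):
    """Re-check the key until it holds a final response or the timeout passes.

    Returns the stored response, or None if the key never left "pending"
    (or stayed absent), in which case the caller can fall back to a 409.
    """
    deadline = now() + timeout_s
    while True:
        value = get_value(key)
        if value is not None and value != PENDING:
            return value
        if now() >= deadline:
            return None
        sleep(interval_s)


# Demonstration with a fake store that completes on the third read.
reads = {"count": 0}


def fake_get(key):
    reads["count"] += 1
    return PENDING if reads["count"] < 3 else '{"status": 201}'


result = poll_for_result(fake_get, "ik:demo", timeout_s=1.0, interval_s=0.01)
print(result)  # {"status": 201}

stuck = poll_for_result(lambda key: PENDING, "ik:stuck", timeout_s=0.03, interval_s=0.01)
print(stuck)   # None
```

The timeout bounds how long a duplicate request can hold a connection open, which is the main cost of this option compared with an immediate 409.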

Key TTL and expiry policy

A 24-hour TTL is what Stripe uses for payment idempotency keys. The tradeoff: too short and a legitimate slow retry (hours later, due to a batch job) creates a duplicate; too long and the dedup store fills with keys that will never be replayed. Choose the TTL to match your retry window: the maximum time a client could reasonably retry the same operation.
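The expiry rule can be illustrated with a tiny store that takes its clock as a parameter, so the retention window can be exercised without waiting. `ExpiringKeyStore` is a hypothetical class; the 86400-second TTL mirrors the 24-hour figure above.

```python
class ExpiringKeyStore:
    """Hypothetical dedup store with lazy expiry on read."""

    def __init__(self, ttl_seconds: float, clock) -> None:
        self._ttl = ttl_seconds
        self._clock = clock          # callable returning the current time in seconds
        self._entries = {}           # key -> (stored_at, value)

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Past the retention window: behave as if the key was never seen.
            del self._entries[key]
            return None
        return value


current_time = [0.0]  # mutable fake clock for the demonstration
store = ExpiringKeyStore(ttl_seconds=86400, clock=lambda: current_time[0])

store.put("ik:demo", "stored response")

current_time[0] = 3600.0           # one hour later: a retry is still deduplicated
within_window = store.get("ik:demo")

current_time[0] = 86400.0          # 24 hours later: the key has expired
after_expiry = store.get("ik:demo")

print(within_window, after_expiry)  # stored response None
```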

-- SQL alternative: dedup table keyed on the idempotency key
CREATE TABLE idempotency_keys (
  key          TEXT        PRIMARY KEY,
  created_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  status       TEXT        NOT NULL,   -- 'pending' | 'complete'
  response     JSONB                     -- null while pending
);
-- Insert atomically inside the same DB transaction as the business record
BEGIN;
  INSERT INTO idempotency_keys(key, status)
    VALUES('a3f8c2d1-…', 'pending')
    ON CONFLICT(key) DO NOTHING;
  -- IF 0 rows inserted, key existed → roll back, return stored response
  INSERT INTO charges(...) VALUES(...);
COMMIT;

Inserting the idempotency key and the business record in the same transaction means they either both commit or both roll back. This prevents the failure mode where the business record is written but the key write fails, leaving no deduplication guard for retries.
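The both-commit-or-both-roll-back behaviour can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: it uses SQLite rather than PostgreSQL, so `INSERT OR IGNORE` stands in for `ON CONFLICT DO NOTHING`; the simplified schema, the `create_charge` function, and the `fail_midway` flag are all invented for the demonstration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE idempotency_keys (
        key      TEXT PRIMARY KEY,
        status   TEXT NOT NULL,
        response TEXT
    );
    CREATE TABLE charges (
        id     INTEGER PRIMARY KEY,
        amount INTEGER NOT NULL
    );
""")


def create_charge(conn, key, amount, fail_midway=False):
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute(
                "INSERT OR IGNORE INTO idempotency_keys(key, status) "
                "VALUES (?, 'pending')",
                (key,),
            )
            if cur.rowcount == 0:
                # Key already exists: replay the stored response.
                row = conn.execute(
                    "SELECT response FROM idempotency_keys WHERE key = ?", (key,)
                ).fetchone()
                return 200, json.loads(row[0])

            charge_id = conn.execute(
                "INSERT INTO charges(amount) VALUES (?)", (amount,)
            ).lastrowid
            if fail_midway:
                raise RuntimeError("simulated crash before commit")

            response = {"id": charge_id, "amount": amount}
            conn.execute(
                "UPDATE idempotency_keys SET status = 'complete', response = ? "
                "WHERE key = ?",
                (json.dumps(response), key),
            )
            return 201, response
    except RuntimeError:
        return 500, None


def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


failed = create_charge(conn, "key-1", 4999, fail_midway=True)
print(failed[0], count(conn, "idempotency_keys"), count(conn, "charges"))  # 500 0 0

created = create_charge(conn, "key-1", 4999)
replayed = create_charge(conn, "key-1", 4999)
print(created[0], replayed[0], count(conn, "charges"))  # 201 200 1
```

After the simulated crash neither table has a row, so the retry is free to run as a first attempt; after the successful attempt both rows exist, so the retry is a replay.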

How to debug & inspect it

Idempotency bugs usually manifest as either duplicate side effects (a key was not stored correctly, so retries re-execute) or phantom 409s (a key is permanently stuck in "pending" due to a crash mid-execution). Both are diagnosable by replaying requests and reading the dedup store directly.

Test by replaying the same key

$ KEY="a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c"

## First request
$ curl -s -X POST https://api.example.com/v1/charges \
    -H "Idempotency-Key: $KEY" \
    -H "Content-Type: application/json" \
    -d '{"amount":4999,"currency":"usd"}' | jq .id
"ch_3Pz7kL2eZvKYlo2C1j8mR4nQ"

## Immediate replay: must return the same charge id, not a new one
$ curl -s -X POST https://api.example.com/v1/charges \
    -H "Idempotency-Key: $KEY" \
    -H "Content-Type: application/json" \
    -d '{"amount":4999,"currency":"usd"}' | jq .id
"ch_3Pz7kL2eZvKYlo2C1j8mR4nQ"   # same id = idempotency working
"ch_9Xab2mNqJwPcVo5D0r6sF8kT"   # different id = double-charge bug

## Inspect the Redis key directly (dev/staging only)
$ redis-cli GET "ik:a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c"
{"status":201,"body":{"id":"ch_3Pz7kL2eZvKYlo2C1j8mR4nQ","status":"succeeded"}}
# nil here means the key was not stored; execution happened but the dedup write failed

$ redis-cli TTL "ik:a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c"
82453   # seconds remaining; confirm TTL is ~86400 (24 h)

Symptom → cause → fix

| Symptom | Likely cause | Fix |
|---|---|---|
| Double charge despite same Idempotency-Key on both requests | Key was not written to the dedup store after first execution (Redis write failed, or code path skipped it) | Ensure the key is written atomically with the business record; add an integration test that replays the same key and asserts no new charge is created |
| 409 Conflict on every retry, even long after the original request | Key is stuck in "pending": the original handler crashed before updating the status to "complete" | Add a background job that marks keys older than N seconds as "failed" and stores an error response; clients can then retry with a new key |
| Key collision: different users share the same idempotency key value | The key was not scoped to (client_id, key); a global key namespace lets two clients collide | Store and look up keys as a (user_id, key) compound, never a raw global key |
| Stored error response is returned when client expects a success on retry | First attempt failed (e.g. card declined); that error was stored; retries replay the stored error | This is correct behaviour: a stored error is still the canonical result. Document it: clients must generate a new key to reattempt after a business-logic failure, not a network failure |
| 200 on replay but response body differs from original 201 | Stored response was truncated or serialised differently | Store the full serialised response body verbatim at the time of creation; deserialise it exactly on replay |

  1. In staging, always run a replay test after any change to the idempotency path: send the same key twice and assert the side-effect count (charges, rows) is exactly one.
  2. In production logs, correlate Idempotency-Key with the charge/order ID to confirm unique keys produce unique resources.
  3. Monitor the 409 rate on idempotency-key endpoints — a spike indicates a stuck-pending problem or a client bug re-using keys.
  4. Check the dedup store TTL on every key write; a missing EXPIRE means keys accumulate forever.
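The background job suggested for stuck "pending" keys could be sketched as below. The in-memory `entries` dict, the field names, and the stored error body are assumptions made for illustration; a real job would run the equivalent update against the dedup store.

```python
def sweep_stuck_keys(entries: dict, now: float, max_pending_seconds: float) -> list:
    """Mark keys that have been "pending" for too long as "failed".

    entries maps key -> {"status": str, "created_at": float, "response": ...}.
    Returns the list of keys that were swept.
    """
    swept = []
    for key, entry in entries.items():
        age = now - entry["created_at"]
        if entry["status"] == "pending" and age > max_pending_seconds:
            entry["status"] = "failed"
            # Store an error so replays get a definite answer instead of a 409.
            entry["response"] = {"error": "original attempt did not complete"}
            swept.append(key)
    return swept


entries = {
    "ik:old-pending": {"status": "pending", "created_at": 0.0, "response": None},
    "ik:new-pending": {"status": "pending", "created_at": 95.0, "response": None},
    "ik:complete": {"status": "complete", "created_at": 0.0, "response": {"id": "ch_demo"}},
}

swept = sweep_stuck_keys(entries, now=100.0, max_pending_seconds=30.0)
print(swept)  # ['ik:old-pending']
```

The threshold should comfortably exceed the slowest legitimate execution, otherwise the sweeper could fail a request that is still running.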

In production: how leading APIs do it

Money-handling APIs converged on a remarkably consistent design for idempotency — not by coordination, but because every team that built payment infrastructure at scale hit the same failure modes and arrived at the same solution independently.

| Company | Header | Behaviour |
|---|---|---|
| Stripe | Idempotency-Key | Client supplies a UUID on any POST (charges, payment intents, subscriptions). The server stores the operation result and replays the identical response for up to 24 hours. If the same key is sent with a different request body, Stripe returns an error to surface the likely programming error. Stripe's documentation explicitly states keys must be unique per logical operation and must not be reused across different operations. |
| PayPal | PayPal-Request-Id | Client-generated ID attached to order creation, capture, and refund requests. PayPal deduplicates within a 72-hour window. The documentation calls it a "request ID" rather than an "idempotency key," but the mechanism is identical: server stores the response keyed by the header value and replays it on retry. |
| Adyen | Idempotency-Key | Required on payment initiation and modification endpoints. Adyen's documentation specifies a maximum key length and recommends a UUID format. The deduplication window varies by endpoint type. Like Stripe, Adyen returns an error if the same key is reused with a different payload. |
| Square | idempotency_key (request body field) | Square embeds the idempotency key in the JSON request body rather than a header: functionally equivalent, different placement. Used on payment creation, order creation, and refund endpoints. Square's documentation notes keys are retained for at least 48 hours. |

Why this exact design is the industry standard. Every payment API above independently arrived at the same three-part design: a client-supplied key, a server-side deduplicated result store, and a fixed expiry window. The convergence is not accidental — it reflects the hard constraints of the problem.

First, the key must be client-supplied. The server cannot generate a deduplication key because the key must be stable across network failures — the client must be able to resend the exact same key on retry, before it knows whether the first attempt succeeded. A server-generated key would require a successful round-trip to obtain, which is exactly the situation idempotency is designed to handle.

Second, the key must be stored with the result, not just the fact of execution. Storing only "this key was processed" and then re-executing to build the response on replay would introduce a second execution window. The stored response must be verbatim so that replays are indistinguishable from the original — same charge ID, same status, same body.

Third, the expiry window (24–72 hours across providers) reflects the practical retry window for automated systems. A batch job running daily might retry a failed payment up to 24 hours later; a retry window shorter than that would treat a legitimate late retry as a new operation and risk a duplicate. A window significantly longer than the maximum realistic retry interval wastes storage on keys that will never be replayed.

The result: client-supplied UUID + server-side result store + expiry window is the minimal correct design for safe retries on non-idempotent endpoints. Every variation across providers (header vs. body field, 24h vs. 72h, response on key conflict) is a parameter choice, not a structural difference.

🧠 Quick check

1. An operation is idempotent when:

Idempotency is about server-side state, not about response codes or call limits. The critical test is: does the second call produce any additional effect beyond what the first call already produced?

2. Which HTTP method is idempotent according to RFC 9110?

DELETE is idempotent: the first call removes the resource; subsequent calls find it already gone, but the resulting state (resource absent) is the same. POST creates a new resource on each call — not idempotent. PATCH's idempotency depends on the operation semantics.

3. A client POSTs a charge, receives a network timeout, and retries with the same Idempotency-Key. The server should:

The whole point of the key is to enable the server to return the stored result of the original operation rather than running it again. A 409 would be correct only if the same key was sent with a different payload — a likely programming error worth surfacing.

4. "At-least-once delivery" means a messaging system guarantees:

At-least-once delivery trades duplicate messages for delivery guarantees. It requires consumers to be idempotent to avoid processing the same event twice. "Exactly-once" requires expensive distributed coordination that is difficult to guarantee across all failure modes.

✍️ Exercise: design an idempotent subscription creation endpoint

You're building POST /v1/subscriptions for a SaaS billing API. Users must not be double-subscribed if a retry occurs. Describe the full flow from client to database, including the deduplication mechanism, race condition prevention, and expiry policy. You may use pseudo-code or prose.

Model answer:

  1. Client side: Generate a UUID v4 before the first attempt: key = uuid4(). Send it as Idempotency-Key: <key> on every retry of this specific subscription creation. Never reuse the key for a different plan or user.
  2. Server — receive request: Extract the key and query the dedup table: SELECT * FROM idempotency_keys WHERE key = ? AND created_at > NOW() - INTERVAL '24h'.
  3. Key found (retry path): Return the stored response body and status code immediately. Do not execute any business logic. Return 200, not 201, to signal this is a replay.
  4. Key not found (first-time path): Begin a database transaction. Insert the key with status 'pending' into the dedup table inside the same transaction as the subscription record. The insert, backed by the primary-key constraint on the key column, prevents a race where two concurrent retries both pass the "key not found" check: only one insert can succeed. Commit. Then update the stored row to 'complete' with the full response body.
  5. Expiry: Keys older than 24 hours are treated as expired (a new request with the same key is a fresh operation). A nightly background job purges expired rows.

-- Pseudo-schema for the dedup table
CREATE TABLE idempotency_keys (
  key          TEXT PRIMARY KEY,
  created_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  status       TEXT,        -- 'pending' | 'complete'
  response     JSONB        -- stored full response
);

Rubric: ✓ UUID v4 on client ✓ dedup table lookup before execution ✓ atomic insert (same transaction as business record) ✓ 200 vs 201 distinction for replay ✓ expiry policy defined. Five out of five = senior-level answer; missing the "same transaction" point drops to mid-level.
