Reliability & Scale · Lesson 02
Idempotency
Networks drop packets; clients time out; retries happen. Idempotency is the property that makes retrying safe — so the second (and third, and tenth) attempt produces the same result as the first, with no extra side effects piling up.
By the end you'll be able to
- Define idempotency precisely and list which HTTP methods are idempotent by specification.
- Explain why POST is not idempotent and how idempotency keys restore safe-retry semantics for it.
- Design an idempotent create endpoint end-to-end, including the server deduplication and response-storage strategy.
What idempotent means, exactly
The word comes from Latin: idem (same) + potens (power). In mathematics, a function is idempotent if applying it multiple times has the same effect as applying it once: f(f(x)) = f(x). In HTTP, an operation is idempotent if the server-side state ends up identical whether you sent the request once or a hundred times.
The classic light-switch analogy breaks down here because "toggle" is not idempotent — each press changes state. A better analogy is a thermostat set-point: telling your thermostat "set temperature to 22°C" ten times in a row ends in exactly the same state as telling it once. The action is about declaring a desired state, not incrementing a counter.
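The difference between the two analogies can be checked mechanically. The sketch below is illustrative only: `set_temperature` and `toggle` are hypothetical functions, not part of any real thermostat API.

```python
def set_temperature(state: dict, target: float) -> dict:
    """Declare a desired state: idempotent."""
    return {**state, "target_c": target}


def toggle(state: dict) -> dict:
    """Flip the current state: not idempotent."""
    return {**state, "on": not state["on"]}


start = {"on": False, "target_c": 18.0}

# Apply the set-point once, then ten times in a row.
once = set_temperature(start, 22.0)
ten_times = start
for _ in range(10):
    ten_times = set_temperature(ten_times, 22.0)

# Apply the toggle once, then twice.
toggled_once = toggle(start)
toggled_twice = toggle(toggle(start))

print(once == ten_times)              # True: repeating changes nothing
print(toggled_once == toggled_twice)  # False: each press changes state
```

The set-point call describes where the system should end up, so repetition is harmless; the toggle describes a change relative to the current state, so every repetition matters.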
Which HTTP methods are idempotent?
The HTTP specification (RFC 9110) is explicit about this:
| Method | Idempotent? | Safe (read-only)? | Reasoning |
|---|---|---|---|
| GET | Yes | Yes | Read-only; same response every time (assuming no server state changed between calls) |
| HEAD | Yes | Yes | Like GET, no body, no mutation |
| OPTIONS | Yes | Yes | Describes capabilities; no mutation |
| PUT | Yes | No | Replaces a resource entirely; sending the same body twice yields the same resource |
| DELETE | Yes | No | First call deletes; subsequent calls find nothing to delete (404) but the state is identical: the resource is gone |
| POST | No | No | Creates a new resource or triggers a side effect; sending twice creates two resources |
| PATCH | Not by default | No | Can be designed idempotently (set fields) or not (add 5 to a counter); intent-dependent |
The non-idempotency of POST is not a design flaw — it is the point. POST /orders is supposed to create a new order each time. The problem arises when a network failure leaves the client uncertain whether the request arrived: retrying may create a duplicate. That's the gap idempotency keys fill.
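To make the table concrete, here is a minimal sketch of a toy in-memory resource store. `OrderStore` and its method names are assumptions made for illustration; the status codes for DELETE (204, then 404) are one reasonable choice, not something the specification mandates.

```python
import itertools


class OrderStore:
    """Toy in-memory resource store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.orders: dict[str, dict] = {}
        self._ids = itertools.count(1)

    def put(self, order_id: str, body: dict) -> None:
        # PUT replaces the resource at a client-chosen identifier.
        self.orders[order_id] = dict(body)

    def post(self, body: dict) -> str:
        # POST asks the server to mint a new resource on every call.
        order_id = f"order-{next(self._ids)}"
        self.orders[order_id] = dict(body)
        return order_id

    def delete(self, order_id: str) -> int:
        # The status differs between calls, but the end state is the same.
        return 204 if self.orders.pop(order_id, None) is not None else 404


store = OrderStore()

store.put("order-a", {"sku": "book"})
store.put("order-a", {"sku": "book"})
count_after_puts = len(store.orders)    # one resource, however often we PUT

store.post({"sku": "book"})
store.post({"sku": "book"})
count_after_posts = len(store.orders)   # two more resources from two POSTs

delete_statuses = [store.delete("order-a"), store.delete("order-a")]
```

Note that the second DELETE returns a different status code than the first. Idempotency is a statement about server state, not about the response.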
Why retries demand idempotency
A request sent over a network has two possible outcomes: the server processed it, or it didn't. The client, however, often cannot tell which. A timeout can fire after the server committed the transaction but before the response bytes arrived, and the client sees only silence. Its two choices are:
- Give up — safe, but now the user's payment or order may be silently lost.
- Retry — recovers from transient failures, but risks a duplicate if the first attempt succeeded.
Neither choice is acceptable on its own. Idempotency keys resolve the dilemma: retry freely, and let the server figure out whether this is genuinely new work or a duplicate it already handled.
The double-charge. A user clicks "Pay" — the server processes the payment and charges the card, but the response is lost in transit. The client shows an error spinner. The user clicks "Pay" again. Without idempotency: the card is charged twice. This is not a theoretical edge case — it happens in production under normal network jitter, especially on mobile clients. Any endpoint that moves money, sends an email, or creates a real-world side effect needs to be idempotent.
Idempotency keys: how they work
The pattern is straightforward:
- The client generates a unique key (a UUID v4 is typical) for this logical operation and attaches it to every attempt of the same request.
- The server stores the key in a deduplication store (Redis, a DB table) the first time it sees it, along with the operation's final response.
- On a retry, the server looks up the key, finds the stored response, and returns it directly — no second execution.
- The key expires after a reasonable window (24 hours for payments is common), after which a new request with the same key is treated as a new operation.
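On the client side, the important detail is that the key is generated once per logical operation, outside the retry loop. The following is a rough sketch under assumptions: `send_with_retries`, `TransientNetworkError`, and the fake `flaky_send` transport are all hypothetical names, and a real client would call an HTTP library instead.

```python
import time
import uuid


class TransientNetworkError(Exception):
    """Stand-in for a timeout or dropped connection."""


def send_with_retries(send, payload: dict, max_attempts: int = 4,
                      base_delay: float = 0.0) -> dict:
    # Generate the key ONCE per logical operation, outside the retry loop.
    key = str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            return send(payload, idempotency_key=key)
        except TransientNetworkError:
            if attempt == max_attempts:
                raise
            # Exponential backoff; base_delay defaults to 0 to keep the demo fast.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


# Fake transport: fails twice, then succeeds, recording every key it saw.
seen_keys: list[str] = []


def flaky_send(payload: dict, idempotency_key: str) -> dict:
    seen_keys.append(idempotency_key)
    if len(seen_keys) < 3:
        raise TransientNetworkError("timed out")
    return {"status": "succeeded", "amount": payload["amount"]}


result = send_with_retries(flaky_send, {"amount": 4999, "currency": "usd"})
```

Generating the key inside the loop would be the classic bug: every attempt would look like new work to the server, and the deduplication store would never match.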
# First attempt — client generates a stable UUID for this charge
POST /v1/charges HTTP/1.1
Host: api.payments.example
Authorization: Bearer sk_live_abc123
Idempotency-Key: a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c
Content-Type: application/json
{
"amount": 4999,
"currency": "usd",
"source": "tok_visa",
"description": "Order #8812"
}
# Server response — charge created, stored against the key
HTTP/1.1 201 Created
Content-Type: application/json
{
"id": "ch_3Pz7kL2eZvKYlo2C1j8mR4nQ",
"amount": 4999,
"status": "succeeded"
}
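Now suppose the response never reached the client. On retry, the client resends the identical request with the same Idempotency-Key; the server finds the stored result and returns the same charge (same id, same status) without charging the card again.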
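The server side of that retry might look roughly like the sketch below. It is deliberately simplified: `ChargeService` is a hypothetical in-memory class with no concurrency control and no expiry, and it follows this lesson's convention of answering a replay with 200 rather than 201.

```python
import itertools


class ChargeService:
    """Toy server: in-memory dedup store, no concurrency, no expiry."""

    def __init__(self) -> None:
        self.responses: dict[str, dict] = {}   # idempotency key -> stored body
        self.charges: list[dict] = []          # the real side effects
        self._ids = itertools.count(1)

    def create_charge(self, idempotency_key: str, body: dict) -> tuple[int, dict]:
        stored = self.responses.get(idempotency_key)
        if stored is not None:
            # Replay: return the stored body, do not charge again.
            return 200, stored
        charge = {
            "id": f"ch_{next(self._ids)}",
            "amount": body["amount"],
            "status": "succeeded",
        }
        self.charges.append(charge)
        self.responses[idempotency_key] = charge
        return 201, charge


svc = ChargeService()
key = "a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c"
request_body = {"amount": 4999, "currency": "usd"}

first = svc.create_charge(key, request_body)   # response lost in transit
retry = svc.create_charge(key, request_body)   # client retries with the same key
```

Both calls return the same charge body, and only one charge exists on the server.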
At-least-once delivery and why it pairs with idempotency
Distributed messaging systems (queues, webhooks, event buses) typically guarantee at-least-once delivery: a message will eventually reach a consumer, possibly more than once. End-to-end "exactly-once" delivery is very hard to guarantee across failures, because a sender that receives no acknowledgement cannot tell a lost message from a lost acknowledgement. The practical solution is to design consumers to be idempotent, so that receiving a duplicate message changes nothing. Together: at-least-once delivery + idempotent consumers = effectively-once semantics from the perspective of state.
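A minimal sketch of an idempotent consumer, assuming every message carries a unique `id` field (an assumption; real brokers expose this under different names). `BalanceConsumer` is a hypothetical class used only for illustration.

```python
class BalanceConsumer:
    """Applies each message at most once, keyed by its message ID."""

    def __init__(self) -> None:
        self.balance = 0
        self.processed_ids: set[str] = set()

    def handle(self, message: dict) -> bool:
        if message["id"] in self.processed_ids:
            return False  # duplicate delivery: acknowledge and do nothing
        self.balance += message["amount"]
        self.processed_ids.add(message["id"])
        return True


consumer = BalanceConsumer()
deliveries = [
    {"id": "evt-1", "amount": 50},
    {"id": "evt-2", "amount": 25},
    {"id": "evt-1", "amount": 50},  # redelivered by the broker
]
applied = [consumer.handle(m) for m in deliveries]
```

In a real system the state change and the record of the processed ID need to be written atomically (for example in one database transaction); otherwise a crash between the two steps reintroduces the duplicate.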
Do use a UUID v4 as the idempotency key generated on the client side, scoped to a single logical operation ("this specific checkout attempt"). Don't reuse the same key for different operations just to save code — if the body differs but the key matches, the server will replay the old result and silently ignore the new request. Some servers validate that the key maps to an identical payload and return a conflict error if it doesn't; build that check in.
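One way to build that payload check is to store a fingerprint of the request body next to the response. This is a sketch under assumptions: `fingerprint`, `handle`, and `KeyReuseError` are hypothetical names, and hashing canonical JSON is just one possible fingerprinting scheme.

```python
import hashlib
import json


def fingerprint(body: dict) -> str:
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class KeyReuseError(Exception):
    """Same idempotency key, different request body."""


stored: dict[str, tuple[str, dict]] = {}  # key -> (fingerprint, response)


def handle(key: str, body: dict) -> dict:
    fp = fingerprint(body)
    if key in stored:
        stored_fp, response = stored[key]
        if stored_fp != fp:
            raise KeyReuseError(f"key {key!r} was first used with a different body")
        return response  # genuine retry: replay
    response = {"status": "succeeded", "amount": body["amount"]}
    stored[key] = (fp, response)
    return response


handle("key-1", {"amount": 4999, "currency": "usd"})
replayed = handle("key-1", {"currency": "usd", "amount": 4999})  # same body, reordered

try:
    handle("key-1", {"amount": 9999, "currency": "usd"})         # different body
    conflict = False
except KeyReuseError:
    conflict = True
```

An HTTP server would translate `KeyReuseError` into a client-error response instead of silently replaying the old result.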
"Design an idempotent create endpoint for a payments API." Walk through: (1) client generates a stable UUID and attaches it as Idempotency-Key; (2) server writes the key + pending status to a dedup table inside the same DB transaction as the payment; (3) on retry, the server reads the stored result and returns it, 200 not 201; (4) the key expires after 24 hours. Interviewers listen for "same transaction" or "atomic write" — that's the detail that prevents a race condition where two concurrent retries both pass the key-not-found check simultaneously. Mentioning the at-least-once delivery context elevates the answer to distributed-systems level.
Under the hood: how an idempotency-key store actually works
The phrase "store the key and return the response on retry" hides several important implementation details. Here is the exact mechanism, the state the store carries, and the race conditions you must handle.
The atomic insert-if-absent
The critical primitive is: insert the key only if it does not already exist, atomically. In Redis this is SET key value NX (NX = "only if Not eXists"). In a SQL database it is an INSERT … ON CONFLICT DO NOTHING against a unique constraint (or primary key) on the key column; the constraint is what makes the check atomic.
# Redis — atomic insert-if-absent
# Returns OK if the key was new, nil if it already existed
SET ik:a3f8c2d1-7b4e-4a9f-b3c2-1d8e9f0a2b4c "pending" NX EX 86400
# NX       = only set if the key does not exist
# EX 86400 = expire after 86400 s (TTL: 24 h)
If two concurrent retries both reach Redis at the same instant, exactly one will receive OK and the other will receive nil — Redis is single-threaded for command execution, so there is no ambiguity. The winner proceeds to execute the business logic; the loser waits (see below).
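The "exactly one winner" property can be demonstrated without a Redis server. The sketch below uses `FakeRedis`, a hypothetical in-memory stand-in that mimics only the NX behaviour; the lock models Redis executing one command at a time.

```python
import threading


class FakeRedis:
    """In-memory stand-in that mimics only `SET key value NX`."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}
        self._lock = threading.Lock()  # models one-command-at-a-time execution

    def set_nx(self, key: str, value: str) -> bool:
        with self._lock:
            if key in self._data:
                return False  # Redis would reply nil
            self._data[key] = value
            return True       # Redis would reply OK


store = FakeRedis()
results: list[bool] = []
start = threading.Barrier(8)


def attempt() -> None:
    start.wait()  # release all threads together to force contention
    results.append(store.set_nx("ik:a3f8c2d1", "pending"))


threads = [threading.Thread(target=attempt) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = results.count(True)  # exactly one thread claimed the key
```

Against a real server with the redis-py client, the corresponding call would be `r.set(key, "pending", nx=True, ex=86400)`, which returns a truthy value for the winner and `None` for everyone else.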
State table: what the store holds
| Key state | Meaning | Action on new request |
|---|---|---|
| Absent | First time this key has been seen (or TTL expired) | SET NX → "pending"; proceed to execute business logic |
| "pending" | A concurrent request is already executing under this key | Return 409 Conflict or spin-wait briefly, then retry the lookup; the in-flight request will overwrite "pending" with the final response |
| {response_json} | Business logic completed; stored response is the canonical result | Return the stored response directly, with no execution |
| Absent (expired TTL) | Key is past the retention window | Treat as a brand-new first attempt |
Concrete flow traced end-to-end
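One way to trace the flow is to drive the three states from the table above by hand. The sketch below is a single-process illustration: `begin` and `complete` are hypothetical helper names, and the plain dict lookup stands in for the atomic insert-if-absent that a real store would need.

```python
import json
from typing import Optional

PENDING = "pending"
store: dict[str, str] = {}  # key -> "pending" or a serialised response


def begin(key: str) -> tuple[str, Optional[dict]]:
    """Decide what to do with an incoming request for this key."""
    current = store.get(key)
    if current is None:
        store[key] = PENDING                 # absent -> claim the key
        return "execute", None
    if current == PENDING:
        return "in_flight", None             # another request owns the key
    return "replay", json.loads(current)     # completed -> return stored response


def complete(key: str, response: dict) -> None:
    """Overwrite 'pending' with the final response."""
    store[key] = json.dumps(response)


trace = []
trace.append(begin("k1"))    # first attempt: key is absent
trace.append(begin("k1"))    # concurrent duplicate while the first is running
complete("k1", {"id": "ch_1", "status": "succeeded"})
trace.append(begin("k1"))    # later retry: stored response is replayed
```

After the run, `trace` holds `("execute", None)`, `("in_flight", None)`, and `("replay", {...})` in that order, matching the Absent, "pending", and stored-response rows of the table.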
Handling a concurrent in-flight duplicate
The tricky case: two retries arrive simultaneously, neither has stored a final response yet, and the key holds "pending". Options:
- Return 409 immediately — "a request with this key is currently in-flight; wait and retry." Simple, but requires the client to handle 409.
- Short poll — wait 50–200 ms, re-check the key; if it transitions from "pending" to a final response within a short window, return it. Best for latency-sensitive clients.
- Use a distributed lock — acquire a lock on the key before executing; the second concurrent arrival blocks until the lock is released. More complex but avoids the 409 round-trip.
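The short-poll option can be sketched as follows. The names (`wait_for_result`, the `ik:` keys) and the timing values are assumptions chosen so the demo runs quickly; a production version would poll the real dedup store rather than a dict.

```python
import json
import threading
import time

PENDING = "pending"
# Two keys already claimed by earlier requests that have not finished yet.
store: dict[str, str] = {"ik:demo": PENDING, "ik:stuck": PENDING}


def wait_for_result(key: str, timeout_s: float = 0.2,
                    interval_s: float = 0.01) -> tuple[int, dict]:
    """Poll until the key leaves 'pending' or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        value = store.get(key)
        if value is not None and value != PENDING:
            return 200, json.loads(value)   # finished: replay stored response
        if time.monotonic() >= deadline:
            return 409, {"error": "still in flight, retry later"}
        time.sleep(interval_s)


def finish_first_request() -> None:
    time.sleep(0.02)                        # simulated business-logic latency
    store["ik:demo"] = json.dumps({"id": "ch_1", "status": "succeeded"})


worker = threading.Thread(target=finish_first_request)
worker.start()
status, body = wait_for_result("ik:demo")   # resolves once the worker stores a result
worker.join()

stuck_status, _ = wait_for_result("ik:stuck", timeout_s=0.05)  # never resolves
```

The fallback to 409 matters: without a deadline, a key stuck in "pending" would hang every retry indefinitely.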
Key TTL and expiry policy
A 24-hour TTL is the Stripe standard for payment idempotency keys. The tradeoff: too short and a legitimate slow retry (hours later, due to a batch job) creates a duplicate; too long and the deduplication store holds keys that will never be replayed, wasting storage. Choose the TTL to match your retry window, that is, the maximum time a client could reasonably retry the same operation.
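The expiry rule itself is small enough to sketch directly. `ExpiringKeyStore` is a hypothetical in-memory class, and the clock is passed in as a parameter so the behaviour at the window boundary can be shown without waiting a day.

```python
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # retention window; match it to your retry window


class ExpiringKeyStore:
    def __init__(self) -> None:
        self._rows: dict[str, tuple[float, dict]] = {}  # key -> (created_at, response)

    def put(self, key: str, response: dict, now: float) -> None:
        self._rows[key] = (now, response)

    def get(self, key: str, now: float) -> Optional[dict]:
        row = self._rows.get(key)
        if row is None:
            return None
        created_at, response = row
        if now - created_at >= TTL_SECONDS:
            del self._rows[key]   # expired: behaves exactly like an unseen key
            return None
        return response


keys = ExpiringKeyStore()
keys.put("k1", {"id": "ch_1"}, now=0.0)

within_window = keys.get("k1", now=23 * 3600)  # retry after 23 h: replayed
after_window = keys.get("k1", now=25 * 3600)   # retry after 25 h: treated as new
```

The second lookup shows the cost of a TTL that is too short for the client: a retry that arrives after the window is indistinguishable from a brand-new request.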
-- SQL alternative: dedup table with the idempotency key as primary key
CREATE TABLE idempotency_keys (
key TEXT PRIMARY KEY,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
status TEXT NOT NULL, -- 'pending' | 'complete'
response JSONB -- null while pending
);
-- Insert atomically inside the same DB transaction as the business record
BEGIN;
INSERT INTO idempotency_keys(key, status)
VALUES('a3f8c2d1-…', 'pending')
ON CONFLICT(key) DO NOTHING;
-- IF 0 rows inserted, key existed → roll back, return stored response
INSERT INTO charges(...) VALUES(...);
COMMIT;
Inserting the idempotency key and the business record in the same transaction means they either both commit or both roll back. This prevents the failure mode where the business record is written but the key write fails, leaving no deduplication guard for retries.
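The both-or-neither behaviour can be observed with Python's built-in `sqlite3` module. This is a sketch, not the schema above verbatim: the tables are trimmed, `create_charge` is a hypothetical helper, and a CHECK constraint on `amount` is added purely to force a failure after the key insert. It assumes a SQLite version that supports `ON CONFLICT ... DO NOTHING` (3.24 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE idempotency_keys (
        key TEXT PRIMARY KEY, status TEXT NOT NULL, response TEXT);
    CREATE TABLE charges (
        id INTEGER PRIMARY KEY, amount INTEGER NOT NULL CHECK (amount > 0));
""")


def create_charge(key: str, amount: int) -> str:
    try:
        with conn:  # one transaction: commit on success, roll back on exception
            cur = conn.execute(
                "INSERT INTO idempotency_keys(key, status) VALUES (?, 'pending') "
                "ON CONFLICT(key) DO NOTHING",
                (key,),
            )
            if cur.rowcount == 0:
                return "replayed"   # key already present: skip the charge
            conn.execute("INSERT INTO charges(amount) VALUES (?)", (amount,))
            conn.execute(
                "UPDATE idempotency_keys SET status = 'complete' WHERE key = ?",
                (key,),
            )
        return "created"
    except sqlite3.IntegrityError:
        return "rolled_back"


outcomes = [
    create_charge("k1", 4999),  # first attempt
    create_charge("k1", 4999),  # retry with the same key
    create_charge("k2", -5),    # CHECK fails: the key insert is rolled back too
]
charge_count = conn.execute("SELECT COUNT(*) FROM charges").fetchone()[0]
k2_rows = conn.execute(
    "SELECT COUNT(*) FROM idempotency_keys WHERE key = 'k2'"
).fetchone()[0]
```

Because the failed charge took its key down with it, `k2` is free to be retried later instead of being stuck behind a key that guards nothing.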
How to debug & inspect it
Idempotency bugs usually manifest as either duplicate side effects (a key was not stored correctly, so retries re-execute) or phantom 409s (a key is permanently stuck in "pending" due to a crash mid-execution). Both are diagnosable by replaying requests and reading the dedup store directly.
Test by replaying the same key
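A replay check can be as simple as sending the same key twice and counting side effects. The sketch below runs against `InMemoryChargeApi`, a hypothetical stand-in for the system under test; in staging the same assertions would run against the real endpoint through an HTTP client.

```python
import uuid


class InMemoryChargeApi:
    """Hypothetical system under test; swap in a real HTTP client in staging."""

    def __init__(self) -> None:
        self.charges: list[dict] = []
        self.dedup: dict[str, dict] = {}

    def post_charge(self, key: str, amount: int) -> dict:
        if key in self.dedup:
            return self.dedup[key]
        charge = {"id": f"ch_{len(self.charges) + 1}", "amount": amount}
        self.charges.append(charge)
        self.dedup[key] = charge
        return charge


def replay_check(api: InMemoryChargeApi) -> None:
    key = str(uuid.uuid4())
    before = len(api.charges)

    first = api.post_charge(key, 4999)
    second = api.post_charge(key, 4999)   # replay with the same key

    assert first == second, "replay must return the original response"
    assert len(api.charges) == before + 1, "replay must not create a second charge"

    # A fresh key is genuinely new work and must produce a new resource.
    other = api.post_charge(str(uuid.uuid4()), 4999)
    assert other["id"] != first["id"]


api = InMemoryChargeApi()
replay_check(api)
```

The assertion on the charge count is the important one: matching response bodies alone would not catch a handler that re-executes and happens to return similar output.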
Symptom → cause → fix
| Symptom | Likely cause | Fix |
|---|---|---|
| Double charge despite same Idempotency-Key on both requests | Key was not written to the dedup store after first execution (Redis write failed, or code path skipped it) | Ensure the key is written atomically with the business record; add an integration test that replays the same key and asserts no new charge is created |
| 409 Conflict on every retry, even long after the original request | Key is stuck in "pending" — the original handler crashed before updating the status to "complete" | Add a background job that marks keys older than N seconds as "failed" and stores an error response; clients can then retry with a new key |
| Key collision: different users share the same idempotency key value | Keys live in a global namespace instead of being scoped per client, so two clients can collide | Store and look up keys by the compound (client_id, key), never by the raw key alone |
| Stored error response is returned when client expects a success on retry | First attempt failed (e.g. card declined); that error was stored; retries replay the stored error | This is correct behaviour — a stored error is still the canonical result. Document it: clients must generate a new key to reattempt after a business-logic failure, not a network failure |
| 200 on replay but response body differs from original 201 | Stored response was truncated or serialised differently | Store the full serialised response body verbatim at the time of creation; deserialise it exactly on replay |
- In staging, always run a replay test after any change to the idempotency path: send the same key twice and assert the side-effect count (charges, rows) is exactly one.
- In production logs, correlate Idempotency-Key with the charge/order ID to confirm unique keys produce unique resources.
- Monitor the 409 rate on idempotency-key endpoints; a spike indicates a stuck-pending problem or a client bug re-using keys.
- Check the dedup store TTL on every key write; a missing EXPIRE means keys accumulate forever.
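The background job suggested for stuck "pending" keys could look roughly like this. It is a sketch over a plain dict: `reap_stuck_keys`, the row layout, and the 60-second threshold are all assumptions, and a real job would run the equivalent UPDATE against the dedup store.

```python
PENDING_TIMEOUT_S = 60  # assumption: longer than any legitimate handler run


def reap_stuck_keys(rows: dict, now: float) -> list:
    """Mark keys that have been 'pending' too long as failed; return their names."""
    reaped = []
    for key, row in rows.items():
        if row["status"] == "pending" and now - row["created_at"] > PENDING_TIMEOUT_S:
            row["status"] = "failed"
            row["response"] = {"error": "original request did not complete"}
            reaped.append(key)
    return reaped


rows = {
    "k-stuck": {"status": "pending", "created_at": 0.0, "response": None},
    "k-fresh": {"status": "pending", "created_at": 95.0, "response": None},
    "k-done": {"status": "complete", "created_at": 0.0, "response": {"id": "ch_1"}},
}
reaped = reap_stuck_keys(rows, now=100.0)
```

Only the key that has been pending for longer than the threshold is touched. Before marking a key as failed in a payments system, it is worth reconciling against the downstream provider, since a crash after the side effect but before the status update looks identical from the store's point of view.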
In production: how leading APIs do it
Money-handling APIs converged on a remarkably consistent design for idempotency — not by coordination, but because every team that built payment infrastructure at scale hit the same failure modes and arrived at the same solution independently.
| Company | Header | Behaviour |
|---|---|---|
| Stripe | Idempotency-Key | Client supplies a UUID on any POST (charges, payment intents, subscriptions). The server stores the operation result and replays the identical response for up to 24 hours. If the same key is sent with a different request body, Stripe returns an error to surface the likely programming error. Stripe's documentation explicitly states keys must be unique per logical operation and must not be reused across different operations. |
| PayPal | PayPal-Request-Id | Client-generated ID attached to order creation, capture, and refund requests. PayPal deduplicates within a 72-hour window. The documentation calls it a "request ID" rather than an "idempotency key," but the mechanism is identical: server stores the response keyed by the header value and replays it on retry. |
| Adyen | Idempotency-Key | Required on payment initiation and modification endpoints. Adyen's documentation specifies a maximum key length and recommends a UUID format. The deduplication window varies by endpoint type. Like Stripe, Adyen returns an error if the same key is reused with a different payload. |
| Square | idempotency_key (request body field) | Square embeds the idempotency key in the JSON request body rather than a header: functionally equivalent, different placement. Used on payment creation, order creation, and refund endpoints. Square's documentation notes keys are retained for at least 48 hours. |
Why this exact design is the industry standard. Every payment API above independently arrived at the same three-part design: a client-supplied key, a server-side deduplicated result store, and a fixed expiry window. The convergence is not accidental — it reflects the hard constraints of the problem.
First, the key must be client-supplied. The server cannot generate a deduplication key because the key must be stable across network failures — the client must be able to resend the exact same key on retry, before it knows whether the first attempt succeeded. A server-generated key would require a successful round-trip to obtain, which is exactly the situation idempotency is designed to handle.
Second, the key must be stored with the result, not just the fact of execution. Storing only "this key was processed" and then re-executing to build the response on replay would introduce a second execution window. The stored response must be verbatim so that replays are indistinguishable from the original — same charge ID, same status, same body.
Third, the expiry window (24–72 hours across providers) reflects the practical retry window for automated systems. A batch job running daily might retry a failed payment up to 24 hours later; a retry window shorter than that would treat a legitimate late retry as a new operation and risk a duplicate. A window significantly longer than the maximum realistic retry interval wastes storage on keys that will never be replayed.
The result: client-supplied UUID + server-side result store + expiry window is the minimal correct design for safe retries on non-idempotent endpoints. Every variation across providers (header vs. body field, 24h vs. 72h, response on key conflict) is a parameter choice, not a structural difference.
🧠 Quick check
1. An operation is idempotent when:
Idempotency is about server-side state, not about response codes or call limits. The critical test is: does the second call produce any additional effect beyond what the first call already produced?
2. Which HTTP method is idempotent according to RFC 9110?
DELETE is idempotent: the first call removes the resource; subsequent calls find it already gone, but the resulting state (resource absent) is the same. POST creates a new resource on each call — not idempotent. PATCH's idempotency depends on the operation semantics.
3. A client POSTs a charge, receives a network timeout, and retries with the same Idempotency-Key. The server should:
The whole point of the key is to enable the server to return the stored result of the original operation rather than running it again. A 409 would be correct only if the same key was sent with a different payload — a likely programming error worth surfacing.
4. "At-least-once delivery" means a messaging system guarantees:
At-least-once delivery trades duplicate messages for delivery guarantees. It requires consumers to be idempotent to avoid processing the same event twice. "Exactly-once" requires expensive distributed coordination that is difficult to guarantee across all failure modes.
✍️ Exercise: design an idempotent subscription creation endpoint
You're building POST /v1/subscriptions for a SaaS billing API. Users must not be double-subscribed if a retry occurs. Describe the full flow from client to database, including the deduplication mechanism, race condition prevention, and expiry policy. You may use pseudo-code or prose.
Model answer:
- Client side: Generate a UUID v4 before the first attempt: key = uuid4(). Send it as Idempotency-Key: <key> on every retry of this specific subscription creation. Never reuse the key for a different plan or user.
- Server (receive request): Extract the key and query the dedup table: SELECT * FROM idempotency_keys WHERE key = ? AND created_at > NOW() - INTERVAL '24h'.
- Key found (retry path): Return the stored response body and status code immediately. Do not execute any business logic. Return 200, not 201, to signal this is a replay.
- Key not found (first-time path): Begin a database transaction. Insert the key with status 'pending' into the dedup table inside the same transaction as the subscription record. This prevents a race where two concurrent retries both pass the "key not found" check. Commit. Update the stored response to 'complete' with the full response body.
- Expiry: Keys older than 24 hours are treated as expired (a new request with the same key is a fresh operation). A nightly background job purges expired rows.
-- Pseudo-schema for the dedup table
CREATE TABLE idempotency_keys (
key TEXT PRIMARY KEY,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
status TEXT, -- 'pending' | 'complete'
response JSONB -- stored full response
);
Rubric: ✓ UUID v4 on client ✓ dedup table lookup before execution ✓ atomic insert (same transaction as business record) ✓ 200 vs 201 distinction for replay ✓ expiry policy defined. Five out of five = senior-level answer; missing the "same transaction" point drops to mid-level.
Key takeaways
- An operation is idempotent if repeating it any number of times leaves server state exactly as it was after the first call.
- GET, HEAD, OPTIONS, PUT, and DELETE are idempotent by HTTP spec; POST is not.
- Network timeouts create ambiguity — clients must retry, but retries on non-idempotent endpoints cause duplicate side effects like double charges.
- Idempotency keys let the client signal "this is the same logical operation" so the server can deduplicate accordingly, returning the stored result instead of re-executing.
- Atomic key insertion (inside the same DB transaction as the business record) prevents the race condition where two concurrent retries both pass the key-not-found check.
- At-least-once delivery + idempotent consumers = effectively-once semantics in distributed messaging.
Sources & further reading
- Stripe — Idempotent Requests — Stripe's production-proven explanation and implementation of idempotency keys
- IETF Draft — The Idempotency-Key HTTP Header Field — working draft standardising the header across HTTP APIs
- RFC 9110 § 9.2.2 — Idempotent Methods — the normative HTTP specification definition
- MDN — Idempotent (glossary) — quick reference with per-method classification