Interview Prep · Lesson 02
The API Design Cheat-Sheet
A portable, eight-step method for the whiteboard — plus the tables you'll reach for in every design interview. Open this once before an interview, then close it and go in confident.
By the end you'll be able to
- Drive any "design the API for X" prompt through a structured eight-step method without freezing.
- Recall safe/idempotent properties, status codes, and style trade-offs from memory.
- Tick off cross-cutting concerns (auth, idempotency, rate limiting, caching, pagination, versioning) before the interviewer has to prompt you.
The CREDO Method — 8 steps for any API design prompt
The acronym is just a memory aid; what matters is the logic flow. Use these eight steps in order. Don't skip ahead to endpoints before you've settled requirements — that's the single most common whiteboard mistake.
- Clarify requirements & scale
Before touching the whiteboard, ask. "How many requests per second at peak?" "Is this public or internal?" "Who are the consumers: mobile, browser, server, third-party?" "What's the latency target?" "What consistency model is required: is eventual consistency fine, or must reads always see the latest write?" Surface the constraints that will drive every later decision. Aim to spend 2–3 minutes here. Interviewers reward candidates who ask; it demonstrates production thinking.
✅ Questions to always ask
- Expected read/write ratio and peak QPS?
- Public API or internal only?
- Multi-tenant (different orgs) or single-tenant?
- Real-time (WebSocket) or request-response (HTTP)?
- Idempotency requirement (payments, order placement)?
- Any legal/compliance constraints (GDPR, PCI)?
- Identify resources / entities
List the nouns in the problem. "Design a food delivery API" → resources are: Customer, Restaurant, Menu, MenuItem, Order, OrderItem, Delivery, DeliveryPerson. Draw a simple entity map — which resources are top-level (have their own URI), which are sub-resources (live under a parent), and which are pure associations. Avoid verbs at this stage. Verbs become methods; nouns become resources. If you find yourself naming a resource with a verb (e.g., SendEmail), it probably belongs as an action on a noun resource (POST /emails).
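A minimal sketch of that entity map in Python. Which resources nest under which parent is an assumption made for illustration (the prompt does not fix it), and the `Resource` class with its fields is a hypothetical helper, not part of any framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """One noun from the problem statement (hypothetical helper)."""
    name: str                             # singular; used for the path parameter
    plural: str                           # URI segment: plural, lowercase, hyphenated
    parent: Optional["Resource"] = None   # None means top-level

    @property
    def collection_path(self) -> str:
        # Sub-resources live under their parent's item path.
        prefix = self.parent.item_path if self.parent else ""
        return f"{prefix}/{self.plural}"

    @property
    def item_path(self) -> str:
        return f"{self.collection_path}/{{{self.name}_id}}"


# Assumed nesting for the food delivery example.
restaurant = Resource("restaurant", "restaurants")
menu_item = Resource("menu_item", "menu-items", parent=restaurant)
order = Resource("order", "orders")
order_item = Resource("order_item", "items", parent=order)

for resource in (restaurant, menu_item, order, order_item):
    print(resource.item_path)
```

Writing the map down this way makes the top-level versus sub-resource decision explicit before any HTTP method is chosen.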
- Define endpoints & operations
Map CRUD operations to HTTP methods for each top-level resource. Then handle non-CRUD actions (transitions, triggers) as POST to a sub-resource path (e.g., POST /orders/{id}/cancel rather than a verb in the main path). Decide on path conventions: plural nouns, lowercase, hyphens not underscores, hierarchy depth ≤ 3 levels. Confirm which endpoints are read-heavy (optimize for caching) and which are write-heavy (optimize for durability).
# Example: Order resource CRUD + state transition
POST   /v1/orders              # create
GET    /v1/orders/{id}         # read one
GET    /v1/orders              # list (paginated)
PATCH  /v1/orders/{id}         # partial update
DELETE /v1/orders/{id}         # remove (if allowed)
POST   /v1/orders/{id}/cancel  # state transition
- Request & response shapes
For each endpoint, sketch the request body and response body. Use real field names, types, and formats. ISO 8601 for timestamps. Enums as strings, not magic integers. Avoid leaking internal column names or storage-layer types. For collections, always include a consistent envelope: { data: [...], next_cursor: "…", total_estimate: N }. For singletons, return the resource directly. Use consistent naming (snake_case or camelCase; pick one and never mix).
# POST /v1/orders: request body
{
  "restaurant_id": "rest_99",
  "items": [ { "menu_item_id": "item_7", "quantity": 2 } ],
  "delivery_address_id": "addr_44"
}
# 201 Created: response body
{
  "id": "ord_1234",
  "status": "pending",
  "created_at": "2026-06-20T10:00:00Z",
  "estimated_delivery_at": "2026-06-20T10:35:00Z"
}
- Errors & status codes
Sketch the error contract as carefully as the success path. Use the right status code family (4xx = client fault, 5xx = server fault). Return a structured error body with, at minimum, a machine-readable code and a human-readable message: { "error": { "code": "ORDER_NOT_FOUND", "message": "Order ord_999 does not exist." } }. Never expose stack traces. Define the error catalog up front; interviewers notice when candidates treat errors as an afterthought.
- Cross-cutting concerns
This is where most interview answers fall short. Work through this checklist explicitly for every design:
| Concern | Default choice | When to deviate |
|---|---|---|
| Authentication | Bearer token (OAuth 2.0 / JWT) in Authorization header | Machine-to-machine: API key + secret. Cookie auth only for same-site browser flows. |
| Authorization | Check ownership inside every handler; never trust caller-supplied IDs alone | Add RBAC/ABAC for multi-role systems; use DB-level RLS for multi-tenant |
| Idempotency | Require Idempotency-Key header on all mutating, non-replayable operations | Omit for pure read endpoints; skip if the operation is naturally idempotent (PUT, DELETE) |
| Rate limiting | Per-client token bucket; return 429 + Retry-After | Separate limits for burst (per-second) vs sustained (per-day). Higher limits for trusted partners. |
| Pagination | Cursor-based for large/live datasets | Offset for small static admin lists that need arbitrary page jumps |
| Versioning | URI prefix: /v1/… | Header versioning for strict URL hygiene; use Deprecation + Sunset headers for lifecycle |
| Caching | Set Cache-Control on GET responses; use ETags for conditional requests | Distributed cache (Redis) for read-heavy computed data; CDN edge cache for public read-heavy GETs |
| Observability | Structured logs with trace ID on every request/response | Add metrics (p99 latency, error rate) per endpoint; distributed tracing (OpenTelemetry) for multi-service |
- Scale & latency budget
Work back from the requirements you captured in step 1. State your latency target (e.g., "p99 under 200 ms") and trace where time is spent: network (client ↔ gateway ↔ service ↔ DB). Identify the hot path — the critical read or write endpoint that accounts for 80%+ of traffic — and explain how you'd scale it: read replicas, caching, sharding, CDN offload, async processing. Mention which operations require strong consistency (financial ledger, inventory) and which tolerate eventual consistency (feed, recommendations, notifications).
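One way to make the latency budget concrete is to write the hops down and check them against the target. A rough sketch, assuming the "p99 under 200 ms" example above; every per-hop number here is invented for illustration, and summing per-hop allowances is a conservative planning shortcut, not a statistical statement about how percentiles combine.

```python
TARGET_P99_MS = 200  # the example target from the text

# Hypothetical per-hop allowances in milliseconds (assumptions, not measurements).
budget_ms = {
    "client <-> gateway": 60,
    "gateway <-> service": 15,
    "service logic": 40,
    "service <-> DB": 50,
}


def remaining_headroom(budget: dict, target_ms: int) -> int:
    """Milliseconds left after every hop has spent its allowance."""
    return target_ms - sum(budget.values())


def largest_hop(budget: dict) -> str:
    """The hop to look at first when the budget is blown."""
    return max(budget, key=budget.get)


print(remaining_headroom(budget_ms, TARGET_P99_MS), "ms of headroom")
print("biggest spender:", largest_hop(budget_ms))
```

Saying the numbers out loud, even rough ones, gives the interviewer something specific to push back on.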
- Evaluate & iterate
Verbally score your own design: "The main weakness here is the fan-out write cost for high-follower accounts — I'd explore a hybrid push/pull strategy to address that." This signals self-awareness and production readiness. Ask the interviewer: "Does this look right for the scale you had in mind?" Iteration is expected and earns points. A candidate who asks for feedback is collaborating, not just performing.
Quick-reference tables
HTTP methods — safe & idempotent
| Method | Safe? | Idempotent? | Has request body? | Typical use |
|---|---|---|---|---|
| GET | Yes | Yes | No (allowed, but no defined semantics) | Retrieve resource or collection |
| HEAD | Yes | Yes | No | Metadata check; existence test |
| OPTIONS | Yes | Yes | No | CORS preflight; capability discovery |
| POST | No | No | Yes | Create resource; trigger action |
| PUT | No | Yes | Yes | Full replacement of a known resource |
| PATCH | No | No* | Yes | Partial update |
| DELETE | No | Yes | Rarely | Remove resource |
* PATCH (defined in RFC 5789, not RFC 9110) is not required to be idempotent, though implementations often make it so.
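The table matters most when deciding whether a failed call can be retried blindly. A small sketch of that decision in Python; `METHODS` mirrors the table, while `can_auto_retry` and its `has_idempotency_key` parameter are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class MethodProps(NamedTuple):
    safe: bool
    idempotent: bool


# Mirrors the table: (safe, idempotent) per method.
METHODS = {
    "GET": MethodProps(True, True),
    "HEAD": MethodProps(True, True),
    "OPTIONS": MethodProps(True, True),
    "POST": MethodProps(False, False),
    "PUT": MethodProps(False, True),
    "PATCH": MethodProps(False, False),
    "DELETE": MethodProps(False, True),
}


def can_auto_retry(method: str, has_idempotency_key: bool = False) -> bool:
    """Retrying is only safe when repeating the call cannot change state twice."""
    props = METHODS[method.upper()]
    return props.idempotent or has_idempotency_key


print(can_auto_retry("PUT"))                             # idempotent by definition
print(can_auto_retry("POST"))                            # not without a key
print(can_auto_retry("POST", has_idempotency_key=True))  # the key makes it replayable
```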
HTTP status-code families
| Family | Meaning | Key codes to know |
|---|---|---|
| 1xx | Informational / protocol switch | 101 Switching Protocols (WebSocket upgrade) |
| 2xx | Success | 200 OK · 201 Created · 202 Accepted (async) · 204 No Content |
| 3xx | Redirect / cache | 301 Moved Permanently · 302 Found · 304 Not Modified (ETag match) · 307 Temporary Redirect |
| 4xx | Client error: fix your request | 400 Bad Request · 401 Unauthorized (really "unauthenticated") · 403 Forbidden · 404 Not Found · 409 Conflict · 410 Gone · 422 Unprocessable Content · 429 Too Many Requests |
| 5xx | Server fault — retry may help | 500 Internal Server Error · 502 Bad Gateway · 503 Service Unavailable · 504 Gateway Timeout |
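The 304 row is easiest to remember by walking through one conditional GET. A sketch, assuming a hash-derived strong ETag and a single tag in If-None-Match (real requests may list several tags or send `*`); `make_etag` and `conditional_get` are hypothetical names, and the 60-second max-age is arbitrary.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET with optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        # Client copy is current: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body


resource = b'{"id": "ord_1234", "status": "pending"}'
status, headers, _ = conditional_get(resource, None)
print(status, headers["ETag"])
status, _, payload = conditional_get(resource, headers["ETag"])
print(status, len(payload))
```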
Which API style — REST vs GraphQL vs gRPC
| Criterion | REST | GraphQL | gRPC |
|---|---|---|---|
| Best for | Public APIs, CRUD, anywhere caching matters | Complex client data requirements, rapid frontend iteration | Internal service-to-service, high throughput, streaming |
| Protocol | HTTP/1.1 or HTTP/2 | HTTP/1.1 or HTTP/2 (single endpoint) | HTTP/2 required |
| Payload format | JSON (or any with content negotiation) | JSON | Protocol Buffers (binary) |
| Caching | Easy — URL is the cache key | Hard — all queries hit one URL | Not built-in; custom per use-case |
| Browser support | Native | Native (HTTP) | Needs gRPC-Web proxy |
| Over/under fetching | Client receives full resource shape | Client specifies exact fields needed | Fixed message shape per RPC; trimming fields needs field masks |
| Streaming | SSE or WebSocket bolt-on | Subscriptions (WebSocket) | Native bidirectional streaming |
| Schema enforcement | Optional (OpenAPI) | Required (SDL) | Required (proto files) |
Cross-cutting concerns checklist
| # | Concern | One-liner to say out loud |
|---|---|---|
| 1 | Authentication | "Every request carries a Bearer token validated at the gateway." |
| 2 | Authorization / ownership | "Handlers scope DB queries to the authenticated user/org — no BOLA." |
| 3 | Idempotency | "Mutating endpoints require an Idempotency-Key; the server deduplicates on it." |
| 4 | Rate limiting | "Token bucket per client in Redis; 429 + Retry-After on breach." |
| 5 | Pagination | "Cursor-based pagination; opaque cursor in response; no offset." |
| 6 | Versioning | "URI versioning (/v1/…); breaking changes = new version; deprecation header + sunset date." |
| 7 | Caching | "GET responses carry Cache-Control: max-age + ETag; Redis for computed aggregates." |
| 8 | Error contract | "Structured JSON error body with machine-readable code; no stack traces." |
| 9 | Observability | "Trace ID on every request; structured logs; p99 latency metric per endpoint." |
| 10 | TLS | "HTTPS only; HSTS; certificates auto-renewed." |
| 11 | Input validation | "Schema-validated deserialization at entry point; 400 on first invalid field." |
| 12 | Retries & timeouts | "Clients: exponential backoff + jitter. Server: timeout budget per downstream call." |
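Row 4 is the one interviewers most often ask you to unpack. A minimal in-process token bucket, assuming one bucket per client; the Redis-backed variant named in the checklist would need the same refill arithmetic done atomically. `TokenBucket` and its parameters are illustrative names, and time is passed in explicitly so the logic is easy to test.

```python
from typing import Tuple


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity      # start full
        self.updated_at = now

    def allow(self, now: float) -> Tuple[bool, float]:
        """Return (allowed, seconds_until_next_token)."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0.0
        # Rejected: the caller would answer 429 and round this up for Retry-After.
        return False, (1 - self.tokens) / self.refill_per_sec


bucket = TokenBucket(capacity=2, refill_per_sec=1.0, now=0.0)
for t in (0.0, 0.0, 0.0, 0.5, 1.5):
    print(t, bucket.allow(t))
```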
[Figure: Method as a flow, the eight steps in sequence]
Patterns worth memorising
Async with polling
# Step 1: accept the work
POST /v1/exports
← 202 Accepted
{ "job_id": "job_77", "status_url": "/v1/exports/job_77" }
# Step 2: client polls until done
GET /v1/exports/job_77
← 200 { "status": "processing", "progress_pct": 62 }
← 200 { "status": "complete", "download_url": "..." }
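On the client side, step 2 is usually a loop with backoff, not a tight spin. A sketch, assuming the status payload shown above; `poll_until_complete`, its parameters, and the delay values are hypothetical, and the HTTP call is injected as `fetch_status` so no particular client library is implied.

```python
import random
import time
from typing import Callable, Dict


def poll_until_complete(
    fetch_status: Callable[[], Dict],
    *,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Dict:
    """Call fetch_status until it reports "complete", backing off between tries."""
    for attempt in range(max_attempts):
        body = fetch_status()
        if body.get("status") == "complete":
            return body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * ceiling)
    raise TimeoutError("job did not complete within the polling budget")
```

In use, `fetch_status` would wrap a GET to the `status_url` returned by the 202 response and return the parsed JSON body.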
Idempotent POST with Idempotency-Key
POST /v1/payments
Idempotency-Key: a3f9c2d1-…
Content-Type: application/json
{ "amount": 9900, "currency": "usd" }
# First call → executes charge → 201 Created
# Second call with same key → 200 OK (cached response, no double charge)
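Server-side, the deduplication can be sketched as a lookup keyed on the header value. This is a simplified in-memory illustration, not how any particular payment provider implements it: `IdempotencyStore` is a hypothetical name, and a production version would also need an atomic insert, an expiry, and a check that the replayed request body matches the original.

```python
from typing import Callable, Dict, Tuple


class IdempotencyStore:
    """Remembers the first result per key and replays it on repeats."""

    def __init__(self) -> None:
        self._results: Dict[str, dict] = {}

    def execute(self, key: str, handler: Callable[[], dict]) -> Tuple[dict, bool]:
        """Return (result, replayed). The handler runs at most once per key."""
        if key in self._results:
            return self._results[key], True
        result = handler()
        self._results[key] = result
        return result, False


charges = []  # stands in for the real side effect


def charge() -> dict:
    charges.append(9900)
    return {"id": "pay_1", "amount": 9900, "currency": "usd"}


store = IdempotencyStore()
first, replayed_first = store.execute("a3f9c2d1", charge)
second, replayed_second = store.execute("a3f9c2d1", charge)
print(len(charges), replayed_first, replayed_second)
```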
Cursor-based pagination response envelope
GET /v1/orders?limit=20&cursor=eyJpZCI6MjMwfQ
200 OK
{
"data": [ /* 20 orders */ ],
"next_cursor": "eyJpZCI6MjEwfQ", // null if last page
"has_more": true,
"total_estimate": 3842 // approximate
}
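The cursors in this example decode to a small JSON object holding the last-seen id, which suggests one way to implement the endpoint. A sketch, assuming rows are sorted by id descending and held in memory; `encode_cursor`, `decode_cursor`, and `list_page` are hypothetical helpers, and a real service would push the `id < before_id` filter into the database query.

```python
import base64
import json
from typing import Dict, List, Optional


def encode_cursor(payload: dict) -> str:
    """Compact JSON, URL-safe base64, padding stripped."""
    raw = json.dumps(payload, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


def decode_cursor(cursor: str) -> dict:
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def list_page(rows: List[Dict], limit: int, cursor: Optional[str] = None) -> Dict:
    """Return one page of rows that come after the cursor position."""
    before_id = decode_cursor(cursor)["id"] if cursor else None
    eligible = [r for r in rows if before_id is None or r["id"] < before_id]
    page = eligible[:limit]
    has_more = len(eligible) > limit
    next_cursor = encode_cursor({"id": page[-1]["id"]}) if has_more else None
    return {"data": page, "next_cursor": next_cursor, "has_more": has_more}


orders = [{"id": i} for i in range(300, 0, -1)]   # newest first
page = list_page(orders, limit=20, cursor="eyJpZCI6MjMwfQ")
print(page["data"][0]["id"], page["data"][-1]["id"], page["next_cursor"])
```

Base64 only makes the cursor opaque by convention; clients can still decode it, so treat its contents as untrusted input.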
Structured error body
422 Unprocessable Entity
Content-Type: application/json
{
"error": {
"code": "VALIDATION_FAILED",
"message": "Request validation failed.",
"details": [
{ "field": "amount", "reason": "must be a positive integer" }
],
"request_id": "req_9kx2m"
}
}
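Producing that body from one helper keeps every endpoint's errors in the same shape. A sketch under the assumption that validation collects all failures before responding; `validate_payment` and `error_response` are hypothetical names, and only the `amount` rule from the example is implemented.

```python
from typing import Dict, List, Optional, Tuple


def validate_payment(payload: dict) -> List[Dict[str, str]]:
    """Return a list of field-level problems (empty when the payload is valid)."""
    details = []
    amount = payload.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        details.append({"field": "amount", "reason": "must be a positive integer"})
    return details


def error_response(
    status: int,
    code: str,
    message: str,
    request_id: str,
    details: Optional[List[Dict[str, str]]] = None,
) -> Tuple[int, dict]:
    """Build the (status, body) pair in the structured error shape."""
    error = {"code": code, "message": message, "request_id": request_id}
    if details:
        error["details"] = details
    return status, {"error": error}


problems = validate_payment({"amount": -5, "currency": "usd"})
if problems:
    status, body = error_response(
        422, "VALIDATION_FAILED", "Request validation failed.", "req_9kx2m", problems
    )
    print(status, body)
```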
Common mistakes
- Jumping to endpoints before clarifying requirements. You design a WebSocket chat API and then learn the requirement is email-based async messaging. Always clarify first.
- Forgetting cross-cutting concerns. A design with perfect endpoints but no mention of auth, rate limiting, or idempotency signals junior thinking. Work through the checklist in step 6.
- Presenting a finished design as final. Real systems involve trade-offs. Name the weaknesses of your design, offer an alternative direction, and invite the interviewer's feedback.
"Before I sketch any endpoints, let me understand the requirements: What scale are we designing for? Who consumes this API — internal services, browser clients, mobile, or third parties? And is there a consistency requirement I should be aware of?"
An opening line that signals you're thinking like a senior engineer, not a student rushing to write URLs on a whiteboard.
Key takeaways
- Follow the CREDO method in order: Clarify → Resources → Endpoints → Data shapes → Errors → Cross-cutting concerns → Scale → Evaluate.
- Never design endpoints before you understand who calls them and at what scale.
- Cross-cutting concerns (auth, idempotency, rate limiting, caching, pagination, versioning) are where senior thinking is demonstrated — work through the checklist explicitly.
- Name your design's weaknesses; iteration signals production experience.
- Safe means no side effects; idempotent means N calls = 1 call in terms of state — memorise which method is which.
Sources & further reading
- Roy Fielding — Architectural Styles and the Design of Network-based Software Architectures (REST constraints, original dissertation)
- MDN — HTTP response status codes
- MDN — HTTP request methods (safe / idempotent definitions)
- RFC 9110 — HTTP Semantics (authoritative method definitions)
- Stripe — Idempotent requests (production example of Idempotency-Key)
- Google SRE Book — Service Level Objectives
- OWASP API Security Top 10 — BOLA
- AWS — Exponential Backoff and Jitter
- OpenAPI Specification