Interview Prep · Lesson 02

The API Design Cheat-Sheet

A portable, eight-step method for the whiteboard — plus the tables you'll reach for in every design interview. Open this once before an interview, then close it and go in confident.

⏱ 15 min · Difficulty: advanced · Prereq: Full course + Question Bank (Prep 01)

By the end you'll be able to

- Drive any "design the API for X" prompt through a structured eight-step method without freezing.
- Recall safe/idempotent properties, status codes, and style trade-offs from memory.
- Tick off cross-cutting concerns (auth, idempotency, rate limiting, caching, pagination, versioning) before the interviewer has to prompt you.

The CREDO Method — 8 steps for any API design prompt

The acronym is just a memory aid; what matters is the logic flow. Use these eight steps in order. Don't skip ahead to endpoints before you've settled requirements — that's the single most common whiteboard mistake.

The CREDO method: Clarify → Resources → Endpoints → Data shapes → Errors → cross-cutting cOncerns → sCale → Evaluate. Iterate after step 8.

Clarify requirements & scale
Before touching the whiteboard, ask. "How many requests per second at peak?" "Is this public or internal?" "Who are the consumers: mobile, browser, server, third-party?" "What's the latency target?" "What consistency model is required: is eventual consistency fine, or must reads always see the latest write?" Surface the constraints that will drive every later decision. Aim to spend 2–3 minutes here. Interviewers reward candidates who ask, because it demonstrates production thinking.
✅ Questions to always ask
- Expected read/write ratio and peak QPS?
- Public API or internal only?
- Multi-tenant (different orgs) or single-tenant?
- Real-time (WebSocket) or request-response (HTTP)?
- Idempotency requirement (payments, order placement)?
- Any legal/compliance constraints (GDPR, PCI)?
Identify resources / entities
List the nouns in the problem. "Design a food delivery API" → resources are: Customer, Restaurant, Menu, MenuItem, Order, OrderItem, Delivery, DeliveryPerson. Draw a simple entity map — which resources are top-level (have their own URI), which are sub-resources (live under a parent), and which are pure associations. Avoid verbs at this stage. Verbs become methods; nouns become resources. If you find yourself naming a resource with a verb (e.g., SendEmail), it probably belongs as an action on a noun resource (POST /emails).
Define endpoints & operations
Map CRUD operations to HTTP methods for each top-level resource. Then handle non-CRUD actions (transitions, triggers) as POST to a sub-resource path (e.g., POST /orders/{id}/cancel rather than a verb in the main path). Decide on path conventions: plural nouns, lowercase, hyphens not underscores, hierarchy depth ≤ 3 levels. Confirm which endpoints are read-heavy (optimize for caching) and which are write-heavy (optimize for durability).
```
# Example: Order resource CRUD + state transition
POST   /v1/orders              # create
GET    /v1/orders/{id}         # read one
GET    /v1/orders              # list (paginated)
PATCH  /v1/orders/{id}         # partial update
DELETE /v1/orders/{id}         # remove (if allowed)
POST   /v1/orders/{id}/cancel  # state transition
```
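
The endpoint list above can be read as a routing table. The sketch below is a minimal illustration, assuming a hand-rolled matcher rather than any real framework; `ROUTES`, `match_route`, and the handler names are hypothetical labels introduced only for this example.

```python
import re

# Hypothetical route table mirroring the Order endpoints listed above.
# Handler names are illustrative labels, not a real framework API.
ROUTES = [
    ("POST",   "/v1/orders",             "create_order"),
    ("GET",    "/v1/orders/{id}",        "get_order"),
    ("GET",    "/v1/orders",             "list_orders"),
    ("PATCH",  "/v1/orders/{id}",        "update_order"),
    ("DELETE", "/v1/orders/{id}",        "delete_order"),
    ("POST",   "/v1/orders/{id}/cancel", "cancel_order"),
]


def _compile(template: str) -> re.Pattern:
    """Turn '/v1/orders/{id}' into a regex with one named group per placeholder."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


_COMPILED = [(method, _compile(path), handler) for method, path, handler in ROUTES]


def match_route(method: str, path: str):
    """Return (handler_name, path_params), or None if no route matches."""
    for route_method, regex, handler in _COMPILED:
        match = regex.match(path)
        if route_method == method and match:
            return handler, match.groupdict()
    return None
```

Because the state transition is its own noun-like sub-path, `match_route("POST", "/v1/orders/ord_1234/cancel")` resolves to a distinct handler without overloading `PATCH`.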
Request & response shapes
For each endpoint, sketch the request body and response body. Use real field names, types, and formats. ISO 8601 for timestamps. Enums as strings, not magic integers. Avoid leaking internal column names or storage-layer types. For collections, always include a consistent envelope: { data: [...], next_cursor: "…", total_estimate: N }. For singletons, return the resource directly. Use consistent naming (snake_case or camelCase — pick one and never mix).
```
# POST /v1/orders — request body
{
  "restaurant_id": "rest_99",
  "items": [
    { "menu_item_id": "item_7", "quantity": 2 }
  ],
  "delivery_address_id": "addr_44"
}

# 201 Created — response body
{
  "id": "ord_1234",
  "status": "pending",
  "created_at": "2026-06-20T10:00:00Z",
  "estimated_delivery_at": "2026-06-20T10:35:00Z"
}
```
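
One way to keep response shapes honest is to serialize from an explicit public type instead of a storage row. This is a sketch under assumptions: `OrderStatus`, `OrderResponse`, `iso8601_utc`, and `to_json` are hypothetical names, and the `cancelled` status is invented here for illustration (only `pending` appears in the example above).

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class OrderStatus(str, Enum):
    # String-valued enum: the wire format stays readable, no magic integers.
    PENDING = "pending"
    CANCELLED = "cancelled"  # assumed extra state, not from the lesson


@dataclass
class OrderResponse:
    id: str
    status: OrderStatus
    created_at: datetime
    estimated_delivery_at: datetime


def iso8601_utc(moment: datetime) -> str:
    """Format an aware datetime as ISO 8601 in UTC with a trailing 'Z'."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def to_json(order: OrderResponse) -> str:
    """Serialize only the public fields, in snake_case throughout."""
    body = {
        "id": order.id,
        "status": order.status.value,
        "created_at": iso8601_utc(order.created_at),
        "estimated_delivery_at": iso8601_utc(order.estimated_delivery_at),
    }
    return json.dumps(body)


created = datetime(2026, 6, 20, 10, 0, tzinfo=timezone.utc)
order = OrderResponse("ord_1234", OrderStatus.PENDING, created, created + timedelta(minutes=35))
print(to_json(order))
```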
Errors & status codes
Sketch the error contract as carefully as the success path. Use the right status code family (4xx = client fault, 5xx = server fault). Return a structured error body — a machine-readable code and a human-readable message at minimum: { "error": { "code": "ORDER_NOT_FOUND", "message": "Order ord_999 does not exist." } }. Never expose stack traces. Define the error catalog up front — interviewers notice when candidates treat errors as an afterthought.

Cross-cutting concerns

This is where most interview answers fall short. Work through this checklist explicitly for every design:

Concern	Default choice	When to deviate
Authentication	Bearer token (OAuth 2.0 / JWT) in `Authorization` header	Machine-to-machine: API key + secret. Cookie auth only for same-site browser flows.
Authorization	Check ownership inside every handler; never trust caller-supplied IDs alone	Add RBAC/ABAC for multi-role systems; use DB-level RLS for multi-tenant
Idempotency	Require `Idempotency-Key` header on all mutating, non-replayable operations	Omit for pure read endpoints; skip if the operation is naturally idempotent (PUT, DELETE)
Rate limiting	Per-client token bucket; return 429 + `Retry-After`	Separate limits for burst (per-second) vs sustained (per-day). Higher limits for trusted partners.
Pagination	Cursor-based for large/live datasets	Offset for small static admin lists that need arbitrary page jumps
Versioning	URI prefix: `/v1/…`	Header versioning for strict URL hygiene; use `Deprecation` + `Sunset` headers for lifecycle
Caching	Set `Cache-Control` on GET responses; use ETags for conditional requests	Distributed cache (Redis) for read-heavy computed data; CDN edge cache for public read-heavy GETs
Observability	Structured logs with trace ID on every request/response	Add metrics (p99 latency, error rate) per endpoint; distributed tracing (OpenTelemetry) for multi-service
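
The rate-limiting row can be made concrete with a small token bucket. This is a single-process sketch, not a production limiter: the `TokenBucket` class, its parameters, and the injectable `now` argument (there so the behaviour can be checked without sleeping) are all assumptions of this example.

```python
import math
import time
from typing import Optional, Tuple


class TokenBucket:
    """One bucket per client: capacity bounds bursts, refill_per_sec the sustained rate."""

    def __init__(self, capacity: int, refill_per_sec: float, now: Optional[float] = None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> Tuple[int, dict]:
        """Return (status_code, extra_headers) for one incoming request."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 200, {}
        # Seconds until one whole token exists, rounded up for Retry-After.
        wait = math.ceil((1.0 - self.tokens) / self.refill_per_sec)
        return 429, {"Retry-After": str(wait)}
```

A shared deployment would keep the bucket state in a store such as Redis so every gateway instance sees the same counts; that part is left out here.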

Scale & latency budget
Work back from the requirements you captured in step 1. State your latency target (e.g., "p99 under 200 ms") and trace where time is spent: network (client ↔ gateway ↔ service ↔ DB). Identify the hot path — the critical read or write endpoint that accounts for 80%+ of traffic — and explain how you'd scale it: read replicas, caching, sharding, CDN offload, async processing. Mention which operations require strong consistency (financial ledger, inventory) and which tolerate eventual consistency (feed, recommendations, notifications).
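
A quick way to talk through the budget is to write the hops down and add them up. The numbers below are hypothetical placeholders, and adding per-hop figures is a planning simplification since percentiles do not add exactly; the point is only to show where the headroom goes.

```python
# Hypothetical per-hop budgets in milliseconds; real values come from measurement.
P99_TARGET_MS = 200
HOP_BUDGET_MS = {
    "client_to_gateway": 60,
    "gateway_to_service": 10,
    "service_logic": 40,
    "service_to_db": 50,
}


def remaining_headroom_ms(target_ms: int, hops: dict) -> int:
    """Budget left after every hop on the hot path is accounted for."""
    return target_ms - sum(hops.values())


def slowest_hop(hops: dict) -> str:
    """The hop to look at first when the budget is blown."""
    return max(hops, key=hops.get)


print(remaining_headroom_ms(P99_TARGET_MS, HOP_BUDGET_MS), slowest_hop(HOP_BUDGET_MS))
```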
Evaluate & iterate
Verbally score your own design: "The main weakness here is the fan-out write cost for high-follower accounts — I'd explore a hybrid push/pull strategy to address that." This signals self-awareness and production readiness. Ask the interviewer: "Does this look right for the scale you had in mind?" Iteration is expected and earns points. A candidate who asks for feedback is collaborating, not just performing.

Quick-reference tables

HTTP methods — safe & idempotent

Method	Safe?	Idempotent?	Has request body?	Typical use
`GET`	Yes	Yes	No (a body has no defined semantics; servers may ignore or reject it)	Retrieve resource or collection
`HEAD`	Yes	Yes	No	Metadata check; existence test
`OPTIONS`	Yes	Yes	No	CORS preflight; capability discovery
`POST`	No	No	Yes	Create resource; trigger action
`PUT`	No	Yes	Yes	Full replacement of a known resource
`PATCH`	No	No*	Yes	Partial update
`DELETE`	No	Yes	Rarely	Remove resource

* PATCH is defined in RFC 5789 (not RFC 9110) and is not required to be idempotent, though implementations often make it so.
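
The table translates directly into a lookup that a client library could use to decide whether a timed-out request is safe to resend. This is a sketch; `METHOD_TRAITS` and `can_retry_blindly` are hypothetical names, and treating an idempotency key as making any method retryable is a design assumption of this example.

```python
from typing import NamedTuple


class MethodTraits(NamedTuple):
    safe: bool
    idempotent: bool


# Mirrors the table above; PATCH is treated as non-idempotent by default.
METHOD_TRAITS = {
    "GET": MethodTraits(True, True),
    "HEAD": MethodTraits(True, True),
    "OPTIONS": MethodTraits(True, True),
    "POST": MethodTraits(False, False),
    "PUT": MethodTraits(False, True),
    "PATCH": MethodTraits(False, False),
    "DELETE": MethodTraits(False, True),
}


def can_retry_blindly(method: str, has_idempotency_key: bool = False) -> bool:
    """True when resending after a timeout cannot change the final state."""
    traits = METHOD_TRAITS[method.upper()]
    return traits.idempotent or has_idempotency_key
```

Every safe method in the table is also idempotent, but not the other way round: `PUT` and `DELETE` change state yet can be repeated.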

HTTP status-code families

Family	Meaning	Key codes to know
1xx	Informational / protocol switch	101 Switching Protocols (WebSocket upgrade)
2xx	Success	200 OK · 201 Created · 202 Accepted (async) · 204 No Content
3xx	Redirect / cache	301 Moved Permanently · 302 Found · 304 Not Modified (ETag match) · 307 Temporary Redirect
4xx	Client error — fix your request	400 Bad Request · 401 Unauthorized (meaning unauthenticated) · 403 Forbidden · 404 Not Found · 409 Conflict · 410 Gone · 422 Unprocessable Content · 429 Too Many Requests
5xx	Server fault — retry may help	500 Internal Server Error · 502 Bad Gateway · 503 Service Unavailable · 504 Gateway Timeout
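
On the client side, the families reduce to a first-digit check plus a short list of codes worth retrying. The sketch below is illustrative; `status_family`, `should_retry`, and the exact contents of `RETRYABLE` are assumptions of this example, and a real client would tune that set per API.

```python
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client_error",
    5: "server_error",
}

# Assumed retry policy for this sketch: throttling plus gateway/availability faults.
RETRYABLE = {429, 502, 503, 504}


def status_family(code: int) -> str:
    """Classify a status code by its first digit."""
    return FAMILIES.get(code // 100, "unknown")


def should_retry(code: int) -> bool:
    """Most 4xx codes need a changed request, so only the listed codes are retried."""
    return code in RETRYABLE
```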

Which API style — REST vs GraphQL vs gRPC

Criterion	REST	GraphQL	gRPC
Best for	Public APIs, CRUD, anywhere caching matters	Complex client data requirements, rapid frontend iteration	Internal service-to-service, high throughput, streaming
Protocol	HTTP/1.1 or HTTP/2	HTTP/1.1 or HTTP/2 (single endpoint)	HTTP/2 required
Payload format	JSON (or any with content negotiation)	JSON	Protocol Buffers (binary)
Caching	Easy — URL is the cache key	Hard — all queries hit one URL	Not built-in; custom per use-case
Browser support	Native	Native (HTTP)	Needs gRPC-Web proxy
Over/under fetching	Client receives full resource shape	Client specifies exact fields needed	Fixed message shape per RPC; over-fetching is possible unless field masks are used
Streaming	SSE or WebSocket bolt-on	Subscriptions (WebSocket)	Native bidirectional streaming
Schema enforcement	Optional (OpenAPI)	Required (SDL)	Required (proto files)

Cross-cutting concerns checklist

🎯 Tick off every item before you say "done"

#	Concern	One-liner to say out loud
1	Authentication	"Every request carries a Bearer token validated at the gateway."
2	Authorization / ownership	"Handlers scope DB queries to the authenticated user/org — no BOLA."
3	Idempotency	"Mutating endpoints require an `Idempotency-Key`; the server deduplicates on it."
4	Rate limiting	"Token bucket per client in Redis; 429 + `Retry-After` on breach."
5	Pagination	"Cursor-based pagination; opaque cursor in response; no offset."
6	Versioning	"URI versioning (`/v1/…`); breaking changes = new version; deprecation header + sunset date."
7	Caching	"GET responses carry `Cache-Control: max-age` + ETag; Redis for computed aggregates."
8	Error contract	"Structured JSON error body with machine-readable code; no stack traces."
9	Observability	"Trace ID on every request; structured logs; p99 latency metric per endpoint."
10	TLS	"HTTPS only; HSTS; certificates auto-renewed."
11	Input validation	"Schema-validated deserialization at entry point; 400 on first invalid field."
12	Retries & timeouts	"Clients: exponential backoff + jitter. Server: timeout budget per downstream call."
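
Item 12 is the one candidates most often name without being able to sketch. Below is a minimal version of the "full jitter" variant, where each delay is drawn uniformly between zero and a capped exponential bound; `backoff_delays` and its default values are hypothetical, and the injectable `rng` exists only to make the example repeatable.

```python
import random


def backoff_delays(attempts: int, base: float = 0.1, cap: float = 5.0, rng=None):
    """Full-jitter delays: uniform in [0, min(cap, base * 2**n)] for attempt n."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap, base * (2 ** n))) for n in range(attempts)]
```

The cap stops the bound from growing without limit, and the random draw spreads out clients that all failed at the same moment.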

Method as a flow

The blue dashed loop shows that step 8 feeds back into the design — you often revisit endpoints and shapes once scale or error constraints are clearer.

Patterns worth memorising

Async with polling

```
# Step 1: accept the work
POST /v1/exports
  ← 202 Accepted
  { "job_id": "job_77", "status_url": "/v1/exports/job_77" }

# Step 2: client polls until done
GET /v1/exports/job_77
  ← 200 { "status": "processing", "progress_pct": 62 }
  ← 200 { "status": "complete", "download_url": "..." }
```
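
The server side of this pattern is little more than a job table with two writers. The sketch below keeps everything in memory and is illustrative only; `submit_export`, `report_progress`, `complete`, `poll_export`, and the `JOB_NOT_FOUND` code are hypothetical names, and the download URL is supplied by the caller rather than invented here.

```python
import itertools

# In-memory stand-ins for a job table and an id sequence (hypothetical).
_jobs = {}
_job_numbers = itertools.count(77)


def submit_export():
    """POST /v1/exports: record the job and answer 202 without doing the work."""
    job_id = f"job_{next(_job_numbers)}"
    _jobs[job_id] = {"status": "processing", "progress_pct": 0}
    return 202, {"job_id": job_id, "status_url": f"/v1/exports/{job_id}"}


def report_progress(job_id: str, pct: int) -> None:
    """Called by the background worker as it makes progress."""
    _jobs[job_id]["progress_pct"] = pct


def complete(job_id: str, download_url: str) -> None:
    """Called by the worker once the file is ready."""
    _jobs[job_id] = {"status": "complete", "download_url": download_url}


def poll_export(job_id: str):
    """GET /v1/exports/{job_id}: 200 with the current state, 404 for unknown jobs."""
    job = _jobs.get(job_id)
    if job is None:
        return 404, {"error": {"code": "JOB_NOT_FOUND",
                               "message": f"Job {job_id} does not exist."}}
    return 200, dict(job)
```

Polling always returns 200 while the job exists; the job's own `status` field, not the HTTP status, tells the client whether to keep waiting.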

Idempotent POST with Idempotency-Key

```
POST /v1/payments
Idempotency-Key: a3f9c2d1-…
Content-Type: application/json

{ "amount": 9900, "currency": "usd" }

# First call → executes charge → 201 Created
# Second call with same key → 200 OK (cached response, no double charge)
```

Cursor-based pagination response envelope

```
GET /v1/orders?limit=20&cursor=eyJpZCI6MjMwfQ

200 OK
{
  "data": [ /* 20 orders */ ],
  "next_cursor": "eyJpZCI6MjEwfQ",    // null if last page
  "has_more": true,
  "total_estimate": 3842                 // approximate
}
```
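
The cursor strings in the example above decode as unpadded base64 of `{"id":230}` and `{"id":210}`, which suggests a newest-first keyset over the order id. The sketch below reproduces that behaviour under this assumption; `encode_cursor`, `decode_cursor`, and `list_orders` are hypothetical names, the ids are plain integers for simplicity, and `total_estimate` is left out.

```python
import base64
import json


def encode_cursor(last_id: int) -> str:
    """Opaque cursor: compact JSON, URL-safe base64, padding stripped."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


def decode_cursor(cursor: str) -> int:
    """Restore the padding and read back the boundary id."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["id"]


def list_orders(order_ids, limit=20, cursor=None):
    """Newest-first page of ids strictly below the cursor position."""
    ordered = sorted(order_ids, reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        ordered = [i for i in ordered if i < boundary]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    return {
        "data": page,
        "next_cursor": encode_cursor(page[-1]) if has_more else None,
        "has_more": has_more,
    }
```

A real handler would push the `id < boundary` filter and the limit into the database query; filtering in memory here just keeps the example self-contained.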

Structured error body

```
422 Unprocessable Content
Content-Type: application/json

{
  "error": {
    "code": "VALIDATION_FAILED",
    "message": "Request validation failed.",
    "details": [
      { "field": "amount", "reason": "must be a positive integer" }
    ],
    "request_id": "req_9kx2m"
  }
}
```
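
Building every error through one helper is a simple way to keep the contract uniform across endpoints. The sketch below is illustrative; `error_response` and `validate_payment` are hypothetical names, and the single `amount` rule is taken from the example above.

```python
import json
from typing import List, Optional


def error_response(status: int, code: str, message: str, request_id: str,
                   details: Optional[List[dict]] = None):
    """Return (status, headers, body) in the structured error shape shown above."""
    error = {"code": code, "message": message}
    if details:
        error["details"] = details
    error["request_id"] = request_id
    return status, {"Content-Type": "application/json"}, json.dumps({"error": error})


def validate_payment(payload: dict) -> List[dict]:
    """Return a list of field problems; an empty list means the payload is valid."""
    problems = []
    amount = payload.get("amount")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        problems.append({"field": "amount", "reason": "must be a positive integer"})
    return problems
```

The machine-readable `code` is what clients branch on; the `message` is for humans and can change wording without breaking anyone.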

⚠️ The three traps that sink design interviews

- Jumping to endpoints before clarifying requirements. You design a WebSocket chat API and then learn the requirement is email-based async messaging. Always clarify first.
- Forgetting cross-cutting concerns. A design with perfect endpoints but no mention of auth, rate limiting, or idempotency signals junior thinking. Work through the checklist in step 6.
- Presenting a finished design as final. Real systems involve trade-offs. Name the weaknesses of your design, offer an alternative direction, and invite the interviewer's feedback.

🎯 Say this out loud at the start of every design interview

"Before I sketch any endpoints, let me understand the requirements: What scale are we designing for? Who consumes this API — internal services, browser clients, mobile, or third parties? And is there a consistency requirement I should be aware of?"

One short opener that signals you're thinking like a senior engineer, not a student rushing to write URLs on a whiteboard.

Key takeaways

- Follow the CREDO method in order: Clarify → Resources → Endpoints → Data shapes → Errors → Cross-cutting concerns → Scale → Evaluate.
- Never design endpoints before you understand who calls them and at what scale.
- Cross-cutting concerns (auth, idempotency, rate limiting, caching, pagination, versioning) are where senior thinking is demonstrated, so work through the checklist explicitly.
- Name your design's weaknesses; iteration signals production experience.
- Safe means the method is read-only and requests no state change; idempotent means N identical calls leave the server in the same state as one call. Memorise which method is which.

Sources & further reading

Roy Fielding — Architectural Styles and the Design of Network-based Software Architectures (REST constraints, original dissertation)
MDN — HTTP response status codes
MDN — HTTP request methods (safe / idempotent definitions)
RFC 9110 — HTTP Semantics (authoritative method definitions)
Stripe — Idempotent requests (production example of Idempotency-Key)
Google SRE Book — Service Level Objectives
OWASP API Security Top 10 — BOLA
AWS — Exponential Backoff and Jitter
OpenAPI Specification