API Design

Foundations · Lesson 07

HTTP & how it evolved

HTTP is the language almost every web API speaks. It's deliberately simple: the client sends a request, the server sends back a response, and the conversation is over. Learn its five-word vocabulary once and you can read any web API on earth.

⏱ 12 min · Difficulty: intro · Prereq: Lesson 01

By the end you'll be able to

The shape of a conversation

HTTP stands for HyperText Transfer Protocol. "Protocol" just means an agreed format so both sides understand each other — like the fixed phrasing of a radio call ("over", "roger"). Every exchange has exactly two messages:

REQUEST
GET /v1/users/42 HTTP/1.1          ← method · path · version
Host: api.example.com
Accept: application/json
Authorization: Bearer …            ← headers
(empty body for GET)               ← body

RESPONSE
HTTP/1.1 200 OK                    ← version · status
Content-Type: application/json
Cache-Control: max-age=60          ← headers
{ "id": 42, "name": "Ada" }        ← body

Same three-part structure on both sides: a first line, a block of headers, then an optional body.

The verbs: HTTP methods

The method says what you want to do to the resource at that path. Five cover almost everything:

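| Method | Means | Safe? | Idempotent? |
|---|---|---|---|
| GET | Read a resource | Yes (no change) | Yes |
| POST | Create / trigger an action | No | No |
| PUT | Replace a resource wholesale | No | Yes |
| PATCH | Partially update | No | No* |
| DELETE | Remove a resource | No | Yes |

* PATCH is not guaranteed to be idempotent; whether a repeat is harmless depends on how the patch is expressed.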

Safe = doesn't change server state. Idempotent = doing it twice has the same effect as doing it once (deleting user 42 twice still leaves user 42 deleted). These two properties drive real decisions — like whether a client can safely retry after a timeout. We give idempotency its own lesson later because interviews love it.
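
The retry decision described above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the table mirrors the one in this lesson, and `safe_to_retry` is a hypothetical helper name.

```python
# Method properties from the table above: (safe, idempotent).
METHOD_PROPERTIES = {
    "GET": (True, True),
    "POST": (False, False),
    "PUT": (False, True),
    "PATCH": (False, False),  # not guaranteed idempotent
    "DELETE": (False, True),
}


def safe_to_retry(method: str) -> bool:
    """Return True if resending the request cannot change the end result.

    Hypothetical helper: after a timeout the client does not know whether
    the server processed the request, so only idempotent methods are resent.
    """
    _safe, idempotent = METHOD_PROPERTIES[method.upper()]
    return idempotent


for name in METHOD_PROPERTIES:
    print(f"{name:6} retry after timeout: {safe_to_retry(name)}")
```

A non-idempotent POST can still be made retryable, but that needs extra machinery on the server side, which is the subject of the later idempotency lesson.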

🎯 Interview angle

"Why POST and not GET to create an order?" Because GET is safe and idempotent — caches, browsers, and crawlers may repeat it freely. If creating an order hid behind a GET, a prefetch or retry could place duplicate orders. Matching the method to the semantics isn't pedantry; it's what makes retries, caching, and crawling safe.

The reply code: status codes

The server's first line carries a three-digit status code. You only need the families:

| Range | Family | Common ones |
|---|---|---|
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirect | 301 Moved Permanently, 304 Not Modified |
| 4xx | You (client) erred | 400 Bad Request, 401 Unauthorized, 404 Not Found, 429 Too Many Requests |
| 5xx | The server erred | 500 Internal Server Error, 503 Service Unavailable |

⚠️ Common trap

Returning 200 OK with {"error": "not found"} in the body. Now every client must parse the body to learn it failed, and caches/monitoring think everything's fine. Let the status code carry the outcome: 404 means not found, full stop. The status line is part of the contract — use it.
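
A small sketch of the same idea in Python, assuming a hypothetical in-memory `USERS` store and a hypothetical `get_user` handler: the status code carries the outcome, and a client or monitor can classify it from the first digit alone without parsing the body.

```python
import json

# Status families keyed by the first digit of the code.
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}

USERS = {42: {"id": 42, "name": "Ada"}}  # hypothetical in-memory data


def status_family(code: int) -> str:
    """Classify a status code by its first digit."""
    return FAMILIES.get(code // 100, "unknown")


def get_user(user_id: int) -> tuple[int, str]:
    """Hypothetical handler: returns (status, body) with an honest status."""
    user = USERS.get(user_id)
    if user is None:
        # The body may explain the error, but the status already says it failed.
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(user)


status, body = get_user(7)
print(status, status_family(status), body)
```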

Why HTTP kept evolving

The vocabulary above barely changed across versions. What changed is how the messages travel over the wire — each version fixed a speed problem the last one had.

HTTP/1.0 (1996): new TCP connection per request
HTTP/1.1 (1997): reuse the connection
HTTP/2 (2015): many streams, one connection
HTTP/3 (2022): over QUIC/UDP, no head-of-line blocking
Each version attacked latency: stop reopening connections → multiplex over one → drop the TCP bottleneck entirely.

✅ Do this, not that

You rarely choose the HTTP version by hand — the server and browser negotiate the best they both support. Do understand the trade so you can answer "would HTTP/2 help here?"; don't claim a version magically makes a single slow database query faster. These versions cut transport overhead, not your backend's work.

Under the hood: how it actually works

HTTP/1.1 is a text protocol — you can type a request by hand with telnet. HTTP/2 is a binary framing layer over the same semantics. Understanding both lets you read any protocol trace.

HTTP/1.1: raw bytes on the wire

An HTTP/1.1 request is literally this text, followed by \r\n\r\n (CRLF blank line separating headers from body):

## Exact bytes sent by the client (→ server)
GET /v1/users/42 HTTP/1.1\r\n
Host: api.example.com\r\n
Accept: application/json\r\n
Authorization: Bearer eyJhbGci...\r\n
\r\n
## Exact bytes the server sends back
HTTP/1.1 200 OK\r\n
Content-Type: application/json\r\n
Content-Length: 31\r\n
Cache-Control: max-age=60\r\n
\r\n
{"id":42,"name":"Ada Lovelace"}

Rules: headers are Name: Value\r\n; a blank line (\r\n) ends the header block; the body is everything after. Content-Length tells the receiver when to stop reading. For chunked responses (Transfer-Encoding: chunked), each chunk is prefixed with its byte count in hex instead.
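
The rules above are enough to split a response by hand. The following Python sketch parses a response like the one shown, whose JSON body is 31 bytes long. `parse_response` is a hypothetical helper; it handles Content-Length only and deliberately ignores chunked encoding.

```python
def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into (status, headers, body)."""
    # The first blank line (CRLF CRLF) separates the header block from the body.
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: version, three-digit code, optional reason phrase.
    status = int(lines[0].split(" ", 2)[1])

    headers: dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Content-Length tells the receiver where the body ends.
    length = int(headers.get("content-length", len(rest)))
    return status, headers, rest[:length]


RAW = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: 31\r\n"
    b"Cache-Control: max-age=60\r\n"
    b"\r\n"
    b'{"id":42,"name":"Ada Lovelace"}'
)

status, headers, body = parse_response(RAW)
print(status, headers["content-type"], len(body))
```

Header names are lowercased on the way in because HTTP header names are case-insensitive.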

The head-of-line blocking problem: in HTTP/1.1, requests on one connection are serialised. If you fire GET /a then GET /b on the same connection, /b cannot start until /a's response is fully received. Browsers work around this by opening up to 6 parallel connections per host — which wastes resources and adds TLS overhead.

HTTP/2: binary frames and multiplexed streams

HTTP/2 replaces the line-oriented text format with a binary framing layer. Every piece of data — headers, body data, stream control — is sent as a frame with a fixed 9-byte header:

## HTTP/2 frame structure (9-byte fixed header + payload)
+-----------------------------------------------+
|                 Length (24 bits)              |
+---------------+---------------+---------------+
|   Type (8)    |   Flags (8)   |
+-+-------------+---------------+-------------------------------+
|R|                Stream Identifier (31 bits)                  |
+=+=============================================================+
|                   Frame Payload (0 to 2^24-1 bytes)          |
+---------------------------------------------------------------+

## Key frame types
HEADERS   (0x1)  — request/response headers (HPACK-compressed)
DATA      (0x0)  — body payload
SETTINGS  (0x4)  — connection-level configuration
WINDOW_UPDATE (0x8) — flow control
PING      (0x6)  — keepalive
RST_STREAM (0x3) — cancel a specific stream

A stream is a virtual request/response pair identified by an integer ID (client uses odd IDs: 1, 3, 5…). Multiple streams can be interleaved on one TCP connection: a DATA frame for stream 1, then for stream 3, then for stream 1 again — the receiver reassembles them. This eliminates HTTP-level head-of-line blocking. A slow stream no longer blocks a fast one.
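
To make the 9-byte header concrete, here is a minimal Python sketch that packs and unpacks it with the standard `struct` module. The function names are hypothetical; the field widths and type codes come from the diagram and list above.

```python
import struct

FRAME_TYPES = {
    0x0: "DATA",
    0x1: "HEADERS",
    0x3: "RST_STREAM",
    0x4: "SETTINGS",
    0x6: "PING",
    0x8: "WINDOW_UPDATE",
}


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the fixed 9-byte header: 24-bit length, type, flags, stream ID."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    # The top bit of the last 32-bit word is reserved (R) and sent as 0.
    word = stream_id & 0x7FFFFFFF
    return length.to_bytes(3, "big") + struct.pack(">BBI", frame_type, flags, word)


def unpack_frame_header(header: bytes) -> tuple[int, int, int, int]:
    """Return (length, type, flags, stream_id) from a 9-byte header."""
    if len(header) != 9:
        raise ValueError("frame header is exactly 9 bytes")
    length = int.from_bytes(header[:3], "big")
    frame_type, flags, word = struct.unpack(">BBI", header[3:])
    return length, frame_type, flags, word & 0x7FFFFFFF


# A DATA frame header on stream 3 with a 16-byte payload and flag 0x1 (END_STREAM).
header = pack_frame_header(16, 0x0, 0x1, 3)
length, frame_type, flags, stream_id = unpack_frame_header(header)
print(header.hex(), FRAME_TYPES[frame_type], length, stream_id)
```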

HPACK header compression

In HTTP/1.1, headers like Authorization: Bearer ... are re-sent verbatim on every request. HTTP/2 compresses them with HPACK: a static table of 61 common headers (e.g. :method GET is index 2) plus a dynamic table of headers seen so far on the connection. On the second request, each repeated header can shrink to a one-byte index, so a 400-byte header set may compress to a handful of bytes. This is especially valuable for mobile APIs that send large Authorization or Cookie headers repeatedly.
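
The effect of the two tables can be illustrated with a deliberately simplified encoder. This sketch is not a complete HPACK implementation: it includes only a few static-table entries, skips Huffman coding, and never evicts from the dynamic table. `ToyHpackEncoder` and the sample headers are hypothetical.

```python
# A few entries from the HPACK static table (indices 1..61 are static).
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":method", "POST"): 3,
    (":path", "/"): 4,
    (":scheme", "https"): 7,
}
FIRST_DYNAMIC_INDEX = 62


def encode_int(value: int, prefix_bits: int, flags: int) -> bytes:
    """HPACK prefix-integer encoding: small values fit in the prefix."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([flags | value])
    out = [flags | limit]
    value -= limit
    while value >= 128:
        out.append(value % 128 + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def encode_string(text: str) -> bytes:
    """Length-prefixed string without Huffman coding."""
    data = text.encode("ascii")
    return encode_int(len(data), 7, 0x00) + data


class ToyHpackEncoder:
    def __init__(self) -> None:
        self.dynamic: list[tuple[str, str]] = []  # newest entry first

    def _index_of(self, header: tuple[str, str]):
        if header in STATIC_TABLE:
            return STATIC_TABLE[header]
        if header in self.dynamic:
            return FIRST_DYNAMIC_INDEX + self.dynamic.index(header)
        return None

    def encode(self, headers: list[tuple[str, str]]) -> bytes:
        out = bytearray()
        for header in headers:
            index = self._index_of(header)
            if index is not None:
                # Indexed header field: high bit set, then the table index.
                out += encode_int(index, 7, 0x80)
            else:
                # Literal with incremental indexing and a new name (0x40),
                # then remember the header for later requests.
                out += b"\x40" + encode_string(header[0]) + encode_string(header[1])
                self.dynamic.insert(0, header)
        return bytes(out)


encoder = ToyHpackEncoder()
request = [
    (":method", "GET"),
    (":path", "/v1/users/42"),
    ("authorization", "Bearer " + "x" * 300),
]
first = encoder.encode(request)
second = encoder.encode(request)
print(len(first), "bytes on the first request,", len(second), "on the second")
```

The second call emits one index byte per header because every header is now in either the static or the dynamic table.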

HTTP/1.1: 3 connections (serial per conn)
  conn 1: [ req A ][ resp A ][ req D ][ resp D ]
  conn 2: [ req B ][ resp B ]
  conn 3: [ req C ][ resp C ]
  3 TCP connections, 3 TLS handshakes, requests queue per connection

HTTP/2: 1 connection (interleaved streams)
  s1 H | s3 H | s5 H | s1 D | s3 D | s5 D | s1 resp | s3 resp | s5 resp
  1 TCP connection, 1 TLS handshake, streams fully interleaved

s1/s3/s5 = stream IDs (client uses odd integers); H = HEADERS frame; D = DATA frame
HTTP/1.1 needs multiple connections to avoid queuing. HTTP/2 multiplexes all streams on one — a slow response on stream 3 does not delay stream 5.

How to debug & inspect it

HTTP is text-based enough (especially 1.1) that the raw conversation is readable. curl and the browser's DevTools cover most cases:

# See the full request/response conversation including headers
$ curl -v https://api.example.com/v1/users/42
> GET /v1/users/42 HTTP/2
> host: api.example.com
> accept: */*
< HTTP/2 200
< content-type: application/json
< cache-control: max-age=60
# Lines starting with > are sent; < are received; * are curl notes

# Force HTTP/1.1 to compare (useful when debugging version-specific issues)
$ curl -v --http1.1 https://api.example.com/v1/users/42

# Show only the response status code and a timing breakdown
$ curl -o /dev/null -s -w "HTTP %{http_code} total=%{time_total}s dns=%{time_namelookup}s connect=%{time_connect}s\n" \
    https://api.example.com/v1/users/42
HTTP 200 total=0.183s dns=0.052s connect=0.072s

# Show only response headers (no body)
$ curl -I https://api.example.com/v1/users/42
HTTP/2 200
content-type: application/json
cache-control: max-age=60

In Chrome / Firefox DevTools: open Network tab → click a request → see Headers (request + response), Preview (rendered body), Timing (DNS / TCP / TLS / TTFB / content breakdown). For HTTP/2, the Protocol column shows h2; for HTTP/3 it shows h3.

| Symptom | Cause | Fix |
|---|---|---|
| API returns 200 but body contains an error message | Server is misusing status codes by stuffing errors into 200 | This is the server's bug; escalate to fix it. Clients must parse body defensively. |
| 405 Method Not Allowed | Sending POST to a GET-only endpoint, or vice versa | Check the API docs; check the Allow response header which lists accepted methods |
| 400 Bad Request with no useful body | Malformed JSON, missing required field, wrong content-type header | Add -H "Content-Type: application/json"; pretty-print your JSON and validate it; re-read the request body spec |
| 413 Payload Too Large | Request body exceeds server's limit | Reduce payload size; use multipart upload for large files |
| 503 Service Unavailable | Server overloaded or down; often includes a Retry-After header | Respect Retry-After; implement exponential backoff |
| Response truncated mid-body | Content-Length mismatch, proxy cutting connection, chunked encoding error | curl -v → check if Content-Length matches actual bytes received; try with --no-buffer |
| "works in HTTP/1.1 but fails in HTTP/2" | Server-side HTTP/2 bug; header case sensitivity (HTTP/2 lowercases all headers) | Try curl --http1.1 to confirm version sensitivity; check for uppercase header names in server code |

Debug checklist:

  1. Read the status code family first: 2xx = success, 4xx = your mistake, 5xx = server's mistake.
  2. Add -v to curl and read the response headers before assuming a body parse issue.
  3. Check Content-Type on both sides — sending JSON without Content-Type: application/json often causes 400.
  4. Use -w "%{http_code}" in scripts to capture the status code without parsing the body.
  5. For timing issues: DevTools → Network → Timing tab shows TTFB (Time To First Byte); a high TTFB is server-side, a high "Content Download" is body size / bandwidth.
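
The 503 row in the table above recommends respecting Retry-After and backing off exponentially. A minimal Python sketch of that delay calculation follows. `retry_delay` and its `base` and `cap` defaults are assumptions for illustration; only the seconds form of Retry-After is handled, not the HTTP-date form.

```python
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 0.5,
    cap: float = 30.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    If the server sent Retry-After as a whole number of seconds, use it.
    Otherwise double the delay on each attempt, up to `cap`.
    """
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    return min(cap, base * 2**attempt)


for attempt in range(5):
    print(attempt, retry_delay(attempt))
```

Production clients usually add random jitter to the computed delay so that many clients do not retry at the same instant; it is left out here to keep the output deterministic.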

By the numbers

Concrete scenario: a browser loads a page that needs 30 sub-resources (JS, CSS, images). Round-trip time (RTT) to the server is 50 ms. How long does the browser wait just for HTTP transport, ignoring server processing time?

The governing formula — serial rounds vs. parallel fetch

HTTP/1.1 (6 parallel connections):
  rounds    = ceil(N / max_conns) = ceil(30 / 6) = 5 serial rounds
  transport = rounds × RTT = 5 × 50 ms = 250 ms

HTTP/2 (1 connection, all streams in parallel):
  rounds    = 1 (all 30 streams sent together)
  transport = 1 × RTT = 1 × 50 ms = 50 ms

# Browsers conventionally cap HTTP/1.1 at about 6 connections per host; RFC 7230 §6.4 advises limiting concurrency but does not set a number.
# HTTP/2 multiplexes all N requests on one connection in one round trip.
# Assumes assets are independent (no dependency chain).

(RFC 7230 §6.3 — persistent connections and pipelining · RFC 9113 §5 — HTTP/2 streams and multiplexing)

Comparison table: HTTP/1.1 vs HTTP/2 as asset count grows

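| Assets (N) | HTTP/1.1 rounds (⌈N/6⌉) | HTTP/1.1 transport (ms) | HTTP/2 rounds | HTTP/2 transport (ms) | Speedup |
|---|---|---|---|---|---|
| 6 | 1 | 50 | 1 | 50 | 1.0× (tie, one round each) |
| 12 | 2 | 100 | 1 | 50 | 2.0× |
| 30 | 5 | 250 | 1 | 50 | 5.0× |
| 60 | 10 | 500 | 1 | 50 | 10.0× |
| 100 | 17 | 850 | 1 | 50 | 17.0× |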

The HTTP/1.1 transport cost grows by one RTT for every six additional assets; in this model HTTP/2 stays flat at one RTT regardless of N. The model counts request rounds only: in practice the TCP congestion window limits how much data fits in the first flight, so large responses need extra round trips under either version. Request queuing is exactly the problem HTTP/2 was designed to solve: in the model, a 30-asset page loads in one round trip instead of five.
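
The table can be reproduced with a few lines of Python. This is a sketch of the simplified round-counting model only; the function names are hypothetical, and the 6-connection cap and 50 ms RTT are the scenario's assumptions.

```python
import math


def http1_rounds(n_assets: int, max_conns: int = 6) -> int:
    """Serial rounds needed when at most max_conns requests run in parallel."""
    return math.ceil(n_assets / max_conns)


def transport_ms(rounds: int, rtt_ms: int) -> int:
    """Transport time in the model: one RTT per round."""
    return rounds * rtt_ms


RTT_MS = 50
for n in (6, 12, 30, 60, 100):
    h1 = transport_ms(http1_rounds(n), RTT_MS)
    h2 = transport_ms(1, RTT_MS)  # HTTP/2: all streams in one round
    print(f"N={n:3}  HTTP/1.1={h1:3} ms  HTTP/2={h2} ms  speedup={h1 / h2:.1f}x")
```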

Worked trace — browser loading api.example.com/page at 08:30:00 UTC

RTT = 50 ms. Page needs: 1 HTML doc + 6 CSS files + 10 JS files + 13 images = 30 assets total.

| t (ms) | HTTP/1.1 event | HTTP/2 event |
|---|---|---|
| 0 | TCP + TLS handshake (1 RTT TCP + 1 RTT TLS 1.3 = 100 ms total) | TCP + TLS handshake, same 100 ms |
| 100 | Fetch HTML (round 1, 1 asset on 1 conn) → arrives at 150 ms | Fetch HTML on stream 1 → arrives at 150 ms |
| 150 | Browser parses HTML; opens 5 more conns in parallel (100 ms handshake each) → ready at 250 ms | Browser sends all 29 remaining streams on same conn |
| 200 | Extra connections still handshaking | All 29 assets arrive. Page fully loaded, total: 200 ms |
| 250 | Round 2: 6 assets fetched; 23 remain → arrives at 300 ms | |
| 300 | Round 3: 6 more; 17 remain → arrives at 350 ms | |
| 350 | Round 4: 6 more; 11 remain → arrives at 400 ms | |
| 400 | Round 5: 6 more; 5 remain → arrives at 450 ms | |
| 450 | Round 6: 5 remaining → arrives at 500 ms | |
| 500 | Page fully loaded, total: 500 ms | |

Net result: HTTP/1.1 takes 500 ms; HTTP/2 takes 200 ms, a 2.5× speedup on a 30-asset page at 50 ms RTT. (The simpler formula above gives 250 ms vs 50 ms because it ignores the handshakes and the HTML-first dependency: both versions pay the initial 100 ms handshake and the 50 ms HTML fetch, and HTTP/1.1 pays another 100 ms to open its extra connections.)
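
The timeline can also be computed rather than traced by hand. The sketch below encodes the trace's assumptions: a 2-RTT handshake, the HTML fetched first on its own, and for HTTP/1.1 a second handshake to open the extra connections before the remaining 29 assets are fetched six at a time, which takes ceil(29 / 6) = 5 rounds. `page_load_ms` is a hypothetical helper.

```python
import math


def page_load_ms(
    n_assets: int,
    rtt_ms: int,
    multiplexed: bool,
    max_conns: int = 6,
    handshake_rtts: int = 2,
) -> int:
    """Total transport time for a page whose HTML must arrive first."""
    handshake = handshake_rtts * rtt_ms
    t = handshake + rtt_ms  # first connection, then the HTML document
    remaining = n_assets - 1
    if remaining <= 0:
        return t
    if multiplexed:
        # HTTP/2: every remaining asset goes out on the existing connection.
        return t + rtt_ms
    # HTTP/1.1: open the extra connections in parallel, then fetch in rounds.
    t += handshake
    return t + math.ceil(remaining / max_conns) * rtt_ms


h1 = page_load_ms(30, 50, multiplexed=False)
h2 = page_load_ms(30, 50, multiplexed=True)
print(f"HTTP/1.1: {h1} ms, HTTP/2: {h2} ms, speedup: {h1 / h2:.2f}x")
```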

Decision math — when does multiplexing actually matter?

# HTTP/2 wins when the number of assets exceeds the parallel connection limit:
#   HTTP/1.1 rounds = ceil(N / 6); HTTP/2 rounds = 1 (always).
#   Break-even: HTTP/2 is only faster when ceil(N / 6) > 1, i.e. N > 6.

# At N = 6:   HTTP/1.1 = 1 round × RTT = HTTP/2. No gain.
# At N = 7:   HTTP/1.1 = 2 rounds.  HTTP/2 saves 1 RTT   = 50 ms.
# At N = 30:  HTTP/1.1 = 5 rounds.  HTTP/2 saves 4 RTTs  = 200 ms.
# At N = 100: HTTP/1.1 = 17 rounds. HTTP/2 saves 16 RTTs = 800 ms.

# The gain also scales with RTT:
#   At RTT = 10 ms (fast LAN), N=30: HTTP/2 saves 4 × 10 ms  = 40 ms  (small)
#   At RTT = 100 ms (mobile),  N=30: HTTP/2 saves 4 × 100 ms = 400 ms (huge)

# Rule of thumb:
#   Many small assets + high RTT → HTTP/2 wins decisively.
#   Few assets (≤ 6) OR low RTT (≤ 5 ms) → negligible difference.

This is why HTTP/2 is essentially mandatory for public APIs and web apps (where clients are global and RTTs vary) but less critical for internal service-to-service calls on a fast datacenter network with few concurrent requests per connection.

🧠 Quick check

1. Which method is both safe and idempotent?

GET only reads — it changes nothing (safe) and repeating it has the same effect (idempotent). That's exactly why GETs are cacheable and freely retryable.

2. A request asks for a user that doesn't exist. The best response is:

404 is the 4xx "client asked for something that isn't there" code. 200 hides the failure from caches and monitoring; 500 wrongly blames the server.

3. The main reason HTTP/3 moved onto QUIC/UDP was to:

HTTP/2 already multiplexed at the HTTP layer, but a single lost TCP packet still stalled every stream. QUIC keeps streams independent and shortens the handshake. Encryption and methods are unrelated to that choice.

✍️ Drill: design the request/response for "like a post"

Pick the method, path, status, and body for liking post 99 as user 7. Decide before reading the answer below.

POST /v1/posts/99/likes        # creating a "like" resource
201 Created
{ "post_id": 99, "liked": true }

DELETE /v1/posts/99/likes      # unlike (idempotent)
204 No Content

Rubric: ✓ POST to create, DELETE to remove (not a single toggle GET) ✓ models "like" as a resource under the post ✓ 201 on create, 204 on delete ✓ notes DELETE is idempotent so a retry is safe. Bonus: returning 409/200 if already liked.
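
One way to see why the retry is safe is a toy in-memory version of the two endpoints. This is a sketch under assumptions, not a reference design: `LikeStore` is hypothetical, the user ID is assumed to come from the authenticated caller, and the already-liked case returns 200, which is one of the two bonus options in the rubric.

```python
class LikeStore:
    """Toy in-memory model of POST and DELETE /v1/posts/{id}/likes."""

    def __init__(self) -> None:
        self.likes: set[tuple[int, int]] = set()  # (post_id, user_id)

    def like(self, post_id: int, user_id: int) -> tuple[int, dict]:
        key = (post_id, user_id)
        body = {"post_id": post_id, "liked": True}
        if key in self.likes:
            return 200, body  # already liked: nothing new was created
        self.likes.add(key)
        return 201, body

    def unlike(self, post_id: int, user_id: int) -> tuple[int, None]:
        # discard() is a no-op when the like is absent, so repeating the
        # DELETE leaves the same state and returns the same status.
        self.likes.discard((post_id, user_id))
        return 204, None


store = LikeStore()
print(store.like(99, 7))    # first like creates the resource
print(store.unlike(99, 7))  # unlike removes it
print(store.unlike(99, 7))  # retrying the unlike changes nothing
```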

Key takeaways

Sources & further reading