Architectural Styles · Lesson 03

RESTful APIs in practice

Knowing REST's constraints is the theory; shipping an API that callers love requires a second layer of craft — good resource names, the right status codes, sane pagination, and an error shape that doesn't make clients guess.

⏱ 14 min Difficulty: core Prereq: REST style (as-02)

By the end you'll be able to

Locate any API on the Richardson Maturity Model and describe what it would take to move it one level up.
Apply resource naming rules (plural nouns, nesting depth, query params for filters).
Choose between cursor-based and offset-based pagination and explain the trade-offs.

Richardson Maturity Model — where does your API sit?

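Leonard Richardson described a four-level ladder (L0 to L3) for measuring how "RESTful" an HTTP API really is: L0 tunnels every call through a single endpoint, L1 introduces separate resource URIs, L2 adds proper HTTP verbs and status codes, and L3 adds hypermedia links in responses. It's a useful diagnostic tool, not a mandate; many successful APIs live happily at L2.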

The Richardson Maturity Model. L2 is the practical target for most public APIs. L3 is conceptually pure but operationally rare.
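
The ladder is cumulative, which makes it easy to express as a checklist. The sketch below is only an illustration of that idea: the function name and its three boolean inputs are assumptions made for this example, not part of Richardson's model.

```python
def richardson_level(uses_resource_uris: bool,
                     uses_http_verbs_and_status: bool,
                     uses_hypermedia_links: bool) -> int:
    """Return the highest level (0-3) whose requirements are all met.

    Each level builds on the one below, so the first missing
    capability decides where the API sits.
    """
    if not uses_resource_uris:
        return 0  # one endpoint, actions tunnelled through the body
    if not uses_http_verbs_and_status:
        return 1  # many URIs, but one verb used for everything
    if not uses_hypermedia_links:
        return 2  # resources + verbs + status codes
    return 3      # responses also link to the next possible actions


# A typical public JSON API: resources and verbs, no hypermedia.
print(richardson_level(True, True, False))
```

Moving "one level up" then means turning the first False argument into True.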

Resource naming rules

Good resource names are like good street addresses — they tell you where something lives without ambiguity. Four rules cover the vast majority of real cases:

Use plural nouns for collections: /users, /orders, /products. A specific item is the collection path plus an identifier: /users/42.
Nest to show ownership, not hierarchy. /users/42/addresses is fine; /users/42/orders/7/items/3/reviews is too deep — flatten it to /reviews/XYZ if the sub-resource has its own identity.
Keep to two nesting levels max. Beyond /parent/:id/child things get unwieldy. If you need deeper, it's a sign the sub-resource deserves its own top-level collection.
Filters, sorts, and searches go in query params, not in the path. /products?category=books&sort=price_asc — not /products/category/books/sort/price_asc.
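
Two of these rules are mechanical enough to sketch in code. The helpers below are hypothetical (collection_url and nesting_depth are names invented here), and nesting_depth assumes paths strictly alternate collection and identifier segments.

```python
from urllib.parse import urlencode


def collection_url(path: str, filters: dict[str, str]) -> str:
    """Put filters and sorts in the query string; leave the path alone."""
    return f"{path}?{urlencode(filters)}" if filters else path


def nesting_depth(path: str) -> int:
    """Count collection segments, assuming /collection/id/collection/id/..."""
    segments = [s for s in path.strip("/").split("/") if s]
    return (len(segments) + 1) // 2


print(collection_url("/products", {"category": "books", "sort": "price_asc"}))
print(nesting_depth("/users/42/addresses"))                # within the limit
print(nesting_depth("/users/42/orders/7/items/3/reviews")) # too deep
```

A check like nesting_depth(path) <= 2 could run in a route-registration test to catch paths that have grown too deep.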

HTTP status codes — get them right

Status codes are the API's way of speaking the HTTP language. Returning 200 OK for "resource not found" forces clients to parse the body to detect errors — a classic leaky abstraction.

Code	Meaning	When to use
`200`	OK	Successful GET, PATCH, PUT (with body)
`201`	Created	Successful POST that created a resource; include `Location` header
`204`	No Content	Successful DELETE or PUT (no body to return)
`400`	Bad Request	Invalid input — tell the client exactly what was wrong
`401`	Unauthorized	Missing or invalid credentials (authentication problem)
`403`	Forbidden	Authenticated but not permitted (authorization problem)
`404`	Not Found	Resource doesn't exist
`409`	Conflict	State conflict (duplicate email, optimistic lock clash)
`422`	Unprocessable Entity	Semantically invalid request (business rule violation)
`429`	Too Many Requests	Rate limiting — include `Retry-After`
`500`	Internal Server Error	Unexpected server failure — never expose stack traces
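
One way to keep status codes consistent is to decide them in a single place. The following is a minimal sketch, assuming the application raises its own exception types; the class names and the status_for helper are made up for illustration and belong to no particular framework.

```python
class ValidationError(Exception):
    """Request was well-formed but broke a business rule."""


class ForbiddenError(Exception):
    """Caller is authenticated but not allowed to do this."""


class NotFoundError(Exception):
    """The addressed resource does not exist."""


class ConflictError(Exception):
    """Request clashes with current state (duplicate, stale version)."""


# Single source of truth: exception type -> HTTP status code.
ERROR_STATUS = {
    ValidationError: 422,
    ForbiddenError: 403,
    NotFoundError: 404,
    ConflictError: 409,
}


def status_for(exc: Exception) -> int:
    """Map a raised exception to a status code; unknown errors become 500."""
    for exc_type, status in ERROR_STATUS.items():
        if isinstance(exc, exc_type):
            return status
    return 500


print(status_for(NotFoundError("no such article")))
print(status_for(RuntimeError("unexpected")))
```

Handlers then raise domain errors and never pick numeric codes themselves, which makes "200 with an error body" hard to write by accident.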

Pagination: cursor vs offset

No one returns all 10 million records at once. Pagination splits large collections into pages. Two dominant approaches:

Aspect	Offset pagination	Cursor pagination
URL pattern	`?page=3&per_page=25` or `?offset=50&limit=25`	`?after=cursor_abc123&limit=25`
How it works	Skip N rows in the DB query	Query WHERE id > last_seen_id
Performance at scale	Degrades — large OFFSETs scan many rows	Stable — indexed seek, constant cost
Consistent pages	Drifts if rows are inserted/deleted mid-read	Stable — cursor anchors to a specific row
Random access	Yes — can jump to page 7	No — must walk forward from last cursor
Best for	Small datasets, admin UIs, "page 3 of 5" UX	Infinite scroll, high-volume feeds, real-time data

Consistent error shape

Every error response should have the same structure so clients can handle errors generically without reading documentation for each endpoint. A battle-tested shape inspired by RFC 7807 (Problem Details for HTTP APIs, since obsoleted by RFC 9457), with an errors array added as an extension member:

HTTP/1.1 422 Unprocessable Entity
Content-Type: application/problem+json

{
  "type":   "https://api.example.com/errors/validation-failed",
  "title":  "Validation failed",
  "status": 422,
  "detail": "The request body contains invalid fields.",
  "errors": [
    { "field": "email",    "msg": "must be a valid email address" },
    { "field": "birth_year", "msg": "must be between 1900 and 2025"    }
  ]
}

A clean endpoint set — worked example

# Collection of articles
GET    /articles                          # list (with ?page= or ?after=)
POST   /articles                          # create → 201 + Location

# Single article
GET    /articles/slug-of-article           # read → 200
PATCH  /articles/slug-of-article           # partial update → 200
DELETE /articles/slug-of-article           # remove → 204

# Sub-resource (comments belong to an article)
GET    /articles/slug-of-article/comments  # list comments
POST   /articles/slug-of-article/comments  # add comment

# Filtering + pagination via query params
GET    /articles?author=42&sort=newest&after=cur_xyz

🎯 Interview angle

If asked "how would you paginate a large list endpoint?", don't just say "use page numbers." Walk through the trade-off: offset pagination is simple but degrades at high pages; cursor pagination is stable and scales. Then say which you'd pick for the described use-case and why. Mentioning the drifting-pages problem (records inserted mid-read causing items to be skipped or duplicated) shows you've thought past the basics.

⚠️ Common trap — chatty N+1 designs

A chatty API forces clients to make many small requests to assemble a single screen. The classic N+1 pattern: one call to list 20 orders, then 20 individual calls to fetch each order's customer. Design responses to carry enough related data for the common use-case. Use sparse fieldsets (?fields=id,name) or include parameters (?include=customer) to let callers request related data in one trip without always over-fetching.
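
A rough sketch of how ?include= and ?fields= might be honoured server-side follows. The in-memory ORDERS and CUSTOMERS data and the list_orders function are invented for illustration; a real service would resolve the related rows with a join or a batched lookup rather than one query per order.

```python
CUSTOMERS = {
    7: {"id": 7, "name": "Ada"},
    8: {"id": 8, "name": "Grace"},
}
ORDERS = [
    {"id": 1, "customer_id": 7, "total": 30},
    {"id": 2, "customer_id": 8, "total": 45},
]


def list_orders(include: str = "", fields: str = "") -> list[dict]:
    """Return orders, optionally embedding customers and trimming fields."""
    includes = {name for name in include.split(",") if name}
    wanted = {name for name in fields.split(",") if name}
    result = []
    for order in ORDERS:
        item = dict(order)
        if "customer" in includes:
            # Embed the related resource so the client needs no second call.
            item["customer"] = CUSTOMERS[order["customer_id"]]
        if wanted:
            # Sparse fieldset: keep only what the caller asked for.
            item = {k: v for k, v in item.items() if k in wanted}
        result.append(item)
    return result


print(list_orders(include="customer"))  # one trip instead of 1 + N
print(list_orders(fields="id,total"))   # trimmed payload
```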

✅ Do this, not that

Do pick one error shape for your entire API and document it on the first page. Clients write error-handling code once and it works everywhere. Don't have some endpoints return {"message":"Not found"}, others return {"error":{"code":404}}, and others return a plain string — it forces clients to write fragile, endpoint-specific error parsers.

Under the hood: how pagination actually works

Offset pagination — the SQL it generates and why it degrades

When a client requests ?page=3&per_page=25, the server translates this to OFFSET 50 LIMIT 25 in SQL. The database must scan and discard the first 50 rows before returning yours. At page 1 this is negligible; at page 1000 (offset 24975) the database scans ~25,000 rows just to throw them away. This is called the "deep offset problem."

-- Request: GET /articles?page=3&per_page=25
-- SQL (offset = (page - 1) * per_page):
SELECT * FROM articles ORDER BY created_at DESC LIMIT 25 OFFSET 50;

-- Pages 1–5: fast (small offset)
-- Page 1000:
SELECT * FROM articles ORDER BY created_at DESC LIMIT 25 OFFSET 24975;
-- DB reads 25,000 rows to return 25

-- Explain plan concept:
-- OFFSET 24975 → index scan reads and discards 24,975 rows → returns 25
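
The page-to-offset arithmetic is worth spelling out once. This sketch assumes 1-based page numbers, as in the request above; the helper names are hypothetical.

```python
def page_to_offset(page: int, per_page: int) -> int:
    """Translate a 1-based page number into a SQL OFFSET."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    return (page - 1) * per_page


def rows_read(page: int, per_page: int) -> int:
    """Rows the database walks: the skipped ones plus the returned page."""
    return page_to_offset(page, per_page) + per_page


print(page_to_offset(3, 25))     # offset for page 3
print(page_to_offset(1000, 25))  # offset for page 1000
print(rows_read(1000, 25))       # rows walked to serve page 1000
```

The work grows linearly with the page number even though the response size never changes.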

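Inserts and deletes also cause page drift. If a new article is inserted at the head of the list between the page 1 and page 2 requests, every later row shifts down by one, so page 2 repeats the last item of page 1. If an article from page 1 is deleted instead, rows shift up by one and the item that should have opened page 2 is skipped entirely.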
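
Drift is easy to reproduce without a database. The simulation below uses a plain list of article ids, newest first, as a stand-in for the table; the function names are invented for this example.

```python
def fetch_page(rows: list[int], page: int, per_page: int) -> list[int]:
    """Offset pagination over an in-memory list."""
    start = (page - 1) * per_page
    return rows[start:start + per_page]


def insert_drift() -> set[int]:
    """A new row arrives between two requests: one item is seen twice."""
    articles = [105, 104, 103, 102, 101, 100]  # newest first
    page1 = fetch_page(articles, 1, 3)
    articles.insert(0, 106)                    # new article published
    page2 = fetch_page(articles, 2, 3)
    return set(page1) & set(page2)             # ids served on both pages


def delete_drift() -> set[int]:
    """A row is deleted between two requests: one item is never seen."""
    articles = [105, 104, 103, 102, 101, 100]
    original = set(articles)
    page1 = fetch_page(articles, 1, 3)
    articles.remove(104)                       # article deleted
    page2 = fetch_page(articles, 2, 3)
    return original - set(page1) - set(page2)  # ids the client missed


print(insert_drift())  # the duplicated id
print(delete_drift())  # the skipped id
```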

Cursor pagination — the actual encoding and SQL

A cursor is an opaque token the server issues. The client passes it back verbatim. The server decodes it to know where to continue from. A common encoding is base64url of the last-seen primary key (or composite sort key).

# 1. Client requests first page:
GET /articles?limit=25

# 2. Server returns 25 articles. Last one has id=1042, created_at=2024-03-15T10:30:00Z.
#    Server encodes the cursor:
cursor_data = {"id": 1042, "created_at": "2024-03-15T10:30:00Z"}
cursor      = base64url(JSON.stringify(cursor_data))
            = "eyJpZCI6MTA0MiwiY3JlYXRlZF9hdCI6IjIwMjQtMDMtMTVUMTA6MzA6MDBaIn0"
# Response: {"data": [...], "next_cursor": "eyJpZCI6MTA0Mi..."}

# 3. Client requests next page:
GET /articles?after=eyJpZCI6MTA0Mi...&limit=25

# 4. Server decodes cursor and generates a keyset seek:
SELECT * FROM articles
WHERE  (created_at, id) < ('2024-03-15T10:30:00Z', 1042)
ORDER BY created_at DESC, id DESC
LIMIT  25;
-- Index seek directly — constant cost regardless of "page depth"
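
The keyset seek can be tried end to end with SQLite from the Python standard library. This is a sketch under a few assumptions: the table layout and sample rows are invented, timestamps are stored as ISO 8601 strings in one fixed format (so string comparison matches time order), and the row-value comparison is written in its expanded form so it runs on older SQLite builds too.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create ten sample articles; several share a created_at value."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)")
    db.execute("CREATE INDEX idx_articles_created_id ON articles (created_at, id)")
    rows = [(i, f"2024-03-{1 + i // 2:02d}T10:30:00Z") for i in range(1, 11)]
    db.executemany("INSERT INTO articles VALUES (?, ?)", rows)
    return db


def next_page(db, limit, after=None):
    """Return up to `limit` (id, created_at) rows after the (created_at, id) anchor."""
    if after is None:
        sql = ("SELECT id, created_at FROM articles "
               "ORDER BY created_at DESC, id DESC LIMIT ?")
        params = (limit,)
    else:
        created_at, last_id = after
        # Expanded form of: (created_at, id) < (?, ?)
        sql = ("SELECT id, created_at FROM articles "
               "WHERE created_at < ? OR (created_at = ? AND id < ?) "
               "ORDER BY created_at DESC, id DESC LIMIT ?")
        params = (created_at, created_at, last_id, limit)
    return db.execute(sql, params).fetchall()


db = make_db()
page1 = next_page(db, 2)
last_id, last_created = page1[-1]
page2 = next_page(db, 2, after=(last_created, last_id))
print([row[0] for row in page1], [row[0] for row in page2])
```

The id tiebreaker matters here: the last row of page 1 and the first row of page 2 share a timestamp, and without id in the comparison one of them would be skipped or repeated.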

The cursor is opaque to the client — they cannot construct a cursor for an arbitrary position, which is why cursor pagination cannot support "jump to page 7." The base64 encoding is not for security; it simply discourages clients from parsing or constructing cursors manually. Some APIs use an HMAC-signed cursor for stronger enforcement.
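
The encoding step from the walkthrough can be written in a few lines of Python. This is a minimal sketch assuming compact JSON (no spaces) and unpadded base64url, which reproduces the cursor string shown earlier; encode_cursor and decode_cursor are names chosen for this example.

```python
import base64
import json


def encode_cursor(data: dict) -> str:
    """Serialise the sort key as compact JSON, then base64url without padding."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_cursor(cursor: str) -> dict:
    """Restore the stripped padding, then reverse the encoding."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


cursor = encode_cursor({"id": 1042, "created_at": "2024-03-15T10:30:00Z"})
print(cursor)
print(decode_cursor(cursor))
```

Only the server should ever call decode_cursor; to the client the value is just a string to echo back.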

The Link header (RFC 8288) for pagination

The HTTP Link header carries URL references with semantic relations; it is defined in RFC 8288, which obsoletes RFC 5988. For pagination the standard rels are next, prev, first, and last. Clients that understand Link need not know your API's query parameter names.

HTTP/1.1 200 OK
Content-Type: application/json
Link: <https://api.example.com/articles?after=eyJpZCI6MTA0Mi...&limit=25>; rel="next",
      <https://api.example.com/articles?before=eyJpZCI6OTk5...&limit=25>;  rel="prev",
      <https://api.example.com/articles?limit=25>;                          rel="first"
X-Total-Count: 1247

{"data": [...25 articles...], "next_cursor": "eyJpZCI6MTA0Mi..."}

X-Total-Count is a common convention (not a standard) for communicating the total number of items. Cursor-paginated APIs often omit a total because an exact count can require scanning the whole table; only provide it when the UI genuinely needs it.
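
On the client side, following Link means parsing the header rather than assembling URLs. The parser below is a deliberately small sketch: it assumes each entry has the simple form <url>; rel="name" with a single quoted rel, and ignores other link parameters that the full grammar allows.

```python
import re

# Matches one entry of the form: <url>; rel="name"
_LINK_ENTRY = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value: str) -> dict[str, str]:
    """Return a {rel: url} mapping for a simple Link header value."""
    return {rel: url for url, rel in _LINK_ENTRY.findall(value)}


header = (
    '<https://api.example.com/articles?after=abc&limit=25>; rel="next", '
    '<https://api.example.com/articles?limit=25>;  rel="first"'
)
links = parse_link_header(header)
print(links.get("next"))  # URL to request next
print(links.get("last"))  # None: this server did not advertise it
```

A client loop then becomes "request links['next'] until it is absent", with no knowledge of after, page, or limit.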

Idempotency in REST — what it really means for retries

Idempotency means calling an operation once or N times produces the same server state. This is what makes retries safe. GET, PUT, and DELETE are idempotent. POST is not — retrying a POST that timed out may create a duplicate.

# Client sends a POST to create an order:
POST /orders HTTP/1.1
Host: api.example.com
Authorization: Bearer tok_...
Idempotency-Key: 7f3d9c2a-4b1e-48f0-b8c3-2d5e7a9f1b0c  ← UUID the client generates

{"product_id": "prod_abc", "quantity": 2}

← HTTP/1.1 201 Created
← Location: /orders/ord_888

# Network drops. Client retries with the SAME Idempotency-Key:
POST /orders HTTP/1.1
Idempotency-Key: 7f3d9c2a-4b1e-48f0-b8c3-2d5e7a9f1b0c  ← same key

← HTTP/1.1 200 OK          ← 200 not 201 — server recognises it
← Location: /orders/ord_888 ← same resource, no duplicate created

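This pattern is used by Stripe for payments, Twilio for SMS, and most financial APIs where duplicate operations are costly. The server stores (key → result) in a cache or database. Replay behaviour varies: the exchange above answers a replay with 200, whereas Stripe returns the stored status code and body of the first response. Stripe recommends the client generate a UUID per "intent," not per attempt.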
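
The server side of that exchange can be sketched with a dictionary. Everything here is hypothetical (the OrderService class, the ord_N id format), and the sketch leaves out things a production version needs: key expiry, concurrent requests racing on the same key, and rejecting a reused key that arrives with a different body.

```python
class OrderService:
    """In-memory stand-in for an order API with idempotency keys."""

    def __init__(self) -> None:
        self.orders: dict[str, dict] = {}
        self.location_by_key: dict[str, str] = {}

    def create_order(self, idempotency_key: str, payload: dict) -> tuple[int, str]:
        """Return (status, Location). A replayed key never creates a second order."""
        if idempotency_key in self.location_by_key:
            # Seen before: hand back the stored result instead of re-executing.
            return 200, self.location_by_key[idempotency_key]
        order_id = f"ord_{len(self.orders) + 1}"
        self.orders[order_id] = payload
        location = f"/orders/{order_id}"
        self.location_by_key[idempotency_key] = location
        return 201, location


service = OrderService()
key = "7f3d9c2a-4b1e-48f0-b8c3-2d5e7a9f1b0c"
body = {"product_id": "prod_abc", "quantity": 2}
print(service.create_order(key, body))  # first attempt
print(service.create_order(key, body))  # retry after a timeout
print(len(service.orders))              # still one order
```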

Offset pagination scans and discards every prior row before reaching the requested page. Cursor pagination performs a keyset seek directly to the anchor row — cost is constant regardless of depth.

⚠️ Pitfall — never expose cursor internals

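If clients can decode your cursor and see {"id": 1042}, some will start constructing cursors manually or relying on the id value in their own logic. Base64 is not encryption, and an HMAC signature only stops clients from forging or altering a cursor; the payload stays readable. If clients must not be able to infer the underlying sort key, encrypt the cursor payload or hand out a random token that the server maps to the stored position.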

How to debug & inspect it

Pagination bugs are among the most common REST API issues — duplicate items, missing items, incorrect total counts, and unexpectedly slow requests at high page numbers. The tools that expose them are curl, the Link header, and database slow-query logs.

# 1. Inspect the Link header and cursor in a paginated response:
$ curl -si "https://api.example.com/articles?limit=5" | grep -E "^(Link|X-Total|HTTP)"
HTTP/2 200
Link: <https://api.example.com/articles?after=eyJpZCI6OTk2...>; rel="next"
X-Total-Count: 1247

# 2. Walk two pages and check for duplicate items (a drift test):
$ curl -s "https://api.example.com/articles?limit=3" | jq '[.data[].id]'
[100, 99, 98]
$ curl -s "https://api.example.com/articles?after=CURSOR&limit=3" | jq '[.data[].id]'
[97, 96, 95]   ← correct; if you saw [98, 97, 96] that is page drift

# 3. Test deep-offset performance with time:
$ time curl -s "https://api.example.com/articles?page=1&per_page=25" > /dev/null
real 0m0.042s
$ time curl -s "https://api.example.com/articles?page=1000&per_page=25" > /dev/null
real 0m2.341s   ← 55× slower: deep offset problem confirmed

Symptom	Cause	Fix
Items appear on two consecutive pages	Offset drift — a new item was inserted between the two requests, shifting the window	Switch to cursor pagination; cursors anchor to a specific row regardless of inserts
An item is missing from the paginated results entirely	Offset drift in the other direction: an item on an earlier page was deleted between requests, so later rows shifted up and one was skipped	Cursor pagination; or accept minor inconsistency and document it
Late pages are 10–100× slower than early pages	Deep offset — database scanning and discarding all prior rows before each page	Migrate to keyset/cursor pagination; add a composite index on `(sort_col, id)`
`next_cursor` in response is the same as the current cursor	Server bug — cursor not advancing; infinite loop risk for clients following `Link: next`	Verify the cursor encodes the last item of the current page, not the first
Client gets `400` on a valid-looking cursor	Cursor expired, client modified the base64 value, or schema change made the cursor invalid	Treat cursors as ephemeral (add a TTL); return a helpful error message; never parse cursors client-side
POST without `Idempotency-Key` creates duplicates on retry	Network timeout caused client to retry; server treated each request as new	Add `Idempotency-Key` support server-side; document its use for all state-changing POST operations

Debug checklist:

Check for a Link header in paginated responses — it should contain at minimum rel="next". If absent, clients must construct URLs manually, creating coupling.
Walk two pages and diff the IDs — any ID overlap or gap indicates offset drift.
Benchmark page 1 vs page 100 vs page 1000 response times. If they grow linearly you have a deep offset problem.
For cursor APIs: confirm the cursor advances each page (page N cursor !== page N+1 cursor). Confirm the client uses Link: next instead of constructing cursors.
For POST with Idempotency-Key: replay the exact request with the same key and verify the server returns the same result without creating a duplicate resource.
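
Several of these checks can be automated in a test harness. The walker below is a sketch that assumes a response body shaped like {"data": [{"id": ...}], "next_cursor": "..."} with an empty or missing next_cursor on the last page; walk_pages and fake_fetch are hypothetical names, and fake_fetch stands in for a real HTTP call.

```python
from typing import Callable, Optional


def walk_pages(fetch: Callable[[Optional[str]], dict], max_pages: int = 1000) -> list:
    """Follow next_cursor to the end, failing fast on duplicates or a stuck cursor."""
    seen_ids: list = []
    seen_cursors: set = set()
    cursor: Optional[str] = None
    for _ in range(max_pages):
        page = fetch(cursor)
        for item in page["data"]:
            if item["id"] in seen_ids:
                raise RuntimeError(f"duplicate id {item['id']}: pages overlap")
            seen_ids.append(item["id"])
        next_cursor = page.get("next_cursor") or None
        if next_cursor is None:
            return seen_ids  # collection exhausted
        if next_cursor in seen_cursors:
            raise RuntimeError("cursor did not advance")
        seen_cursors.add(next_cursor)
        cursor = next_cursor
    raise RuntimeError("max_pages exceeded")


def fake_fetch(cursor: Optional[str], limit: int = 3) -> dict:
    """Serve ids 10..1; the cursor is simply the last id of the previous page."""
    ids = list(range(10, 0, -1))
    start = 0 if cursor is None else ids.index(int(cursor)) + 1
    chunk = ids[start:start + limit]
    has_more = start + limit < len(ids)
    return {"data": [{"id": i} for i in chunk],
            "next_cursor": str(chunk[-1]) if has_more else ""}


print(walk_pages(fake_fetch))
```

The max_pages bound is the safety net: a client that trusts next_cursor blindly can loop forever against a server with the non-advancing cursor bug from the table above.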

In production: how leading APIs do it

Stripe, GitHub, and Slack each paginate large collections, but they made different design choices that reflect the REST maturity level they were aiming for and the access patterns their consumers actually exhibit.

Provider	Pagination style	Parameters	Navigation signal	REST angle	Docs
Stripe	Cursor (object-ID based)	`limit` + `starting_after` or `ending_before` (an object id)	`has_more: true/false` in the response body	L2 with deliberate cursor design; no Link header; cursor is a plain object ID (not encoded), making it easy to inspect but coupling clients to the id field	Stripe API pagination
GitHub	Offset (`page` + `per_page`)	`page` (1-based) + `per_page` (max 100)	RFC 8288 (formerly RFC 5988) `Link` header with `rel="next"`, `rel="prev"`, `rel="first"`, `rel="last"`	Closest to L3 in this comparison: hypermedia links in the response let clients navigate without constructing URLs; offset is simpler for repo browsing where random-page access matters	GitHub REST pagination
Slack	Cursor (opaque token)	`cursor` + optional `limit`	`response_metadata.next_cursor` in the response body; empty string when exhausted	L2; cursor is a fully opaque token (not an object id), matching the ideal described in this lesson — clients can never decode or construct it	Slack API pagination

Deep dive: why cursors win for large sets — and when offset is still the right call

Stripe's starting_after and Slack's next_cursor are both cursor-based, but they expose different levels of abstraction to callers. Stripe's cursor is a plain object ID: you pass starting_after=ch_3NkL... and the server continues the list from the object after that one, conceptually a keyset seek such as WHERE id > 'ch_3NkL...' LIMIT n (the real query is internal to Stripe). Clients can see the id and reason about it, but they are also coupled to it: if Stripe ever changed the sort key, the cursor format would break. Slack's cursor is an opaque token whose internal structure is undocumented; clients echo it back but cannot interpret it, which gives Slack the freedom to change internal sort keys without a breaking change.

GitHub's offset approach, page=3&per_page=30, is a deliberate concession to use cases where random-page access matters. Browsing repository issues where a user wants to jump to "page 5 of 12" requires offset semantics; a cursor-only API would force the client to walk the first four pages first. The Link header (RFC 8288, formerly RFC 5988) compensates for the navigability problem by giving clients machine-readable next/prev/first/last URLs so they do not need to construct page parameters manually. For access patterns that are genuinely sequential (infinite scroll, background sync, data export), cursor pagination is the option that stays fast as the dataset grows. For access patterns that require random-page access in a bounded, low-write dataset, offset with a Link header (the GitHub approach) remains a pragmatic choice.


🧠 Quick check

1. You create a new user successfully with POST /users. The correct HTTP status is:

201 Created tells the client a resource was created. The Location header gives the URL of the new resource so the client can fetch it without guessing. 200 is for successful reads/updates; 204 is for no-body responses (typically DELETE).

2. An infinite-scroll feed at high volume. Which pagination strategy is better?

Cursor pagination is the right fit: constant-cost indexed seek, no duplicate/missing records when new items arrive, and it maps naturally onto "load more." Offset pagination degrades as the offset grows and produces inconsistent results in live feeds.

3. Where do filtering and sorting parameters belong?

Query parameters are the right vehicle for optional modifiers on a collection. Path segments identify resources; when category is a filter rather than a distinct resource, it belongs in the query string. GET bodies are technically permitted but widely unsupported.

4. An L0 API on the Richardson Maturity Model uses:

L0 is the swamp — one URI, one method (usually POST), action encoded in the body or query string. It uses HTTP as a transport tunnel, not as a semantic layer.

✍️ Exercise: critique and fix a real-world API design (try before opening)

A colleague proposes this API surface for an e-commerce platform. Identify as many issues as you can and propose fixes:

GET  /getProductList
POST /product/create
POST /product/delete?id=5
GET  /order?userId=10&action=fetchAll
POST /updateOrderStatus

Issues found:

Verbs in URLs (getProductList, create, delete, fetchAll, updateOrderStatus)
Inconsistent naming — singular /product vs plural implied by list
Using POST for a delete operation (ignores HTTP method semantics)
Action in a query param (action=fetchAll) instead of method

Fixed design:

GET    /products
POST   /products
DELETE /products/5
GET    /users/10/orders
PATCH  /orders/:id      (body: {"status": "shipped"})

Rubric: ✓ verbs removed from URLs ✓ plural nouns used ✓ DELETE used for deletion ✓ PATCH used for partial update ✓ status as a field in the body, not a URL action.

Key takeaways

The Richardson Maturity Model (L0–L3) is a practical ladder for measuring REST compliance; L2 is the common production target.
Resource names: plural nouns, max two nesting levels, filters in query params.
Status codes carry semantic meaning — use 201 Created, 204 No Content, 409 Conflict precisely.
Cursor pagination scales better than offset for large or live datasets.
A consistent error shape (RFC 7807 style) lets clients write error handling once.