Architectural Styles · Lesson 03
RESTful APIs in practice
Knowing REST's constraints is the theory; shipping an API that callers love requires a second layer of craft — good resource names, the right status codes, sane pagination, and an error shape that doesn't make clients guess.
By the end you'll be able to
- Locate any API on the Richardson Maturity Model and describe what it would take to move it one level up.
- Apply resource naming rules (plural nouns, nesting depth, query params for filters).
- Choose between cursor-based and offset-based pagination and explain the trade-offs.
Richardson Maturity Model — where does your API sit?
Leonard Richardson described a four-level ladder for measuring how "RESTful" an HTTP API really is. It's a useful diagnostic tool, not a mandate — many successful APIs live happily at L2.
Resource naming rules
Good resource names are like good street addresses — they tell you where something lives without ambiguity. Four rules cover the vast majority of real cases:
- Use plural nouns for collections: /users, /orders, /products. A specific item is addressed by its id under that collection: /users/42.
- Nest to show ownership, not hierarchy. /users/42/addresses is fine; /users/42/orders/7/items/3/reviews is too deep. Flatten it to /reviews/XYZ if the sub-resource has its own identity.
- Keep to two nesting levels max. Beyond /parent/:id/child things get unwieldy. If you need deeper, it's a sign the sub-resource deserves its own top-level collection.
- Filters, sorts, and searches go in query params, not in the path. Use /products?category=books&sort=price_asc, not /products/category/books/sort/price_asc.
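The two-level rule is easy to check mechanically. Here is a minimal sketch, assuming paths alternate collection, id, collection, id; the helper names nesting_depth and is_too_deep are made up for illustration.

```python
def nesting_depth(path: str) -> int:
    """Count the collection segments in a path such as /users/42/addresses.

    Assumes segments alternate collection, id, collection, id, ...
    Anything after '?' is a query string and is ignored.
    """
    segments = [s for s in path.split("?")[0].split("/") if s]
    # Collections sit at even positions (0, 2, 4, ...); ids sit between them.
    return len(segments[0::2])


def is_too_deep(path: str, max_levels: int = 2) -> bool:
    """True when the path nests more collections than the rule allows."""
    return nesting_depth(path) > max_levels
```

Run against the examples above, /users/42/addresses sits exactly at the limit, while /users/42/orders/7/items/3/reviews nests four collections and gets flagged.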
HTTP status codes — get them right
Status codes are the API's way of speaking the HTTP language. Returning 200 OK for "resource not found" forces clients to parse the body to detect errors — a classic leaky abstraction.
| Code | Meaning | When to use |
|---|---|---|
| 200 | OK | Successful GET, PATCH, PUT (with body) |
| 201 | Created | Successful POST that created a resource; include Location header |
| 204 | No Content | Successful DELETE or PUT (no body to return) |
| 400 | Bad Request | Invalid input; tell the client exactly what was wrong |
| 401 | Unauthorized | Missing or invalid credentials (authentication problem) |
| 403 | Forbidden | Authenticated but not permitted (authorization problem) |
| 404 | Not Found | Resource doesn't exist |
| 409 | Conflict | State conflict (duplicate email, optimistic lock clash) |
| 422 | Unprocessable Entity | Semantically invalid request (business rule violation) |
| 429 | Too Many Requests | Rate limiting; include Retry-After |
| 500 | Internal Server Error | Unexpected server failure; never expose stack traces |
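One way to keep status codes consistent is to translate domain errors to codes in a single place. The sketch below assumes a handful of hypothetical exception classes (InvalidInput, ResourceNotFound, and so on); your service will have its own. Anything unrecognised falls through to 500, so the details of unexpected failures stay out of the response.

```python
from http import HTTPStatus


# Hypothetical domain errors, defined here only for illustration.
class InvalidInput(Exception): pass
class NotAuthenticated(Exception): pass
class NotPermitted(Exception): pass
class ResourceNotFound(Exception): pass
class StateConflict(Exception): pass


# One table that every endpoint shares.
ERROR_STATUS = [
    (InvalidInput, HTTPStatus.BAD_REQUEST),
    (NotAuthenticated, HTTPStatus.UNAUTHORIZED),
    (NotPermitted, HTTPStatus.FORBIDDEN),
    (ResourceNotFound, HTTPStatus.NOT_FOUND),
    (StateConflict, HTTPStatus.CONFLICT),
]


def status_for(exc: Exception) -> int:
    """Return the HTTP status code for a domain error."""
    for error_type, status in ERROR_STATUS:
        if isinstance(exc, error_type):
            return int(status)
    # Unknown failure: report a generic 500 and log the details server-side.
    return int(HTTPStatus.INTERNAL_SERVER_ERROR)
```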
Pagination: cursor vs offset
No one returns all 10 million records at once. Pagination splits large collections into pages. Two dominant approaches:
| Aspect | Offset pagination | Cursor pagination |
|---|---|---|
| URL pattern | ?page=3&per_page=25 or ?offset=50&limit=25 | ?after=cursor_abc123&limit=25 |
| How it works | Skip N rows in the DB query | Query WHERE id > last_seen_id |
| Performance at scale | Degrades — large OFFSETs scan many rows | Stable — indexed seek, constant cost |
| Consistent pages | Drifts if rows are inserted/deleted mid-read | Stable — cursor anchors to a specific row |
| Random access | Yes — can jump to page 7 | No — must walk forward from last cursor |
| Best for | Small datasets, admin UIs, "page 3 of 5" UX | Infinite scroll, high-volume feeds, real-time data |
Consistent error shape
Every error response should have the same structure so clients can handle errors generically without reading documentation for each endpoint. A battle-tested shape inspired by RFC 7807 (Problem Details, since superseded by RFC 9457):
HTTP/1.1 422 Unprocessable Entity
Content-Type: application/problem+json

{
  "type": "https://api.example.com/errors/validation-failed",
  "title": "Validation failed",
  "status": 422,
  "detail": "The request body contains invalid fields.",
  "errors": [
    { "field": "email", "msg": "must be a valid email address" },
    { "field": "birth_year", "msg": "must be between 1900 and 2025" }
  ]
}
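If every endpoint builds its error bodies through one helper, the shape can't drift. A minimal sketch, assuming a hypothetical problem_response function that returns a (status, headers, body) tuple; how you hand that to your web framework is up to you.

```python
import json
from typing import Dict, List, Optional, Tuple


def problem_response(
    status: int,
    title: str,
    detail: str,
    type_uri: str,
    errors: Optional[List[Dict[str, str]]] = None,
) -> Tuple[int, Dict[str, str], str]:
    """Build a Problem Details style error as (status, headers, JSON body)."""
    body = {"type": type_uri, "title": title, "status": status, "detail": detail}
    if errors:
        # "errors" is the extension member used in the example above.
        body["errors"] = errors
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


status, headers, body = problem_response(
    422,
    "Validation failed",
    "The request body contains invalid fields.",
    "https://api.example.com/errors/validation-failed",
    errors=[{"field": "email", "msg": "must be a valid email address"}],
)
```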
A clean endpoint set — worked example
# Collection of articles
GET /articles # list (with ?page= or ?after=)
POST /articles # create → 201 + Location
# Single article
GET /articles/slug-of-article # read → 200
PATCH /articles/slug-of-article # partial update → 200
DELETE /articles/slug-of-article # remove → 204
# Sub-resource (comments belong to an article)
GET /articles/slug-of-article/comments # list comments
POST /articles/slug-of-article/comments # add comment
# Filtering + pagination via query params
GET /articles?author=42&sort=newest&after=cur_xyz
If asked "how would you paginate a large list endpoint?", don't just say "use page numbers." Walk through the trade-off: offset pagination is simple but degrades at high pages; cursor pagination is stable and scales. Then say which you'd pick for the described use-case and why. Mentioning the drifting-pages problem (records inserted mid-read causing items to be skipped or duplicated) shows you've thought past the basics.
A chatty API forces clients to make many small requests to assemble a single screen. The classic N+1 pattern: one call to list 20 orders, then 20 individual calls to fetch each order's customer. Design responses to carry enough related data for the common use-case. Use sparse fieldsets (?fields=id,name) or include parameters (?include=customer) to let callers request related data in one trip without always over-fetching.
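Sparse fieldsets are simple to support on the server. This sketch assumes the ?fields= value arrives as a raw comma-separated string and that unknown field names are silently ignored (a stricter API might answer 400 instead); apply_fields is a hypothetical helper name.

```python
from typing import Any, Dict, Optional


def apply_fields(record: Dict[str, Any], fields_param: Optional[str]) -> Dict[str, Any]:
    """Trim a record to the fields requested via ?fields=id,name."""
    if not fields_param:
        # No fieldset requested: return the full representation.
        return dict(record)
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    # Keep the requested order; skip names the record does not have.
    return {name: record[name] for name in wanted if name in record}
```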
Do pick one error shape for your entire API and document it on the first page. Clients write error-handling code once and it works everywhere. Don't have some endpoints return {"message":"Not found"}, others return {"error":{"code":404}}, and others return a plain string — it forces clients to write fragile, endpoint-specific error parsers.
Under the hood: how pagination actually works
Offset pagination — the SQL it generates and why it degrades
When a client requests ?page=3&per_page=25, the server translates this to OFFSET 50 LIMIT 25 in SQL. The database must scan and discard the first 50 rows before returning yours. At page 1 this is negligible; at page 1000 (offset 24975) the database scans ~25,000 rows just to throw them away. This is called the "deep offset problem."
# Request: GET /articles?page=3&per_page=25
# SQL:
SELECT * FROM articles ORDER BY created_at DESC OFFSET 50 LIMIT 25;
-- Pages 1–5: fast (small offset)
-- Page 1000:
SELECT * FROM articles ORDER BY created_at DESC OFFSET 24975 LIMIT 25;
-- DB scans 25,000 rows to return 25
-- Explain plan concept:
-- OFFSET 24975 → read and discard 24,975 rows in sort order → return the next 25
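Inserts and deletes also cause page drift. With the newest-first ordering above, if a new article is inserted between the page 1 and page 2 requests, every later row shifts down by one, so page 2 repeats the last item of page 1. If an article from page 1 is deleted instead, rows shift up and the item that should have opened page 2 is skipped entirely.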
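You can reproduce drift in a few lines with an in-memory SQLite database. This is a sketch under assumptions: the articles table, its ten rows, and the fetch_page helper are invented for the demo, and rows are ordered by id rather than created_at to keep it short.

```python
import sqlite3


def fetch_page(conn: sqlite3.Connection, page: int, per_page: int) -> list:
    """Offset pagination: skip (page - 1) * per_page rows, return the next per_page ids."""
    offset = (page - 1) * per_page
    rows = conn.execute(
        "SELECT id FROM articles ORDER BY id DESC LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (id, title) VALUES (?, ?)",
    [(i, f"article {i}") for i in range(1, 11)],
)

page1 = fetch_page(conn, 1, 3)  # newest three ids: 10, 9, 8

# A new article arrives between the two requests.
conn.execute("INSERT INTO articles (id, title) VALUES (11, 'breaking news')")

page2 = fetch_page(conn, 2, 3)  # window has shifted: 8, 7, 6
overlap = set(page1) & set(page2)  # id 8 is served twice
```

The client that read page 1 and then page 2 sees article 8 twice, and nothing in either response tells it so.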
Cursor pagination — the actual encoding and SQL
A cursor is an opaque token the server issues. The client passes it back verbatim. The server decodes it to know where to continue from. The most common encoding is base64url of the last-seen primary key (or composite sort key).
# 1. Client requests first page:
GET /articles?limit=25
# 2. Server returns 25 articles. Last one has id=1042, created_at=2024-03-15T10:30:00Z.
# Server encodes the cursor:
cursor_data = {"id": 1042, "created_at": "2024-03-15T10:30:00Z"}
cursor = base64url(JSON.stringify(cursor_data))
= "eyJpZCI6MTA0MiwiY3JlYXRlZF9hdCI6IjIwMjQtMDMtMTVUMTA6MzA6MDBaIn0"
# Response: {"data": [...], "next_cursor": "eyJpZCI6MTA0Mi..."}
# 3. Client requests next page:
GET /articles?after=eyJpZCI6MTA0Mi...&limit=25
# 4. Server decodes cursor and generates a keyset seek:
SELECT * FROM articles
WHERE (created_at, id) < ('2024-03-15T10:30:00Z', 1042)
ORDER BY created_at DESC, id DESC
LIMIT 25;
-- Index seek directly — constant cost regardless of "page depth"
The cursor is opaque to the client — they cannot construct a cursor for an arbitrary position, which is why cursor pagination cannot support "jump to page 7." The base64 encoding is not for security; it simply discourages clients from parsing or constructing cursors manually. Some APIs use an HMAC-signed cursor for stronger enforcement.
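The encode, decode, and seek steps above can be sketched end to end. Assumptions: the articles table and its five rows are invented, timestamps are same-format UTC strings so they compare lexicographically, and the row-value comparison is spelled out as created_at < ? OR (created_at = ? AND id < ?), which is equivalent and works on databases without row-value syntax.

```python
import base64
import json
import sqlite3


def encode_cursor(data: dict) -> str:
    """Compact JSON, then unpadded base64url."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_cursor(token: str) -> dict:
    """Restore the stripped padding and decode back to a dict."""
    padded = token + "=" * (-len(token) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def fetch_after(conn, token, limit):
    """Return (rows, next_cursor) for one newest-first page."""
    sql = "SELECT id, created_at FROM articles "
    params = []
    if token is not None:
        cur = decode_cursor(token)
        # Keyset seek: strictly after the last row the client has seen.
        sql += "WHERE created_at < ? OR (created_at = ? AND id < ?) "
        params += [cur["created_at"], cur["created_at"], cur["id"]]
    sql += "ORDER BY created_at DESC, id DESC LIMIT ?"
    rows = conn.execute(sql, params + [limit]).fetchall()
    next_cursor = None
    if len(rows) == limit:
        # A full page may have more behind it; the follow-up page can be empty.
        last_id, last_created = rows[-1]
        next_cursor = encode_cursor({"id": last_id, "created_at": last_created})
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany("INSERT INTO articles VALUES (?, ?)", [
    (1, "2024-03-15T10:00:00Z"),
    (2, "2024-03-15T10:30:00Z"),
    (3, "2024-03-15T10:30:00Z"),  # same timestamp as id 2: id breaks the tie
    (4, "2024-03-15T11:00:00Z"),
    (5, "2024-03-15T12:00:00Z"),
])

rows1, cursor1 = fetch_after(conn, None, 2)
rows2, cursor2 = fetch_after(conn, cursor1, 2)
rows3, cursor3 = fetch_after(conn, cursor2, 2)
```

The id tie-breaker matters: articles 2 and 3 share a timestamp, and without id in the sort key and the cursor one of them could be skipped at a page boundary.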
The Link header — RFC 5988 pagination
The HTTP Link header (defined in RFC 5988, since replaced by RFC 8288) carries URL references with semantic relations. For pagination the standard rels are next, prev, first, and last. Clients that understand Link need not know your API's query parameter names.
HTTP/1.1 200 OK
Content-Type: application/json
Link: <https://api.example.com/articles?after=eyJpZCI6MTA0Mi...&limit=25>; rel="next",
<https://api.example.com/articles?before=eyJpZCI6OTk5...&limit=25>; rel="prev",
<https://api.example.com/articles?limit=25>; rel="first"
X-Total-Count: 1247
{"data": [...25 articles...], "next_cursor": "eyJpZCI6MTA0Mi..."}
X-Total-Count is a common convention (not a standard) for communicating the total number of items. Cursor-paginated APIs often omit it because counting every matching row can mean scanning the whole table or index on each request; only provide it when the UI genuinely needs it.
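Building and reading a Link header takes only a few lines. The sketch below handles just the simple form shown above, a URL in angle brackets followed by a single quoted rel; it is not a full parser for the Link grammar, and both helper names and the example URLs are made up.

```python
import re
from typing import Dict

# Matches one <url>; rel="name" entry. Other link parameters are not handled.
_LINK_PART = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def build_link_header(links: Dict[str, str]) -> str:
    """Turn {"next": url, ...} into a Link header value."""
    return ", ".join(f'<{url}>; rel="{rel}"' for rel, url in links.items())


def parse_link_header(value: str) -> Dict[str, str]:
    """Turn a Link header value back into {rel: url}."""
    return {rel: url for url, rel in _LINK_PART.findall(value)}


links = {
    "next": "https://api.example.com/articles?after=abc&limit=25",
    "first": "https://api.example.com/articles?limit=25",
}
header = build_link_header(links)
next_url = parse_link_header(header).get("next")
```

A client written this way follows next_url until the header stops offering a next relation, and never has to know whether the server uses after, page, or something else.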
Idempotency in REST — what it really means for retries
Idempotency means calling an operation once or N times produces the same server state. This is what makes retries safe. GET, PUT, and DELETE are idempotent. POST is not, and PATCH is not guaranteed to be: retrying a POST that timed out may create a duplicate.
# Client sends a POST to create an order:
POST /orders HTTP/1.1
Host: api.example.com
Authorization: Bearer tok_...
Idempotency-Key: 7f3d9c2a-4b1e-48f0-b8c3-2d5e7a9f1b0c ← UUID the client generates
{"product_id": "prod_abc", "quantity": 2}
← HTTP/1.1 201 Created
← Location: /orders/ord_888
# The 201 never reaches the client (network drops). Client retries with the SAME Idempotency-Key:
POST /orders HTTP/1.1
Idempotency-Key: 7f3d9c2a-4b1e-48f0-b8c3-2d5e7a9f1b0c ← same key
← HTTP/1.1 200 OK ← 200 not 201 — server recognises it
← Location: /orders/ord_888 ← same resource, no duplicate created
This pattern is used by Stripe for payments, Twilio for SMS, and most financial APIs where duplicate operations are costly. The server stores (key → result) in a cache or database. Stripe recommends the client generate a UUID per "intent," not per attempt.
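The server side of that (key → result) store can be sketched in memory. Assumptions: OrderService and its order ids are hypothetical, the store is a plain dict with no expiry or locking, and the stored response is replayed unchanged, so a retry sees the original 201. A server could just as well answer 200 on replay, as in the trace above.

```python
from typing import Dict, Tuple

Response = Tuple[int, str]  # (status code, Location header)


class OrderService:
    """In-memory sketch of idempotent order creation. Not thread-safe."""

    def __init__(self) -> None:
        self.orders: Dict[str, dict] = {}
        self._responses: Dict[str, Response] = {}  # idempotency key -> response

    def create_order(self, idempotency_key: str, payload: dict) -> Response:
        stored = self._responses.get(idempotency_key)
        if stored is not None:
            # Retry of a request already handled: replay, create nothing.
            return stored
        order_id = f"ord_{len(self.orders) + 1}"
        self.orders[order_id] = payload
        response = (201, f"/orders/{order_id}")
        self._responses[idempotency_key] = response
        return response


service = OrderService()
key = "7f3d9c2a-4b1e-48f0-b8c3-2d5e7a9f1b0c"
first = service.create_order(key, {"product_id": "prod_abc", "quantity": 2})
retry = service.create_order(key, {"product_id": "prod_abc", "quantity": 2})
```

A production version would also expire old keys and reject a reused key that arrives with a different payload.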
If clients can decode your cursor and see {"id": 1042}, some will start constructing cursors manually or relying on the id value in their own logic. Base64 is not encryption, and an HMAC signature only stops clients from forging or altering a cursor; the payload stays readable. If clients must not be able to infer the underlying sort key, encrypt the cursor or hand out a random token that the server maps to the stored position.
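Here is a minimal sketch of an HMAC-signed cursor using only the standard library. The secret, the payload.signature token layout, and the function names are assumptions for illustration. Note that the signature protects integrity only: anyone can still base64-decode the payload, so this stops forged cursors but does not hide the sort key.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret-change-me"  # hypothetical; load from config in practice


def sign_cursor(data: dict, secret: bytes = SECRET) -> str:
    """Return '<base64url payload>.<hex HMAC-SHA256 of the payload>'."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw).rstrip(b"=")
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def verify_cursor(token: str, secret: bytes = SECRET) -> dict:
    """Check the signature, then decode. Raises ValueError on any tampering."""
    payload, sep, signature = token.rpartition(".")
    if not sep:
        raise ValueError("malformed cursor")
    expected = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        raise ValueError("invalid cursor signature")
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

A client that edits the payload to jump to another id gets a ValueError, which the API can translate into a 400 with a helpful message.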
How to debug & inspect it
Pagination bugs are among the most common REST API issues — duplicate items, missing items, incorrect total counts, and unexpectedly slow requests at high page numbers. The tools that expose them are curl, the Link header, and database slow-query logs.
| Symptom | Cause | Fix |
|---|---|---|
| Items appear on two consecutive pages | Offset drift — a new item was inserted between the two requests, shifting the window | Switch to cursor pagination; cursors anchor to a specific row regardless of inserts |
| An item is missing from the paginated results entirely | Offset drift in the other direction — an item was deleted, causing a gap | Cursor pagination; or accept minor inconsistency and document it |
| Late pages are 10–100× slower than early pages | Deep offset — database scanning and discarding all prior rows before each page | Migrate to keyset/cursor pagination; add a composite index on (sort_col, id) |
| next_cursor in response is the same as the current cursor | Server bug: cursor not advancing; infinite loop risk for clients following Link: next | Verify the cursor encodes the last item of the current page, not the first |
| Client gets 400 on a valid-looking cursor | Cursor expired, client modified the base64 value, or schema change made the cursor invalid | Treat cursors as ephemeral (add a TTL); return a helpful error message; never parse cursors client-side |
| POST without Idempotency-Key creates duplicates on retry | Network timeout caused client to retry; server treated each request as new | Add Idempotency-Key support server-side; document its use for all state-changing POST operations |
Debug checklist:
- Check for a Link header in paginated responses; it should contain at minimum rel="next". If absent, clients must construct URLs manually, creating coupling.
- Walk two pages and diff the IDs; any ID overlap or gap indicates offset drift.
- Benchmark page 1 vs page 100 vs page 1000 response times. If they grow linearly you have a deep offset problem.
- For cursor APIs: confirm the cursor advances each page (page N cursor !== page N+1 cursor). Confirm the client uses Link: next instead of constructing cursors.
- For POST with Idempotency-Key: replay the exact request with the same key and verify the server returns the same result without creating a duplicate resource.
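Two of these checks, duplicate IDs across pages and a cursor that fails to advance, can be automated with a small walker. This is a sketch: fetch_page stands in for whatever function calls your API and returns (ids, next_cursor), and make_fake_api is a toy backend so the walker can be exercised without a network.

```python
from typing import Callable, List, Optional, Tuple

Page = Tuple[List[int], Optional[str]]


def walk_and_check(
    fetch_page: Callable[[Optional[str], int], Page],
    limit: int,
    max_pages: int = 1000,
) -> List[int]:
    """Walk every page; return ids seen on more than one page.

    Raises RuntimeError if the server hands back a cursor it already issued.
    """
    seen_ids = set()
    seen_cursors = set()
    duplicates: List[int] = []
    cursor: Optional[str] = None
    for _ in range(max_pages):
        ids, next_cursor = fetch_page(cursor, limit)
        duplicates.extend(i for i in ids if i in seen_ids)
        seen_ids.update(ids)
        if next_cursor is None:
            return duplicates
        if next_cursor == cursor or next_cursor in seen_cursors:
            raise RuntimeError("cursor did not advance")
        seen_cursors.add(next_cursor)
        cursor = next_cursor
    raise RuntimeError("gave up after max_pages")


def make_fake_api(ids: List[int]) -> Callable[[Optional[str], int], Page]:
    """Toy backend: the cursor is just the index of the next unread item."""
    def fetch_page(cursor: Optional[str], limit: int) -> Page:
        start = 0 if cursor is None else int(cursor)
        chunk = ids[start:start + limit]
        end = start + len(chunk)
        return chunk, (str(end) if end < len(ids) else None)
    return fetch_page
```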
In production: how leading APIs do it
Stripe, GitHub, and Slack each paginate large collections, but they made different design choices that reflect the REST maturity level they were aiming for and the access patterns their consumers actually exhibit.
| Provider | Pagination style | Parameters | Navigation signal | REST angle | Docs |
|---|---|---|---|---|---|
| Stripe | Cursor (object-ID based) | limit + starting_after or ending_before (an object id) | has_more: true/false in the response body | L2 with deliberate cursor design; no Link header; cursor is a plain object ID (not encoded), making it easy to inspect but coupling clients to the id field | Stripe API pagination |
| GitHub | Offset (page + per_page) | page (1-based) + per_page (max 100) | RFC 5988 Link header with rel="next", rel="prev", rel="first", rel="last" | Closest to L3 in this comparison: hypermedia links in the response let clients navigate without constructing URLs; offset is simpler for repo browsing where random-page access matters | GitHub REST pagination |
| Slack | Cursor (opaque token) | cursor + optional limit | response_metadata.next_cursor in the response body; empty string when exhausted | L2; cursor is an opaque token (not an object id), matching the ideal described in this lesson: clients echo it back and are not meant to decode or construct it | Slack API pagination |
Deep dive: why cursors win for large sets — and when offset is still the right call
Stripe's starting_after and Slack's next_cursor are both cursor-based, but they expose different levels of abstraction to callers. Stripe's cursor is a plain object ID: you pass starting_after=ch_3NkL... and Stripe continues the list from the object after that id (conceptually a keyset seek; the query Stripe runs internally is not documented). Clients can see the id and reason about it, but they are also coupled to it: if Stripe ever changed the sort key, the cursor format would break. Slack's cursor is an opaque token whose internal structure is undocumented. Clients echo it back but are not meant to interpret it, which gives Slack the freedom to change internal sort keys without a breaking change.
GitHub's offset approach — page=3&per_page=30 — is a deliberate concession to use cases where random-page access matters. Browsing repository issues where a user wants to jump to "page 5 of 12" requires offset semantics; a cursor-only API would force the client to walk the first four pages first. The RFC 5988 Link header compensates for the navigability problem by giving clients machine-readable next/prev/first/last URLs so they do not need to construct page parameters manually. For access patterns that are genuinely sequential — infinite scroll, background sync, data export — cursor pagination is the only choice that remains fast as the dataset grows. For access patterns that require random-page access in a bounded, low-write dataset, offset with a Link header (the GitHub approach) remains a pragmatic choice.
🧠 Quick check
1. You create a new user successfully with POST /users. The correct HTTP status is:
201 Created tells the client a resource was created. The Location header gives the URL of the new resource so the client can fetch it without guessing. 200 is for successful reads/updates; 204 is for no-body responses (typically DELETE).
2. An infinite-scroll feed at high volume. Which pagination strategy is better?
Cursor pagination is the right fit: constant-cost indexed seek, no duplicate/missing records when new items arrive, and it maps naturally onto "load more." Offset pagination degrades as the offset grows and produces inconsistent results in live feeds.
3. Where do filtering and sorting parameters belong?
Query parameters are the right vehicle for optional modifiers on a collection. Path segments identify resources; when category is a filter rather than a distinct resource, it belongs in the query string. GET bodies are technically permitted but widely unsupported.
4. An L0 API on the Richardson Maturity Model uses:
L0 is the swamp — one URI, one method (usually POST), action encoded in the body or query string. It uses HTTP as a transport tunnel, not as a semantic layer.
✍️ Exercise: critique and fix a real-world API design (try before opening)
A colleague proposes this API surface for an e-commerce platform. Identify as many issues as you can and propose fixes:
Issues found:
- Verbs in URLs (getProductList, create, delete, fetchAll, updateOrderStatus)
- Inconsistent naming: singular /product vs plural implied by list
- Using POST for a delete operation (ignores HTTP method semantics)
- Action in a query param (action=fetchAll) instead of method
Fixed design:
Rubric: ✓ verbs removed from URLs ✓ plural nouns used ✓ DELETE used for deletion ✓ PATCH used for partial update ✓ status as a field in the body, not a URL action.
Key takeaways
- The Richardson Maturity Model (L0–L3) is a practical ladder for measuring REST compliance; L2 is the common production target.
- Resource names: plural nouns, max two nesting levels, filters in query params.
- Status codes carry semantic meaning; use 201 Created, 204 No Content, 409 Conflict precisely.
- Cursor pagination scales better than offset for large or live datasets.
- A consistent error shape (RFC 7807 style) lets clients write error handling once.