Architectural Styles · Lesson 06
Comparing REST, GraphQL, and gRPC
You've seen each style in isolation. Now lay them side by side and build the decision instinct you need for the real question every architect faces: which style goes here, and why?
By the end you'll be able to
- Read the decision table and map any API requirement to the most fitting style.
- Apply "use X when…" guidance to three different scenario types: public API, mobile app, internal microservices.
- Justify a multi-style architecture for a single product.
The decision table
Each cell below is a short verdict — read it as "how well does this style handle this dimension?" rather than a strict score. No style wins every dimension.
| Dimension | REST | GraphQL | gRPC |
|---|---|---|---|
| Typical clients | Browsers, third-party apps, CLIs | First-party mobile/web apps needing tailored payloads | Internal services, microservice mesh |
| Performance | Good — JSON/HTTP/1.1 or HTTP/2, well-understood | Good — one round-trip; overhead in query parsing & resolver orchestration | Excellent — binary protobuf + HTTP/2 multiplexing |
| Caching | Built-in via HTTP (CDN, ETag, Cache-Control) | Manual — persisted queries, client normalised cache, schema hints | Not built-in — must implement at the application layer |
| Streaming | Server-Sent Events or WebSocket (add-on) | Subscriptions (WebSocket); well-supported in major clients | First-class: server, client, and bidirectional streaming built into the protocol, alongside unary calls |
| Type contract | Optional (OpenAPI, JSON Schema) | Mandatory — SDL schema is the contract | Mandatory — .proto file; compile-time enforcement |
| Tooling | Universal — curl, Postman, every HTTP library | GraphiQL, Apollo Studio, Relay — rich but specialised | grpcurl, Buf, generated stubs — capable but niche |
| Debuggability | Excellent — readable JSON in any browser DevTools | Good — JSON responses; errors embedded in body | Difficult — binary wire format; needs grpcurl + proto file |
| Learning curve | Low — built on HTTP vocabulary most devs know | Medium — SDL, resolvers, N+1 pitfalls | Medium-high — protobuf, codegen, HTTP/2 mental model |
| Versioning | URL versioning (/v1/) or header; well-established patterns | Additive schema evolution; deprecation directives | Additive field evolution; reserved field numbers for removals |
| Browser support | Native | Native (over HTTP/1.1) | Requires grpc-web proxy |
"Use X when…" guidance
Use REST when
- Your API is public or consumed by third parties who should not need special tooling.
- The resource model is stable and response shapes don't vary wildly across clients.
- You need HTTP caching to work transparently (CDN edge, browser cache).
- The team is small or less experienced with specialised tooling — REST has the lowest onboarding cost.
- You need humans to curl endpoints during debugging.
Use GraphQL when
- Multiple first-party clients (iOS, Android, web) need different subsets of the same underlying data.
- The front-end evolves rapidly and you want to add screen fields without back-end deploys.
- Over-fetching is costing real bandwidth on mobile connections.
- You're willing to invest in DataLoader, persisted queries, and schema governance tooling.
Use gRPC when
- The connection is internal — both client and server are under your control.
- You need maximum throughput and minimum latency at high call volumes.
- You want compile-time contract enforcement across a polyglot service mesh (Go backend, Java service, Python ML worker).
- You need native streaming (server push, client upload, bidi) without bolt-on protocols.
- Browsers are not direct consumers (or you'll add grpc-web and its proxy).
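The three "use X when" lists above can be condensed into a first-pass heuristic. The sketch below is only an illustration: the `Boundary` fields and the `suggest_style` function are hypothetical names invented here, and the ordering of the checks is one reasonable reading of the guidance, not a rule from any standard.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    """Hypothetical description of one interface boundary."""
    external_consumers: bool      # third parties call this API
    browser_clients: bool         # browsers call it directly
    varied_client_shapes: bool    # screens need different field subsets
    high_call_volume: bool        # throughput and latency dominate
    needs_native_streaming: bool  # server push, client upload, or bidi


def suggest_style(b: Boundary) -> str:
    """Return a starting suggestion for the boundary, not a verdict."""
    if b.external_consumers:
        # Partners should not need special tooling.
        return "REST"
    if b.varied_client_shapes:
        # First-party clients that each want a tailored payload.
        return "GraphQL"
    if (b.high_call_volume or b.needs_native_streaming) and not b.browser_clients:
        # Internal, performance-sensitive, no browser in the path.
        return "gRPC"
    return "REST"  # lowest onboarding cost when nothing else dominates


public_api = Boundary(True, False, False, False, False)
mobile_bff = Boundary(False, True, True, False, False)
service_mesh = Boundary(False, False, False, True, True)

for name, boundary in [("public", public_api), ("mobile", mobile_bff), ("mesh", service_mesh)]:
    print(name, "->", suggest_style(boundary))
```

Treat the result as the opening position for a design discussion; real boundaries usually involve constraints (team skills, existing gateways) that a five-flag model cannot capture.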
Coexistence — a realistic architecture
Mature products often use all three styles, each at the layer it fits best.
A fintech platform's API surface might look like this:
# Public developer API — REST
# Partners integrate via stable, versioned, curl-friendly endpoints
api.fintech.com/v2/payments
api.fintech.com/v2/accounts
api.fintech.com/v2/webhooks
# Mobile / web BFF — GraphQL
# iOS and Android declare exactly what each screen needs
bff.fintech.com/graphql
# Internal service mesh — gRPC
# Risk engine, fraud scorer, ledger — high-frequency internal calls
risk.internal:50051 # rpc Score(Transaction) returns (RiskResult)
ledger.internal:50052 # rpc Debit/Credit streaming
fraud.internal:50053 # bidi streaming for real-time signals
The styles don't compete — they occupy distinct tiers. The REST API is the public face; GraphQL is the BFF (Backend for Frontend) that absorbs client churn; gRPC is the performance backbone hidden from external callers entirely.
"Pick a style for each of these three scenarios: (a) a public payments API used by hundreds of partners, (b) a mobile app with dozens of heterogeneous screens, (c) a cluster of internal microservices exchanging 50,000 messages per second."
Strong answer: (a) REST — partner-friendly, cacheable, stable contracts; (b) GraphQL — each screen declares its fields, no endpoint proliferation; (c) gRPC — binary efficiency + HTTP/2 + compile-time contract enforcement across languages. Bonus: mention that (a) and (b) could be the same company using both styles at the same time.
The biggest architectural mistake is cargo-culting one style across the entire organisation. "We're a GraphQL shop" leads to public APIs that are awkward for partners to curl and internal services that parse JSON when they could use binary; "we're a gRPC shop" leads to browser clients that cannot reach a gRPC-only BFF without a proxy. Each boundary has different constraints; honour them.
Do document the style decision for each interface boundary in an Architecture Decision Record (ADR) — it forces you to state the trade-offs explicitly and makes future changes intentional. Don't let style choice drift into tribal preference ("we've always done REST") — the cost of the wrong style compounds over years of client integration.
Under the hood: a worked decision example
Abstract trade-off tables are useful, but the skill is applying them to a specific constraint set. Here the decision is worked in detail for three concrete scenarios, with numbers to back each choice.
Scenario A — partner payment API (external, stable contract, high third-party diversity)
500 partner companies integrate, using PHP, Ruby, Java, Python, and shell scripts. Partners want to curl endpoints to test. The vast majority of requests are read-only GETs; webhooks push state changes out. Average payload: a payment record (~1.2 KB JSON).
| Dimension | REST | GraphQL | gRPC |
|---|---|---|---|
| Partner tooling cost | Zero — any HTTP client works | Medium — must learn SDL, install a GraphQL client | High — must install grpcurl or generated stubs; binary not curl-able |
| Payload size (1.2 KB JSON) | 1.2 KB + HTTP/1.1 headers (~400 B overhead) | Same JSON payload; POST body adds query overhead (~200 B extra) | ~300–400 B protobuf + HTTP/2 headers (~80 B HPACK compressed) — 3× smaller |
| HTTP caching for GET /payments/{id} | Yes — CDN caches by URL + headers | No — all POST; requires persisted queries | No — application must implement |
| Documentation cost | Low — OpenAPI + familiar HTTP concepts | Medium — introspection helps, but SDL is a new concept for many | High — binary format requires generating docs from .proto |
Decision: REST. The payload-size advantage of gRPC is real (~3× smaller), but at 1.2 KB it saves only ~800 bytes per request, which is about 0.64 ms of transfer time at 10 Mbps (800 B × 8 = 6,400 bits). The tooling cost imposed on 500 external partners is measured in days of engineering effort, not milliseconds. HTTP caching takes much of the read load off the origin. REST wins on total cost of ecosystem.
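The transfer-time figure in this decision is simple arithmetic, and it is worth being able to redo it for your own payload sizes and link speeds. A minimal sketch, assuming an idealised link with no protocol overhead or latency (the function name `transfer_time_ms` is made up for illustration):

```python
def transfer_time_ms(num_bytes: int, link_mbps: float) -> float:
    """Time to push num_bytes through a link of link_mbps megabits per second."""
    bits = num_bytes * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds * 1000


# Scenario A: protobuf would save roughly 800 bytes on a 1.2 KB JSON payload.
saved_ms = transfer_time_ms(800, 10)
print(f"{saved_ms:.2f} ms saved per request at 10 Mbps")  # 0.64 ms
```

Well under a millisecond per request is the scale to weigh against days of integration effort per partner.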
Scenario B — mobile BFF for a social feed (first-party, highly variable screen shapes)
iOS and Android apps. List view needs 4 fields per post; detail view needs 12 fields; profile view needs 7 different fields from the same object. Apps are updated weekly; backend team is small. Average feed: 20 items.
| Dimension | REST (with ?fields=) | GraphQL | gRPC |
|---|---|---|---|
| Payload for list view (4 fields per post) | With ?fields=: ~40% of full payload, ~480 B | Client declares exactly 4 fields: ~480 B, no server deploy needed | ~180 B protobuf, the smallest, but requires a proto change for each new field set |
| New screen field additions | Needs server deploy if the endpoint doesn't already return the field | No server deploy if the field is already in the schema; the client adds it to the query | Needs proto schema change + code regeneration + deploy on server and both app stores |
| Round trips for a heterogeneous screen | 3 sequential REST calls | One query, resolved server-side | 3 sequential unary calls unless you build a composite RPC |
| Real-time updates (feed new posts) | Polling or SSE (add-on) | Subscriptions — first-class | Server streaming — first-class, but browser needs grpc-web |
Decision: GraphQL. The weekly iteration cycle means the front-end team constantly changes which fields each screen shows. With GraphQL they can select any field already in the schema without waiting for a backend deploy. The payload savings over ?fields= are small; the latency saving from collapsing 3 sequential calls into one is real (3 × 60 ms RTT = 180 ms becomes a single 60 ms round trip, saving about 120 ms). GraphQL subscriptions are a natural primitive for real-time feed updates on a browser/mobile client.
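The round-trip saving can be worked out explicitly. The sketch below assumes the three REST calls are strictly sequential, that each costs one 60 ms round trip, and that server processing time is negligible; `sequential_latency_ms` is a hypothetical helper, not part of any library.

```python
def sequential_latency_ms(calls: int, rtt_ms: float) -> float:
    """Network latency when each call must wait for the previous one."""
    return calls * rtt_ms


rest_ms = sequential_latency_ms(3, 60)     # three dependent REST calls
graphql_ms = sequential_latency_ms(1, 60)  # one query resolved server-side
saved_ms = rest_ms - graphql_ms

print(f"REST: {rest_ms:.0f} ms, GraphQL: {graphql_ms:.0f} ms, saved: {saved_ms:.0f} ms")
# REST: 180 ms, GraphQL: 60 ms, saved: 120 ms
```

If the REST calls can be issued in parallel, the gap shrinks, which is one reason to measure on a real device before treating this number as the deciding factor.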
Scenario C — internal fraud scoring (internal, high-frequency, polyglot)
Payment gateway (Go) calls fraud scorer (Python) on every transaction. Volume: 8,000 calls/second. Payload: a transaction record (~400 bytes). Latency budget: 20 ms. Both services owned by the same team. No browser client involved.
| Dimension | REST (JSON/HTTP1.1) | gRPC (protobuf/HTTP2) |
|---|---|---|
| Payload size (400 B record) | ~400 B JSON | ~120 B protobuf, roughly 3× smaller on the wire |
| Connection overhead at 8k req/s | HTTP/1.1: one in-flight request per connection, so a keep-alive pool of roughly 160 connections if each call takes the full 20 ms (8,000 × 0.02 s) | HTTP/2 multiplexing: many concurrent streams share a handful of connections, so connection overhead is near zero |
| CPU at 8k req/s (parse + serialise) | JSON marshal/unmarshal ~2–5 µs per message in Go | Protobuf marshal/unmarshal ~0.3–0.8 µs — 5–10× faster |
| Contract enforcement | Optional: a JSON field rename is a silent runtime bug | Schema-enforced: regenerated Go stubs fail to compile if the interface breaks; the Python side surfaces breaks at test or type-check time |
Decision: gRPC. At 8k req/s and a 20 ms budget, the numbers matter. Protobuf saves up to ~4 µs of Go CPU per message, which at 8k req/s is 32 ms of CPU per second, about 3% of one core: modest on its own, but it compounds with the 3× smaller payloads. HTTP/2 multiplexing removes connection-pool churn. The Go ↔ Python polyglot boundary benefits from generated stubs catching contract breaks before they reach production.
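Two back-of-envelope numbers sit behind this decision: how much CPU a per-message saving adds up to, and how many requests are in flight at once (Little's law: concurrency = arrival rate × time in system). A rough sketch, using the scenario's own figures and two hypothetical helper names:

```python
def cpu_core_fraction(saved_us_per_msg: float, msgs_per_s: float) -> float:
    """Fraction of one core freed by saving saved_us_per_msg on every message."""
    return saved_us_per_msg * msgs_per_s / 1_000_000


def requests_in_flight(msgs_per_s: float, latency_s: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    return msgs_per_s * latency_s


core_fraction = cpu_core_fraction(4, 8_000)   # ~4 us saved per message
concurrency = requests_in_flight(8_000, 0.020)  # worst case: full 20 ms budget

print(f"CPU freed: {core_fraction:.1%} of one core")
print(f"In flight: {concurrency:.0f} requests")
```

The concurrency figure is what sizes an HTTP/1.1 keep-alive pool (one in-flight request per connection), whereas HTTP/2 can carry the same concurrency as streams over far fewer connections.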
Quick benchmarking note — how to validate your choice with a measurement
Trade-off reasoning is a starting point; a 30-minute measurement closes the loop.
The three things to measure are: payload bytes on the wire (tcpdump + Wireshark DATA frame size), end-to-end latency and throughput (wrk for HTTP, ghz for gRPC, under realistic concurrency), and CPU cost per request (pprof or equivalent). If the numbers align with the trade-off reasoning, commit. If they don't, the data wins.
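Before reaching for tcpdump or a load generator, a quick in-process check can give a first approximation of payload size and serialisation cost. The sketch below uses only the Python standard library; the `record` contents are invented sample data, and the timings it prints reflect your machine and Python's `json` module, not Go and not the network.

```python
import json
import timeit

# Hypothetical transaction record, for illustration only.
record = {
    "transaction_id": "txn_000123",
    "amount_minor": 125_00,
    "currency": "EUR",
    "merchant_id": "m_42",
    "card_country": "DE",
    "timestamp": "2024-01-01T12:00:00Z",
}

# Payload size: compact JSON encoded as UTF-8 bytes (body only, no headers).
encoded = json.dumps(record, separators=(",", ":")).encode("utf-8")
size_bytes = len(encoded)


def roundtrip() -> dict:
    """Serialise and parse once, as a client and server pair would."""
    return json.loads(json.dumps(record))


runs = 2_000
per_call_us = timeit.timeit(roundtrip, number=runs) / runs * 1_000_000

print(f"JSON body: {size_bytes} bytes")
print(f"JSON round-trip: {per_call_us:.1f} us per message")
```

Repeating the same measurement with your actual protobuf-generated classes gives a like-for-like comparison; the wire-level and load-test measurements described above remain the authoritative ones.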
How to debug & inspect it (validating your choice)
Before committing to a style for a new boundary, a 30-minute spike can surface surprises. The table below maps the symptoms such a spike typically exposes to their likely causes and fixes.
| Symptom | Likely cause | Fix |
|---|---|---|
| REST is slower than expected despite lower concurrency | HTTP/1.1 head-of-line blocking; too many TCP connections per client | Enable HTTP/2 on the server; use connection keep-alive; or switch to gRPC for this boundary |
| GraphQL requests are slow or heavy on the backend despite requesting few fields | Resolver fetches the full object before trimming; N+1: each sub-resolver issues its own fetch | Apply DataLoader; push field selection into the database query layer |
| gRPC payload is not smaller than JSON | Most fields are strings (protobuf has no special string compression — it stores UTF-8 bytes just like JSON) | Add compression (grpc-encoding: gzip); or accept the trade-off — protobuf's advantage is more about parse speed than string-heavy payload size |
| CDN hit rate is 0% for REST | Responses include Cache-Control: no-store or are POST requests | Audit the Cache-Control header; serve reads via GET and reserve POST for mutations |
| GraphQL adds visible latency vs equivalent REST | N+1 resolver problem: each related object triggers a database round-trip | Profile with OpenTelemetry traces; add DataLoader batching for relational fields |
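The N+1 rows in the table are easier to reason about with a query counter in hand. The sketch below illustrates the batching idea behind DataLoader; it is not the DataLoader library's API. `FakeDB`, `resolve_naive`, and `resolve_batched` are hypothetical names, and the in-memory dictionary stands in for a real database.

```python
class FakeDB:
    """In-memory stand-in for a database that counts round trips."""

    def __init__(self) -> None:
        self.queries = 0
        self.authors = {1: "ada", 2: "grace", 3: "linus"}

    def get_author(self, author_id: int) -> str:
        self.queries += 1  # one round trip per author lookup
        return self.authors[author_id]

    def get_authors(self, author_ids: set) -> dict:
        self.queries += 1  # one round trip for the whole batch
        return {i: self.authors[i] for i in author_ids}


# A feed of 20 posts written by 3 distinct authors.
posts = [{"id": n, "author_id": (n % 3) + 1} for n in range(20)]


def resolve_naive(db: FakeDB, feed: list) -> list:
    """N+1 shape: the author resolver hits the database once per post."""
    return [db.get_author(p["author_id"]) for p in feed]


def resolve_batched(db: FakeDB, feed: list) -> list:
    """Batched shape: collect the keys first, then fetch them in one query."""
    by_id = db.get_authors({p["author_id"] for p in feed})
    return [by_id[p["author_id"]] for p in feed]


naive_db, batched_db = FakeDB(), FakeDB()
naive_result = resolve_naive(naive_db, posts)
batched_result = resolve_batched(batched_db, posts)

print("naive queries:", naive_db.queries)      # 20
print("batched queries:", batched_db.queries)  # 1
```

Both resolvers return the same data; only the number of database round trips differs, which is what shows up as added latency in traces.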
Decision-validation checklist:
- Measure actual payload size on the wire (not the JSON representation): use curl -sw "%{size_download}" or tcpdump.
- Run a load test at realistic concurrency (wrk for HTTP, ghz for gRPC) and compare p99 latency, not just mean.
- Check cache hit rates in CDN or proxy access logs — a 0% hit rate on cacheable REST endpoints is a misconfiguration.
- Profile server CPU under load to confirm serialisation is not the bottleneck (it usually isn't until very high call rates).
- If a style change is being evaluated, run both old and new side by side on production shadow traffic before committing.
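The checklist's advice to compare p99 rather than the mean is easiest to appreciate with a small example. The sketch below uses a nearest-rank percentile and made-up latency samples (98 fast requests, 2 slow ones); `percentile` is a hypothetical helper written for this illustration, and load-testing tools report these figures for you.

```python
import statistics


def percentile(samples: list, pct: int) -> float:
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(samples)
    # Ceiling of pct * n / 100 using integer arithmetic.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Invented sample: 98 requests at 10 ms and 2 stragglers at 250 ms.
latencies_ms = [10.0] * 98 + [250.0] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} ms  p50={p50_ms:.0f} ms  p99={p99_ms:.0f} ms")
# mean=14.8 ms  p50=10 ms  p99=250 ms
```

A mean of 14.8 ms looks comfortably inside a 20 ms budget, while the p99 shows that one request in fifty blows through it.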
🧠 Quick check
1. A startup launches a developer platform for third-party integrations. Which style minimises the barrier to entry for external developers?
REST wins for external developer experience: every language has HTTP client libraries, curl works, documentation is easy to write, and there's no special toolchain to install. GraphQL and gRPC both require additional tooling investment from the integrating party.
2. Which style requires the most explicit work to add caching support?
GraphQL requests are typically POSTs to a single endpoint with no per-query URL, so standard HTTP caching doesn't work out of the box. You must implement persisted queries, client-side normalised caches, or schema-level cache hints explicitly. gRPC also lacks built-in caching, but GraphQL's additional complexity around partial field overlap makes its caching problem especially nuanced.
3. An internal analytics pipeline needs to stream millions of events per second between Go and Java services. The team wants compile-time contract guarantees. Best fit?
This is gRPC's home territory: internal, polyglot, high-frequency, streaming, needs compile-time safety. REST over JSON would use more bandwidth and CPU; GraphQL subscriptions are designed for client-initiated real-time reads, not high-throughput internal pipelines.
4. A new iOS screen needs exactly 4 fields from 3 different REST endpoints. The back-end team is busy. Best architectural choice?
Option A avoids N API calls and avoids the back-end bottleneck. Option B wastes bandwidth and battery (3 trips, mostly wasted fields). Option C is the "backend for frontend" REST approach — viable but requires a back-end deployment for every new screen, which the prompt said is blocked.
✍️ Exercise A: design the API layer for a logistics platform (try before opening)
A logistics company has: (1) a partner portal for shipping companies to submit and track shipments; (2) a driver mobile app that needs live tracking updates and receives only the fields relevant to the current delivery; (3) a fleet-routing engine that exchanges thousands of route-optimisation messages per second with the dispatch service (both internal Go services).
Design the API layer. For each boundary, name the style and give two reasons for the choice.
Model answer:
- Partner portal → REST. Reasons: (1) external partners need simple, curl-able, versioned endpoints with standard HTTP caching for tracking status; (2) the resource model (shipments, waypoints) maps cleanly to nouns + CRUD methods.
- Driver mobile app → GraphQL. Reasons: (1) the app is a first-party client that needs different field subsets per screen (list view vs detail view vs active delivery view), avoiding over-fetching on limited mobile data; (2) subscriptions deliver real-time tracking updates without polling.
- Fleet routing engine → gRPC. Reasons: (1) thousands of messages per second between internal services — binary protobuf + HTTP/2 multiplexing minimises per-message overhead; (2) both services are owned by the same team, so generated stubs enforce the contract at compile time, preventing schema drift.
Rubric: ✓ correct style for each boundary ✓ two distinct reasons each (not just "it's better") ✓ no style applied where it doesn't fit ✓ total solution is consistent (e.g., does not put gRPC at the public partner boundary).
✍️ Exercise B: critique a mixed-style architecture (try before opening)
A team reports: "We use GraphQL for our public API, REST for our mobile BFF, and REST for our internal microservices." Identify what's potentially backwards and suggest corrections.
Model answer:
- GraphQL for the public API is unusual: external partners must learn the query language and schema, and will usually want GraphQL client tooling (curl works, but only by hand-wrapping queries in a JSON POST body). REST is usually more partner-friendly. GraphQL is better suited to first-party clients. (Exception: if the public API is developer-facing and complex data traversal is genuinely needed, GraphQL can work, but it requires excellent documentation and tooling investment.)
- REST for the mobile BFF is fine, but the team is missing GraphQL's main strength — letting mobile clients declare their field needs. If all mobile screens need the same data shape this is acceptable; if screens vary significantly, they're probably already hand-crafting many custom REST endpoints.
- REST for internal microservices — if these services exchange high-frequency messages and are co-owned, gRPC would reduce bandwidth and enforce contracts at compile time. REST is acceptable here for lower-frequency services.
Rubric: ✓ identifies the mismatch of GraphQL for external use ✓ recognises REST-for-BFF as a missed opportunity if field flexibility is needed ✓ suggests gRPC for internal if frequency warrants it ✓ does not simply say "everything is wrong" without nuance.
Key takeaways
- REST: public APIs, third-party integrations, HTTP caching, universal tooling, low onboarding cost.
- GraphQL: first-party mobile/web clients needing field flexibility, rapid front-end iteration, real-time subscriptions.
- gRPC: internal microservices, high-throughput binary messaging, polyglot compile-time contracts, native streaming.
- Production systems routinely use all three — the choice is per interface boundary, not per organisation.
- Document your style choices explicitly (ADR) — undocumented decisions calcify into tribal dogma.