Architectural Styles · Lesson 06

Comparing REST, GraphQL, and gRPC

You've seen each style in isolation. Now lay them side by side and build the decision instinct you need for the real question every architect faces: which style goes here, and why?

⏱ 15 min Difficulty: core Prereq: as-01 through as-05

By the end you'll be able to

Read the decision table and map any API requirement to the most fitting style.
Apply "use X when…" guidance to three different scenario types: public API, mobile app, internal microservices.
Justify a multi-style architecture for a single product.

The decision table

Each cell below is a short verdict — read it as "how well does this style handle this dimension?" rather than a strict score. No style wins every dimension.

Dimension	REST	GraphQL	gRPC
Typical clients	Browsers, third-party apps, CLIs	First-party mobile/web apps needing tailored payloads	Internal services, microservice mesh
Performance	Good — JSON/HTTP/1.1 or HTTP/2, well-understood	Good — one round-trip; overhead in query parsing & resolver orchestration	Excellent — binary protobuf + HTTP/2 multiplexing
Caching	Built-in via HTTP (CDN, ETag, Cache-Control)	Manual — persisted queries, client normalised cache, schema hints	Not built-in — must implement at the application layer
Streaming	Server-Sent Events or WebSocket (add-on)	Subscriptions (WebSocket); well-supported in major clients	First-class: 3 streaming modes (server, client, bidirectional) built into the protocol, alongside unary calls
Type contract	Optional (OpenAPI, JSON Schema)	Mandatory — SDL schema is the contract	Mandatory — .proto file; compile-time enforcement
Tooling	Universal — curl, Postman, every HTTP library	GraphiQL, Apollo Studio, Relay — rich but specialised	grpcurl, Buf, generated stubs — capable but niche
Debuggability	Excellent — readable JSON in any browser DevTools	Good — JSON responses; errors embedded in body	Difficult — binary wire format; needs grpcurl + proto file
Learning curve	Low — built on HTTP vocabulary most devs know	Medium — SDL, resolvers, N+1 pitfalls	Medium-high — protobuf, codegen, HTTP/2 mental model
Versioning	URL versioning (/v1/) or header; well-established patterns	Additive schema evolution; deprecation directives	Additive field evolution; reserved field numbers for removals
Browser support	Native	Native (over HTTP/1.1)	Requires grpc-web proxy
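
The "built-in via HTTP" caching verdict for REST rests largely on validators such as ETag. The sketch below shows the conditional-GET logic a server or CDN applies; it is a minimal illustration, and the function names, the hash-based ETag scheme, and the sample payment body are assumptions made for this example, not part of any specific framework.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the body (hypothetical scheme: truncated SHA-256, quoted)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match: Optional[str]) -> Tuple[int, dict, bytes]:
    """Return (status, headers, payload) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=60"}
    if if_none_match == etag:
        # The client's cached copy is still valid: send no body.
        return 304, headers, b""
    return 200, headers, body


body = b'{"id": 42, "status": "settled"}'  # placeholder payment record
first_status, first_headers, _ = conditional_get(body, None)
second_status, _, second_payload = conditional_get(body, first_headers["ETag"])
print(first_status, second_status, len(second_payload))
```

The second request costs headers only, which is why cacheable GET endpoints are cheap to serve at scale. GraphQL and gRPC have no equivalent that intermediaries understand by default.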

"Use X when…" guidance

Use REST when

Your API is public or consumed by third parties who should not need special tooling.
The resource model is stable and response shapes don't vary wildly across clients.
You need HTTP caching to work transparently (CDN edge, browser cache).
The team is small or less experienced with specialised tooling — REST has the lowest onboarding cost.
You need humans to curl endpoints during debugging.

Use GraphQL when

Multiple first-party clients (iOS, Android, web) need different subsets of the same underlying data.
The front-end evolves rapidly and you want to add screen fields without back-end deploys.
Over-fetching is costing real bandwidth on mobile connections.
You're willing to invest in DataLoader, persisted queries, and schema governance tooling.

Use gRPC when

The connection is internal — both client and server are under your control.
You need maximum throughput and minimum latency at high call volumes.
You want compile-time contract enforcement across a polyglot service mesh (Go backend, Java service, Python ML worker).
You need native streaming (server push, client upload, bidi) without bolt-on protocols.
Browsers are not direct consumers (or you'll add grpc-web and its proxy).

A simplified decision path. Real decisions involve more dimensions (team experience, existing infrastructure, browser reach) — this is a starting heuristic, not a rule.
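
One way to make the heuristic concrete is to write it down as a small function. This is a sketch under simplifying assumptions: the `Boundary` fields and the order of the checks are choices made for this example, and a real decision would weigh more inputs than these five flags.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    """Hypothetical description of one interface boundary."""
    external_clients: bool     # third parties or public developers call it
    browser_direct: bool       # browsers call it without a proxy in between
    varied_field_needs: bool   # clients need different subsets of the same data
    high_frequency: bool       # very high call volume or a tight latency budget
    needs_streaming: bool      # long-lived server, client, or bidirectional streams


def suggest_style(b: Boundary) -> str:
    """Return a starting suggestion, not a verdict."""
    if b.external_clients:
        return "REST"       # lowest tooling cost for outside integrators
    if b.varied_field_needs:
        return "GraphQL"    # clients declare the fields each screen needs
    if (b.high_frequency or b.needs_streaming) and not b.browser_direct:
        return "gRPC"       # internal, performance-sensitive traffic
    return "REST"           # sensible default when nothing else dominates


public_api = Boundary(True, True, False, False, False)
mobile_bff = Boundary(False, True, True, False, False)
fraud_scoring = Boundary(False, False, False, True, False)
for name, boundary in [("public API", public_api), ("mobile BFF", mobile_bff), ("fraud scoring", fraud_scoring)]:
    print(name, "->", suggest_style(boundary))
```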

Coexistence — a realistic architecture

Mature products often use all three styles, each at the layer it fits best:

Scenario

A fintech platform's API surface might look like this:

# Public developer API — REST
# Partners integrate via stable, versioned, curl-friendly endpoints
api.fintech.com/v2/payments
api.fintech.com/v2/accounts
api.fintech.com/v2/webhooks

# Mobile / web BFF — GraphQL
# iOS and Android declare exactly what each screen needs
bff.fintech.com/graphql

# Internal service mesh — gRPC
# Risk engine, fraud scorer, ledger — high-frequency internal calls
risk.internal:50051      # rpc Score(Transaction) returns (RiskResult)
ledger.internal:50052    # rpc Debit/Credit streaming
fraud.internal:50053     # bidi streaming for real-time signals

The styles don't compete — they occupy distinct tiers. The REST API is the public face; GraphQL is the BFF (Backend for Frontend) that absorbs client churn; gRPC is the performance backbone hidden from external callers entirely.

🎯 Interview angle

"Pick a style for each of these three scenarios: (a) a public payments API used by hundreds of partners, (b) a mobile app with dozens of heterogeneous screens, (c) a cluster of internal microservices exchanging 50,000 messages per second."

Strong answer: (a) REST — partner-friendly, cacheable, stable contracts; (b) GraphQL — each screen declares its fields, no endpoint proliferation; (c) gRPC — binary efficiency + HTTP/2 + compile-time contract enforcement across languages. Bonus: mention that (a) and (b) could be the same company using both styles at the same time.

⚠️ Common trap — picking one style for everything

The biggest architectural mistake is cargo-culting one style across the entire organisation. "We're a GraphQL shop" leads to public APIs that are hard for partners to curl and internal services that parse JSON when they could use binary; "we're a gRPC shop" leads to browser clients that can't reach a gRPC-only BFF without a proxy. Each boundary has different constraints, so honour them.

✅ Do this, not that

Do document the style decision for each interface boundary in an Architecture Decision Record (ADR) — it forces you to state the trade-offs explicitly and makes future changes intentional. Don't let style choice drift into tribal preference ("we've always done REST") — the cost of the wrong style compounds over years of client integration.

Under the hood: a worked decision example

Abstract trade-off tables are useful, but the skill is applying them to a specific constraint set. Here the same decision is worked in detail for three concrete scenarios, with numbers to back each choice.

Scenario A — partner payment API (external, stable contract, high third-party diversity)

Constraints

500 partner companies integrate. They use PHP, Ruby, Java, Python, and shell scripts. Partners want to curl endpoints to test. The vast majority of requests are read-only GETs; webhooks push state changes out. Average payload: a payment record (~1.2 KB JSON).

Dimension	REST	GraphQL	gRPC
Partner tooling cost	Zero — any HTTP client works	Medium — must learn SDL, install a GraphQL client	High — must install grpcurl or generated stubs; binary not curl-able
Payload size (1.2 KB JSON)	1.2 KB + HTTP/1.1 headers (~400 B overhead)	Same JSON payload; POST body adds query overhead (~200 B extra)	~300–400 B protobuf + HTTP/2 headers (~80 B HPACK compressed) — 3× smaller
HTTP caching for GET /payments/{id}	Yes — CDN caches by URL + headers	No — all POST; requires persisted queries	No — application must implement
Documentation cost	Low — OpenAPI + familiar HTTP concepts	Medium — introspection helps, but SDL is a new concept for many	High — binary format requires generating docs from .proto

Decision: REST. The payload-size advantage of gRPC is real (~3× smaller), but at 1.2 KB it saves ~800 bytes per request, about 0.64 ms of transfer time at 10 Mbps. The tooling cost imposed on 500 external partners is measured in days of engineering effort, not milliseconds. HTTP caching removes much of the read load from the origin. REST wins on total cost of ecosystem.
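
The transfer-time figure is easy to sanity-check. The helper below is a back-of-the-envelope sketch that assumes 8 bits per byte and 10^6 bits per megabit, and ignores protocol framing and latency; `transfer_time_ms` is a name made up for this example.

```python
def transfer_time_ms(num_bytes: int, link_mbps: float) -> float:
    """Time to push num_bytes through a link of link_mbps, in milliseconds."""
    bits = num_bytes * 8
    return bits / (link_mbps * 1_000_000) * 1000


saved_ms = transfer_time_ms(800, 10)   # bytes saved per request, 10 Mbps link
print(f"{saved_ms:.2f} ms saved per request")
```

Under these assumptions, 800 bytes is about 0.64 ms at 10 Mbps. Either way the saving is well under a millisecond, which is the point of the comparison with days of partner engineering effort.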

Scenario B — mobile BFF for a social feed (first-party, highly variable screen shapes)

Constraints

iOS and Android apps. List view needs 4 fields per post; detail view needs 12 fields; profile view needs 7 different fields from the same object. Apps are updated weekly; backend team is small. Average feed: 20 items.

Dimension	REST (with ?fields=)	GraphQL	gRPC
Payload for list view (4 of 20 fields)	With `?fields=`: ~40% of full payload — ~480 B	Client declares exactly 4 fields: ~480 B, no server deploy needed	~180 B protobuf — smallest; but requires a proto change for each new field set
New screen field additions	Needs server deploy if field doesn't exist	No server deploy — client adds field to query	Needs proto schema change + code regeneration + deploy on server and both app stores
Round trips for a heterogeneous screen	3 sequential REST calls	One query, resolved server-side	3 sequential unary calls unless you build a composite RPC
Real-time updates (feed new posts)	Polling or SSE (add-on)	Subscriptions — first-class	Server streaming — first-class, but browser needs grpc-web

Decision: GraphQL. The weekly iteration cycle means the front-end team constantly adds fields. With GraphQL they can do it without waiting for a backend deploy, as long as the field already exists in the schema. The payload savings over ?fields= are small; the latency saving of replacing 3 sequential calls with one is real (3 × 60 ms RTT = 180 ms becomes a single 60 ms round trip, about 120 ms saved). GraphQL subscriptions are a good fit for real-time feed updates on a browser/mobile client.
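
Both effects in this scenario, field trimming and fewer round trips, can be approximated in a few lines. This is a sketch: the 20-field `post` record and its field names are placeholders, `project` stands in for what a sparse fieldset or a GraphQL selection does, and the latency model ignores server processing time.

```python
import json
from typing import Sequence


def project(record: dict, fields: Sequence[str]) -> dict:
    """Keep only the requested fields, as a sparse fieldset or GraphQL selection would."""
    return {name: record[name] for name in fields if name in record}


def sequential_latency_ms(calls: int, rtt_ms: float) -> float:
    """Latency of calls issued one after another, ignoring server time."""
    return calls * rtt_ms


# Hypothetical post object with 20 fields (names are placeholders).
post = {f"field_{i:02d}": f"value-{i:02d}" for i in range(20)}
LIST_VIEW_FIELDS = ["field_00", "field_01", "field_02", "field_03"]

list_view = project(post, LIST_VIEW_FIELDS)
full_bytes = len(json.dumps(post).encode("utf-8"))
list_bytes = len(json.dumps(list_view).encode("utf-8"))
print(f"full record: {full_bytes} B, list view: {list_bytes} B")

rest_ms = sequential_latency_ms(3, 60)      # three dependent REST calls
graphql_ms = sequential_latency_ms(1, 60)   # one GraphQL query
saved_ms = rest_ms - graphql_ms
print(f"REST: {rest_ms} ms, GraphQL: {graphql_ms} ms, saved: {saved_ms} ms")
```

Three sequential calls at 60 ms cost about 180 ms and a single query about 60 ms, so the saving is on the order of 120 ms per screen load.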

Scenario C — internal fraud scoring (internal, high-frequency, polyglot)

Constraints

Payment gateway (Go) calls fraud scorer (Python) on every transaction. Volume: 8,000 calls/second. Payload: a transaction record (~400 bytes). Latency budget: 20 ms. Both services owned by the same team. No browser client involved.

Dimension	REST (JSON/HTTP1.1)	gRPC (protobuf/HTTP2)
Payload size (400 B record)	~400 B JSON	~120 B protobuf, roughly 3× smaller on the wire
Connection overhead at 8k req/s	HTTP/1.1: up to ~8k new TCP connections per second without keep-alive, or a keep-alive pool to size and manage	HTTP/2 multiplexing: dozens of concurrent streams on one connection, so near-zero connection overhead
CPU at 8k req/s (parse + serialise)	JSON marshal/unmarshal ~2–5 µs per message in Go	Protobuf marshal/unmarshal ~0.3–0.8 µs — 5–10× faster
Contract enforcement	Optional — a JSON field rename is a silent runtime bug	Compile-time — generated stubs fail to compile if interface breaks

Decision: gRPC. At 8k req/s and a 20 ms budget, the numbers matter. The per-message CPU saving of protobuf (~4 µs × 8k = 32 ms of CPU time per second, roughly 3% of one core) is modest but real, and it grows linearly with call volume. HTTP/2 multiplexing eliminates connection-pool churn. The Go ↔ Python polyglot boundary benefits from generated stubs catching contract breaks before they reach production.
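
The CPU arithmetic can be written out explicitly. The sketch below assumes the ~4 µs per-message saving and the 8,000 calls/second rate from the scenario; `cpu_fraction` is an illustrative helper, not a profiling tool.

```python
def cpu_fraction(per_message_us: float, messages_per_second: int) -> float:
    """Fraction of one core consumed by work costing per_message_us per message."""
    return per_message_us * messages_per_second / 1_000_000


saved_fraction = cpu_fraction(4.0, 8_000)     # share of one core freed
budget_share = 4.0 / (20 * 1000)              # saving relative to a 20 ms latency budget
print(f"core freed: {saved_fraction:.1%}, share of latency budget: {budget_share:.2%}")
```

Read this way, the serialisation saving is a few percent of one core and a tiny slice of the latency budget at this volume. It becomes significant as call rates climb, which is why measuring before committing is worthwhile.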

Quick benchmarking note — how to validate your choice with a measurement

Trade-off reasoning is a starting point; a 30-minute measurement closes the loop. Here is the minimal toolkit:

# 1. Measure payload size: compare JSON vs protobuf for the same record
$ curl -s https://api.example.com/v1/transactions/42 | wc -c
1247
# vs grpcurl output
$ grpcurl -plaintext -d '{"tx_id":"42"}' svc:50051 tx.TxService/GetTx | wc -c
394
# Note: grpcurl prints the response as JSON-decoded text, so this count is NOT
# the protobuf wire size. To measure the raw wire bytes, capture traffic with
# tcpdump -i lo0 -w capture.pcap and inspect the DATA frame length in Wireshark

# 2. Measure latency + throughput with wrk (HTTP) and ghz (gRPC)
$ wrk -t4 -c100 -d30s --latency https://api.example.com/v1/transactions/42
Requests/sec: 4312.88 Latency p99: 48.2ms
$ ghz --insecure --proto tx.proto --call tx.TxService.GetTx \
    -d '{"tx_id":"42"}' -c 100 -n 10000 svc:50051
Requests/sec: 18472.14 Latency p99: 11.8ms
# (output abridged to the two figures being compared)

# 3. Measure CPU during the load test (Go pprof endpoint)
$ curl -o cpu.pprof "http://localhost:6060/debug/pprof/profile?seconds=10"
$ go tool pprof -top cpu.pprof | grep -E "encoding|json|proto"
# Compare JSON marshal time vs protobuf marshal time in the -top listing

The three things to measure are: payload bytes on the wire (tcpdump + Wireshark DATA frame size), end-to-end latency and throughput (wrk for HTTP, ghz for gRPC, under realistic concurrency), and CPU cost per request (pprof or equivalent). If the numbers align with the trade-off reasoning, commit. If they don't, the data wins.
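
Before capturing real traffic, it can help to see where the size difference comes from. The sketch below hand-encodes two fields using the protobuf wire format (varint tags and lengths) and compares the result with compact JSON. It is a teaching aid under stated assumptions: it handles only non-negative integers and strings, the record and field numbers are made up, and real services should use generated code rather than this encoder.

```python
import json


def encode_varint(n: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Wire type 0 (varint): tag, then the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (length-delimited): tag, byte length, then raw UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


record = {"tx_id": 150, "note": "testing"}   # hypothetical two-field message
wire = encode_int_field(1, record["tx_id"]) + encode_string_field(2, record["note"])
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")
print(wire.hex(), len(wire), len(as_json))
```

Most of the saving comes from replacing field names with one-byte tags and from compact integers. The string itself is carried as the same UTF-8 bytes in both encodings, so string-heavy payloads shrink much less.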

How to debug & inspect it (validating your choice)

Before committing to a style for a new boundary, a 30-minute spike can surface surprises. Here are the exact commands to measure the three dimensions that usually drive the decision.

# --- Validate REST payload size + latency ---
$ curl -o /dev/null -sw "%{size_download} bytes %{time_total}s\n" \
    https://api.example.com/v1/users/42
1247 bytes 0.041s

# Check what a sparse fieldset saves you
$ curl -o /dev/null -sw "%{size_download} bytes\n" \
    "https://api.example.com/v1/users/42?fields=id,name,avatar_url"
112 bytes

# --- Validate HTTP caching is actually working ---
$ curl -sI https://api.example.com/v1/users/42 | grep -i "cache\|etag\|age"
Cache-Control: public, max-age=60
ETag: "a3f8c"
# A CDN HIT should show Age: >0 and X-Cache: HIT on the second call
# (the exact cache-status header name varies by CDN)

# --- Validate GraphQL field trimming reduces payload ---
$ curl -s -X POST https://bff.example.com/graphql \
    -H "Content-Type: application/json" \
    -d '{"query":"{ user(id:\"42\") { id name avatarUrl } }"}' \
    | wc -c
89
# vs full-record query: compare the two numbers

# --- Validate gRPC throughput vs REST at concurrency ---
# (-n and --duration are alternatives; -n alone matches the count below)
$ ghz --insecure --proto user.proto --call user.UserService.GetUser \
    -d '{"user_id":"42"}' -c 50 -n 5000 user-svc:50051
Summary:
  Count: 5000
  Requests/sec: 9841.22
  Fastest: 1.2ms
  Slowest: 18.4ms
  Mean: 5.1ms
  p99: 14.2ms

Symptom	Likely cause	Fix
REST is slower than expected despite lower concurrency	HTTP/1.1 head-of-line blocking; too many TCP connections per client	Enable HTTP/2 on the server; use connection keep-alive; or switch to gRPC for this boundary
GraphQL back-end load is high despite clients requesting few fields	Resolver fetches the full object before trimming it to the selected fields; N+1, where many sub-resolvers each trigger their own fetch	Apply DataLoader; push field selection into the database query layer
gRPC payload is not smaller than JSON	Most fields are strings (protobuf has no special string compression — it stores UTF-8 bytes just like JSON)	Add compression (`grpc-encoding: gzip`); or accept the trade-off — protobuf's advantage is more about parse speed than string-heavy payload size
CDN hit rate is 0% for REST	Responses include `Cache-Control: no-store` or are POST requests	Audit the Cache-Control header; serve read endpoints via GET and reserve POST for mutations
GraphQL adds visible latency vs equivalent REST	N+1 resolver problem — each related object triggers a database round-trip	Profile with `opentelemetry` traces; add DataLoader batching for relational fields
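
The N+1 symptom and the batching fix that appear in the table can be reproduced with a counter. This sketch uses a fake in-memory data source and a hand-rolled batch step to show the idea behind DataLoader; it is not the DataLoader library itself, and `FakeDb`, its methods, and the sample posts are invented for the example.

```python
class FakeDb:
    """In-memory stand-in for a database that counts round trips."""

    def __init__(self):
        self.queries = 0
        self._authors = {1: "Ada", 2: "Grace", 3: "Linus"}

    def get_author(self, author_id):
        self.queries += 1                  # one round trip per call
        return self._authors[author_id]

    def get_authors(self, author_ids):
        self.queries += 1                  # one round trip for the whole batch
        return {i: self._authors[i] for i in author_ids}


posts = [{"id": n, "author_id": (n % 3) + 1} for n in range(20)]


def resolve_naive(db, posts):
    """N+1 pattern: the author resolver hits the database once per post."""
    return [db.get_author(p["author_id"]) for p in posts]


def resolve_batched(db, posts):
    """Batching: collect the distinct keys first, then fetch them in one query."""
    ids = sorted({p["author_id"] for p in posts})
    authors = db.get_authors(ids)
    return [authors[p["author_id"]] for p in posts]


naive_db, batched_db = FakeDb(), FakeDb()
naive_result = resolve_naive(naive_db, posts)
batched_result = resolve_batched(batched_db, posts)
print(naive_db.queries, batched_db.queries)
```

Both resolvers return the same authors, but the naive one issues one query per post while the batched one issues a single query for the whole list.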

Decision-validation checklist:

Measure actual payload size on the wire (not the JSON representation) — use curl -sw "%{size_download}" or tcpdump.
Run a load test at realistic concurrency — wrk for HTTP, ghz for gRPC — and compare p99 latency, not just mean.
Check cache hit rates in CDN or proxy access logs — a 0% hit rate on cacheable REST endpoints is a misconfiguration.
Profile server CPU under load to confirm serialisation is not the bottleneck (it usually isn't until very high call rates).
If a style change is being evaluated, run both old and new side by side on production shadow traffic before committing.
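
The advice to compare p99 rather than the mean is easier to appreciate with numbers. The sketch below uses a nearest-rank percentile and a made-up latency sample in which 2% of requests are slow; the sample values are assumptions chosen only to show the effect.

```python
import math
import statistics


def percentile(samples, pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical load-test result: 98 fast requests and 2 slow outliers.
latencies_ms = [5.0] * 98 + [250.0] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

The mean stays under 10 ms and hides the outliers, while the p99 exposes the 250 ms tail that one in fifty users would actually experience.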

🧠 Quick check

1. A startup launches a developer platform for third-party integrations. Which style minimises the barrier to entry for external developers?

REST wins for external developer experience: every language has HTTP client libraries, curl works, documentation is easy to write, and there's no special toolchain to install. GraphQL and gRPC both require additional tooling investment from the integrating party.

2. Which style requires the most explicit work to add caching support?

GraphQL is typically served via POST to a single endpoint with no per-query URL, so standard HTTP caching doesn't apply out of the box. You must implement persisted queries, client-side normalised caches, or schema-level cache hints explicitly. gRPC also lacks built-in caching, but GraphQL's additional complexity around partial field overlap makes its caching problem especially nuanced.

3. An internal analytics pipeline needs to stream millions of events per second between Go and Java services. The team wants compile-time contract guarantees. Best fit?

This is gRPC's home territory: internal, polyglot, high-frequency, streaming, needs compile-time safety. REST over JSON would use more bandwidth and CPU; GraphQL subscriptions are designed for client-initiated real-time reads, not high-throughput internal pipelines.

4. A new iOS screen needs exactly 4 fields from 3 different REST endpoints. The back-end team is busy. Best architectural choice?

Option A, a single GraphQL query, avoids multiple API calls and avoids the back-end bottleneck. Option B, calling all three REST endpoints from the app, wastes bandwidth and battery (3 trips, mostly wasted fields). Option C, a dedicated aggregate endpoint, is the "backend for frontend" REST approach: viable, but it requires a back-end deployment for every new screen, and the prompt says the back-end team is busy.

✍️ Exercise A: design the API layer for a logistics platform (try before opening)

A logistics company has: (1) a partner portal for shipping companies to submit and track shipments; (2) a driver mobile app that needs live tracking updates and receives only the fields relevant to the current delivery; (3) a fleet-routing engine that exchanges thousands of route-optimisation messages per second with the dispatch service (both internal Go services).

Design the API layer. For each boundary, name the style and give two reasons for the choice.

Model answer:

Partner portal → REST. Reasons: (1) external partners need simple, curl-able, versioned endpoints with standard HTTP caching for tracking status; (2) the resource model (shipments, waypoints) maps cleanly to nouns + CRUD methods.
Driver mobile app → GraphQL. Reasons: (1) the app is a first-party client that needs different field subsets per screen (list view vs detail view vs active delivery view), avoiding over-fetching on limited mobile data; (2) subscriptions deliver real-time tracking updates without polling.
Fleet routing engine → gRPC. Reasons: (1) thousands of messages per second between internal services — binary protobuf + HTTP/2 multiplexing minimises per-message overhead; (2) both services are owned by the same team, so generated stubs enforce the contract at compile time, preventing schema drift.

Rubric: ✓ correct style for each boundary ✓ two distinct reasons each (not just "it's better") ✓ no style applied where it doesn't fit ✓ total solution is consistent (e.g., does not put gRPC at the public partner boundary).

✍️ Exercise B: critique a mixed-style architecture (try before opening)

A team reports: "We use GraphQL for our public API, REST for our mobile BFF, and REST for our internal microservices." Identify what's potentially backwards and suggest corrections.

Model answer:

GraphQL for the public API is unusual — external partners must install GraphQL client tooling, understand SDL, and cannot use simple curl. REST is usually more partner-friendly. GraphQL is better suited to first-party clients. (Exception: if the public API is developer-facing and complex data traversal is genuinely needed, GraphQL can work — but requires excellent documentation and tooling investment.)
REST for the mobile BFF is fine, but the team is missing GraphQL's main strength — letting mobile clients declare their field needs. If all mobile screens need the same data shape this is acceptable; if screens vary significantly, they're probably already hand-crafting many custom REST endpoints.
REST for internal microservices — if these services exchange high-frequency messages and are co-owned, gRPC would reduce bandwidth and enforce contracts at compile time. REST is acceptable here for lower-frequency services.

Rubric: ✓ identifies the mismatch of GraphQL for external use ✓ recognises REST-for-BFF as a missed opportunity if field flexibility is needed ✓ suggests gRPC for internal if frequency warrants it ✓ does not simply say "everything is wrong" without nuance.

Key takeaways

REST: public APIs, third-party integrations, HTTP caching, universal tooling, low onboarding cost.
GraphQL: first-party mobile/web clients needing field flexibility, rapid front-end iteration, real-time subscriptions.
gRPC: internal microservices, high-throughput binary messaging, polyglot compile-time contracts, native streaming.
Production systems routinely use all three — the choice is per interface boundary, not per organisation.
Document your style choices explicitly (ADR) — undocumented decisions calcify into tribal dogma.
