API Design

Architectural Styles · Lesson 05

The gRPC framework

gRPC treats a remote call like a local function call — but compiles the contract into type-safe client and server code, then sends everything as compact binary over HTTP/2. The result is a style purpose-built for internal service meshes that need speed and strict contracts.

⏱ 12 min Difficulty: core Prereq: Styles Overview (as-01)

By the end you'll be able to

RPC — the original idea

Remote Procedure Call (RPC) is the idea of calling a function on a different machine as if it were local. Your code writes payments.charge(amount, card); the network layer transparently serialises the arguments, sends them to the payments service, executes the function there, and returns the result. The network is meant to be invisible.

The concept dates to the 1970s. Earlier RPC systems (CORBA, XML-RPC, SOAP) each had their drawbacks, and CORBA and SOAP in particular were famously complex. gRPC, open-sourced by Google in 2015 with its 1.0 release in 2016, took the core idea and rebuilt it with modern primitives: Protocol Buffers for the contract, HTTP/2 for the transport, and code generation for every popular language.

Protocol Buffers — the contract language

Everything in gRPC starts with a .proto file. It defines messages (typed data structures, like database rows with field numbers) and services (groups of RPC methods). The proto compiler (protoc) reads the file and generates client stubs and server skeletons in your language of choice.

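// user_service.proto
syntax = "proto3";

package user;  // becomes the prefix of the wire path: /user.UserService/GetUser

// Message definitions
message GetUserRequest {
  string user_id = 1;
}

// Empty for now; filters or paging fields could be added later
message ListUsersRequest {}

message User {
  string id    = 1;
  string name  = 2;
  string email = 3;
}

// Service definition
service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User);
}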

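The field numbers (1, 2, 3), not the field names, are what protobuf encodes on the wire. This is why renaming a field does not change the binary encoding (it does change the generated code and the JSON mapping, so it is not free), and why you must never reuse a field number.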
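
To make the "numbers, not names" point concrete, here is a minimal Go sketch that decodes a hand-built message using only the standard library. It assumes every field is a string (wire type 2), and the two schema maps (v1 with email, v2 with a hypothetical phone field reusing number 3) are invented for illustration; real code would use the generated protobuf types.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// decodeStrings walks a protobuf-encoded message and returns the payload of
// every length-delimited field (wire type 2), keyed by field number.
// Teaching sketch: any other wire type is rejected rather than skipped.
func decodeStrings(b []byte) (map[int]string, error) {
	fields := make(map[int]string)
	for len(b) > 0 {
		tag, n := binary.Uvarint(b) // tag = (field number << 3) | wire type
		if n <= 0 {
			return nil, errors.New("bad tag varint")
		}
		b = b[n:]
		if tag&7 != 2 {
			return nil, fmt.Errorf("unsupported wire type %d", tag&7)
		}
		size, n := binary.Uvarint(b)
		if n <= 0 || uint64(len(b)-n) < size {
			return nil, errors.New("truncated field")
		}
		fields[int(tag>>3)] = string(b[n : n+int(size)])
		b = b[n+int(size):]
	}
	return fields, nil
}

func main() {
	// User{id: "42", email: "a@b"} as written under the original schema:
	// field 1 = "42", field 3 = "a@b". No field names appear in the bytes.
	wire := []byte{0x0a, 0x02, '4', '2', 0x1a, 0x03, 'a', '@', 'b'}

	fields, err := decodeStrings(wire)
	if err != nil {
		panic(err)
	}

	// Hypothetical schemas: v2 removed "email" and reused number 3 for "phone".
	v1 := map[int]string{1: "id", 2: "name", 3: "email"}
	v2 := map[int]string{1: "id", 2: "name", 3: "phone"}
	for _, num := range []int{1, 3} {
		fmt.Printf("field %d = %q: v1 reads it as %s, v2 reads it as %s\n",
			num, fields[num], v1[num], v2[num])
	}
}
```

The same bytes decode without error under both schemas, which is exactly why a reused number corrupts data silently: old email values come back labelled as phone numbers.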

HTTP/2 — the transport advantage

gRPC runs over HTTP/2, not HTTP/1.1. The key difference: HTTP/2 supports multiplexing, so many concurrent request/response streams share one TCP connection and a slow response does not hold up the others at the HTTP layer. This removes HTTP/1.1's head-of-line blocking, although packet loss on the shared TCP connection can still stall every stream at once. Combined with binary framing and header compression (HPACK), this typically gives gRPC lower latency and higher throughput than HTTP/1.1 REST for high-frequency internal traffic.

The four streaming modes

RPC doesn't have to be a single call-and-response. gRPC's streaming modes map to different communication patterns:

Figure: the four gRPC streaming modes. Unary (1 req → 1 resp, e.g. fetch one user); server stream (1 req → N resp, e.g. live feed / logs); client stream (N req → 1 resp, e.g. file upload); bidirectional (N req ↔ M resp, e.g. chat / collab). Orange = client messages; teal = server messages. Unary is the default; streaming modes layer on top of HTTP/2's multiplexed streams.

| Mode | Pattern | Typical use-case |
|---|---|---|
| Unary | 1 request → 1 response | Fetch a single resource, create a record |
| Server streaming | 1 request → N responses | Live log tail, real-time price feed, large dataset download |
| Client streaming | N requests → 1 response | Chunked file upload, batch ingest |
| Bidirectional | N requests ↔ M responses | Chat, real-time collaboration, sensor telemetry |

Code generation — strong typing across languages

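The proto file is the single source of truth. Run protoc (with the plugin for your language) against it and you get fully typed client stubs and server interfaces in Go, Java, Python, TypeScript, C++, and more. If a field or method is renamed, removed, or changes type, client code built against the regenerated stubs stops compiling until it is updated. Adding a new field, by contrast, is backward-compatible: proto3 has no required fields, and older clients simply ignore fields they do not know. This catches whole classes of integration bugs at compile time rather than in production.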

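# Generate Go code from the proto file
# (needs the protoc-gen-go and protoc-gen-go-grpc plugins, and a go_package option in the .proto)
protoc --go_out=. --go-grpc_out=. user_service.proto

// Generated client usage (Go); pb is the generated package
// grpc.NewClient replaces the deprecated grpc.Dial in recent grpc-go releases
conn, err := grpc.NewClient("user-svc:50051", grpc.WithTransportCredentials(...))
if err != nil { log.Fatal(err) }
defer conn.Close()
client := pb.NewUserServiceClient(conn)
user, err := client.GetUser(ctx, &pb.GetUserRequest{UserId: "42"})
if err != nil { log.Fatal(err) }  // non-OK gRPC statuses surface here
fmt.Println(user.Name)  // field access is type-checked at compile time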

Browser limitations — where grpc-web fits in

Browsers cannot speak raw HTTP/2 framing directly to a gRPC server. This is a hard constraint: the browser's Fetch/XHR APIs don't expose the low-level HTTP/2 stream primitives that gRPC requires. The solution is grpc-web, a lightweight adaptation layer. The browser talks to an Envoy proxy (or similar) over HTTP/1.1 or HTTP/2 using the grpc-web wire format; the proxy translates to native gRPC on the back-end.

Browser → (grpc-web / HTTP/2) → Envoy proxy → (gRPC / HTTP/2) → service

This adds operational complexity (the proxy is another moving part). For this reason, many teams use REST or GraphQL at the browser edge and gRPC only for internal service-to-service calls.

🎯 Interview angle

"When would you use gRPC instead of REST?" Key points: (1) internal service mesh where you control both client and server; (2) high-throughput, low-latency calls where binary encoding and HTTP/2 multiplexing matter; (3) strong typing across multiple languages — generated stubs prevent contract drift. Counter-point: (1) browser clients are awkward without grpc-web; (2) binary messages aren't human-readable for debugging.

⚠️ Common trap — not human-debuggable

With REST you can curl any endpoint and read the JSON response. gRPC messages are binary protobuf — curl shows garbage. You need grpcurl or a gRPC-aware tool, and you need the .proto file available at debug time. Teams that deploy gRPC without reflection enabled (which exposes the schema) or without a toolchain investment often struggle to diagnose production issues. Plan your observability story before committing to gRPC.

✅ Do this, not that

Do enable server reflection in development so grpcurl can discover your services without a separate proto file. Don't reuse field numbers in your proto messages when removing fields — old serialised data with that number will be silently misinterpreted as the new field type, causing subtle corruption bugs. Mark removed fields as reserved instead.

Under the hood: how a gRPC call travels on the wire

A gRPC call is not magic — it is a precisely structured sequence of HTTP/2 frames carrying a Protobuf-encoded message, with the gRPC status carried back in HTTP/2 trailers rather than headers. Here is what actually happens, byte by byte.

Step 1 — Protobuf encoding

The request message is serialised into Protobuf's binary tag-length-value (TLV) format. Each field starts with a tag: a varint that combines its field number (from the .proto file) with a wire type. Length-delimited fields such as strings then carry a varint byte length followed by the raw bytes. A string field (user_id = "42") with field number 1 encodes like this:

// GetUserRequest { user_id = "42" }
// Field 1, wire type 2 (length-delimited) → tag byte = (1 << 3) | 2 = 0x0a
// Value: 0x0a 0x02 0x34 0x32
//         ^tag ^len '4'  '2'
encoded bytes (hex): 0a 02 34 32   (4 bytes total)
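
The same four bytes can be reproduced by hand. The sketch below uses only Go's standard library; encodeStringField is a hypothetical helper written for this lesson (generated protobuf code does this for you) and it handles string fields only.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeStringField encodes one protobuf string field by hand:
// a varint tag, a varint byte length, then the raw UTF-8 bytes.
func encodeStringField(fieldNum int, value string) []byte {
	const wireTypeLen = 2 // wire type 2 = length-delimited
	out := binary.AppendUvarint(nil, uint64(fieldNum)<<3|wireTypeLen)
	out = binary.AppendUvarint(out, uint64(len(value)))
	return append(out, value...)
}

func main() {
	msg := encodeStringField(1, "42") // GetUserRequest{user_id: "42"}
	fmt.Printf("% x\n", msg)          // prints: 0a 02 34 32
	fmt.Println(len(msg), "bytes of protobuf vs", len(`{"user_id":"42"}`), "bytes of JSON")
}
```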

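Contrast with JSON {"user_id":"42"} at 16 bytes, four times larger. The saving depends on the data: messages made of many short fields and numbers shrink the most, because JSON repeats every field name, while long strings barely shrink at all. Figures of roughly 3–10× smaller than JSON are often quoted, but treat them as a range to measure against your own payloads rather than a guarantee.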

Step 2 — gRPC message framing (the 5-byte prefix)

Before the Protobuf bytes are handed to HTTP/2, gRPC wraps them in a 5-byte length-prefix frame:

// gRPC message frame layout
Byte 0:     Compression flag (0 = not compressed, 1 = compressed)
Bytes 1–4:  Big-endian uint32 — length of the Protobuf message that follows
Bytes 5…N:  The Protobuf-encoded message bytes

// For GetUserRequest { user_id="42" } (4 bytes of protobuf):
00  00 00 00 04  0a 02 34 32
^   ^-------^   ^---------^
flag  length=4   protobuf payload

This framing allows multiple messages to be sent on the same HTTP/2 stream (as in streaming RPCs) — the receiver reads 5 bytes, parses the length, reads exactly that many bytes as one message, then reads the next 5-byte header, and so on. There is no delimiter character to scan for, just a length to count.
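
The read loop described above can be sketched in a few lines of Go. The function names (frame, readMessages, decodeAll) are hypothetical, and the sketch leaves out what a real implementation needs: a maximum message size check and decompression when the flag byte is 1.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// frame prepends the 5-byte gRPC prefix: 1 flag byte + 4-byte big-endian length.
func frame(payload []byte, compressed bool) []byte {
	out := make([]byte, 5, 5+len(payload))
	if compressed {
		out[0] = 1
	}
	binary.BigEndian.PutUint32(out[1:5], uint32(len(payload)))
	return append(out, payload...)
}

// readMessages reads length-prefixed messages until the stream ends cleanly.
// Sketch only: production code must cap the length before allocating.
func readMessages(r io.Reader) ([][]byte, error) {
	var msgs [][]byte
	var prefix [5]byte
	for {
		if _, err := io.ReadFull(r, prefix[:]); err == io.EOF {
			return msgs, nil // clean end: no bytes of a new prefix were read
		} else if err != nil {
			return nil, err // stream ended inside a prefix
		}
		msg := make([]byte, binary.BigEndian.Uint32(prefix[1:]))
		if _, err := io.ReadFull(r, msg); err != nil {
			return nil, err // stream ended inside a message
		}
		msgs = append(msgs, msg)
	}
}

// decodeAll is a convenience wrapper for in-memory buffers.
func decodeAll(stream []byte) ([][]byte, error) {
	return readMessages(bytes.NewReader(stream))
}

func main() {
	first := frame([]byte{0x0a, 0x02, '4', '2'}, false)
	fmt.Printf("% x\n", first) // prints: 00 00 00 00 04 0a 02 34 32

	// Two messages back to back, as on a streaming RPC.
	stream := append(first, frame([]byte{0x0a, 0x01, '7'}, false)...)
	msgs, err := decodeAll(stream)
	fmt.Println(len(msgs), err) // prints: 2 <nil>
}
```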

Step 3 — HTTP/2 HEADERS frame

The gRPC client opens an HTTP/2 stream and sends a HEADERS frame with these pseudo-headers and custom headers:

// HTTP/2 HEADERS frame for a unary gRPC call
:method  = POST
:scheme  = https
:path    = /user.UserService/GetUser    ← always /{package}.{Service}/{Method}
:authority = user-svc:50051
content-type   = application/grpc      ← must be present; +proto or +json variant
te             = trailers               ← mandatory; tells HTTP/2 proxies to preserve trailers
grpc-timeout   = 500m                  ← optional; 500 milliseconds deadline
authorization  = Bearer <token>        ← standard auth metadata
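
Two of these headers have a small grammar of their own: the :path is assembled from the proto package, service and method names, and grpc-timeout is an integer of up to eight digits followed by a one-letter unit (H, M, S, m, u, n). The Go sketch below shows both; methodPath and parseTimeout are hypothetical helpers, since gRPC libraries do this internally.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"time"
)

// methodPath builds the :path pseudo-header for an RPC.
// A proto file without a package yields /Service/Method.
func methodPath(pkg, service, method string) string {
	if pkg == "" {
		return "/" + service + "/" + method
	}
	return "/" + pkg + "." + service + "/" + method
}

// timeoutUnits maps the one-letter grpc-timeout suffix to a duration.
var timeoutUnits = map[byte]time.Duration{
	'H': time.Hour, 'M': time.Minute, 'S': time.Second,
	'm': time.Millisecond, 'u': time.Microsecond, 'n': time.Nanosecond,
}

// parseTimeout decodes a grpc-timeout value such as "500m" (500 milliseconds).
func parseTimeout(v string) (time.Duration, error) {
	if len(v) < 2 || len(v) > 9 { // 1 to 8 digits plus one unit letter
		return 0, fmt.Errorf("grpc-timeout %q: bad length", v)
	}
	unit, ok := timeoutUnits[v[len(v)-1]]
	if !ok {
		return 0, fmt.Errorf("grpc-timeout %q: unknown unit", v)
	}
	n, err := strconv.ParseUint(v[:len(v)-1], 10, 32)
	if err != nil {
		return 0, fmt.Errorf("grpc-timeout %q: %w", v, err)
	}
	if limit := uint64(math.MaxInt64 / int64(unit)); n > limit {
		n = limit // clamp instead of overflowing time.Duration
	}
	return time.Duration(n) * unit, nil
}

func main() {
	fmt.Println(methodPath("user", "UserService", "GetUser")) // prints: /user.UserService/GetUser
	d, err := parseTimeout("500m")
	fmt.Println(d, err) // prints: 500ms <nil>
}
```

Note the case sensitivity: "500m" is 500 milliseconds, while "500M" would be 500 minutes.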

Step 4 — HTTP/2 DATA frame

Immediately after (or in the same write), the client sends a DATA frame carrying the 5-byte-prefixed Protobuf message. For a unary call the DATA frame has the END_STREAM flag set — no more data from the client on this stream:

// HTTP/2 DATA frame carrying the gRPC message
Stream ID:   3  (odd = client-initiated; each new call gets the next odd number)
Flags:       END_STREAM=1 (unary: client is done after one message)
Payload:     00 00 00 00 04 0a 02 34 32   (framing + protobuf)

Step 5 — server response HEADERS + DATA

The server sends its response HEADERS frame with HTTP status 200 (a gRPC server answers 200 even when the RPC itself fails; the real status is in the trailers), followed by a DATA frame with the response message:

// Server response HEADERS frame
:status         = 200
content-type    = application/grpc+proto

// Server response DATA frame — the User message serialised
00 <4-byte big-endian length> <protobuf bytes for User{id="42",name="Ada",email="..."}>

Step 6 — gRPC status in HTTP/2 TRAILERS

After the DATA frame, the server sends an HTTP/2 HEADERS frame with the END_STREAM flag — this is the trailers frame. It carries the definitive gRPC outcome:

// Trailers frame (HEADERS with END_STREAM set)
grpc-status   = 0       ← 0 = OK; non-zero = error
grpc-message  =         ← empty on success; URL-percent-encoded error description on failure

// Example error trailer — user not found
grpc-status   = 5       ← NOT_FOUND
grpc-message  = user%2042%20does%20not%20exist

This trailer-based status is why you cannot use a plain HTTP proxy for gRPC: the proxy must support and forward HTTP/2 trailers. It is also why curl --http2 alone won't work for gRPC — the 5-byte framing and trailer convention sit on top of HTTP/2, not inside it.
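
On the client side, the trailers have to be turned back into a success or an error. The Go sketch below shows one way to do it with the standard library; rpcError and errorFromTrailers are hypothetical names, and mapping a missing grpc-status to code 2 (UNKNOWN) is a simplifying assumption of this sketch rather than the full rule set a real gRPC client applies.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// rpcError is a hypothetical error type for this sketch, not a gRPC library type.
type rpcError struct {
	Code    int
	Message string
}

func (e *rpcError) Error() string {
	return fmt.Sprintf("rpc failed: code=%d message=%q", e.Code, e.Message)
}

// errorFromTrailers turns the trailer fields into a Go error.
// It returns nil only when grpc-status is present and equal to 0.
func errorFromTrailers(trailers map[string]string) error {
	raw, ok := trailers["grpc-status"]
	if !ok {
		return &rpcError{Code: 2, Message: "missing grpc-status trailer"} // assumption: treat as UNKNOWN
	}
	code, err := strconv.Atoi(raw)
	if err != nil {
		return &rpcError{Code: 2, Message: "malformed grpc-status " + raw}
	}
	if code == 0 {
		return nil
	}
	// grpc-message is percent-encoded; fall back to the raw text if decoding fails.
	msg, err := url.PathUnescape(trailers["grpc-message"])
	if err != nil {
		msg = trailers["grpc-message"]
	}
	return &rpcError{Code: code, Message: msg}
}

func main() {
	fmt.Println(errorFromTrailers(map[string]string{"grpc-status": "0"})) // prints: <nil>
	fmt.Println(errorFromTrailers(map[string]string{
		"grpc-status":  "5",
		"grpc-message": "user%2042%20does%20not%20exist",
	})) // prints: rpc failed: code=5 message="user 42 does not exist"
}
```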

The four call types — what changes on the wire

Figure: wire-level HTTP/2 frame sequence for each of the four gRPC call types (C = client, S = server; orange = client→server, teal = server→client; ES = END_STREAM flag; TRAILERS carries grpc-status).
Unary: C sends HEADERS, DATA+ES; S sends HEADERS 200, DATA, TRAILERS+ES.
Server stream: C sends HEADERS+DATA+ES; S sends HEADERS 200, DATA msg₁, DATA msg₂, DATA msg₃, TRAILERS+ES.
Client stream: C sends HEADERS, DATA chunk₁, DATA chunk₂, DATA+ES (last); S sends HEADERS 200+DATA, TRAILERS+ES.
Bidirectional: C sends HEADERS, DATA msg₁, DATA msg₂, DATA msg₃+ES; S interleaves HEADERS 200, DATA resp₁, DATA resp₂, TRAILERS+ES.
In every case the gRPC status is in the final TRAILERS frame, never in an HTTP 4xx/5xx response; the HTTP layer always returns 200.

The common gRPC status codes

| grpc-status | Name | Typical cause |
|---|---|---|
| 0 | OK | Success |
| 1 | CANCELLED | The caller cancelled the RPC (explicit cancel, or the calling context went away) |
| 2 | UNKNOWN | Unrecognised error on the server, often a panic or unmapped exception |
| 3 | INVALID_ARGUMENT | Client sent a malformed or out-of-range value |
| 4 | DEADLINE_EXCEEDED | Timeout fired before the RPC completed |
| 5 | NOT_FOUND | Requested entity does not exist |
| 7 | PERMISSION_DENIED | Authenticated user lacks the required permission |
| 8 | RESOURCE_EXHAUSTED | Quota or rate limit exceeded |
| 14 | UNAVAILABLE | Server not ready; safe to retry with backoff |
| 16 | UNAUTHENTICATED | No or invalid credentials; check headers/metadata |
Why a browser needs grpc-web — the framing constraint

Browsers expose HTTP as a high-level fetch/XHR API and deliberately hide the HTTP/2 framing layer. JavaScript cannot set HTTP/2 trailer frames, cannot read them, and cannot control the stream-level END_STREAM flag. gRPC's entire status model depends on trailers. The grpc-web protocol works around this by moving the trailers into the body of a regular HTTP response, as a final frame that uses the same 5-byte prefix but with the high bit of the first (flag) byte set to 1. A proxy such as Envoy in front of the service unpacks the grpc-web frames into real gRPC over HTTP/2 toward the upstream service.
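
A minimal Go sketch of how a receiver could split such a body into messages and trailers is shown below. parseWebBody is a hypothetical helper; it assumes the binary grpc-web format (not the base64 text variant), uncompressed frames, and trailers written as "name: value" lines separated by CRLF.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

// parseWebBody splits a grpc-web response body into data messages and
// trailer fields. A frame whose first byte has the high bit (0x80) set
// carries trailers as "name: value" lines; any other frame is a message.
func parseWebBody(body []byte) (msgs [][]byte, trailers map[string]string, err error) {
	trailers = make(map[string]string)
	for len(body) > 0 {
		if len(body) < 5 {
			return nil, nil, errors.New("truncated frame prefix")
		}
		flag, size := body[0], binary.BigEndian.Uint32(body[1:5])
		if uint64(len(body)-5) < uint64(size) {
			return nil, nil, errors.New("truncated frame payload")
		}
		payload := body[5 : 5+int(size)]
		body = body[5+int(size):]

		if flag&0x80 == 0 {
			msgs = append(msgs, payload) // ordinary data frame
			continue
		}
		for _, line := range strings.Split(string(payload), "\r\n") {
			if name, value, ok := strings.Cut(line, ":"); ok {
				trailers[strings.ToLower(strings.TrimSpace(name))] = strings.TrimSpace(value)
			}
		}
	}
	return msgs, trailers, nil
}

func main() {
	body := []byte{0x00, 0, 0, 0, 4, 0x0a, 0x02, '4', '2'} // one data frame
	trailer := "grpc-status: 0\r\n"
	body = append(body, 0x80, 0, 0, 0, byte(len(trailer))) // trailer frame prefix
	body = append(body, trailer...)

	msgs, trailers, err := parseWebBody(body)
	fmt.Println(len(msgs), trailers["grpc-status"], err) // prints: 1 0 <nil>
}
```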

How to debug & inspect it

gRPC's binary encoding means you cannot casually curl it — but with the right tools you can inspect calls as easily as REST.

grpcurl — the curl for gRPC

# List all services (requires server reflection to be enabled)
$ grpcurl -plaintext user-svc:50051 list
user.UserService
grpc.reflection.v1alpha.ServerReflection

# Describe a service method
$ grpcurl -plaintext user-svc:50051 describe user.UserService.GetUser
user.UserService.GetUser is a method:
rpc GetUser ( .user.GetUserRequest ) returns ( .user.User );

# Make a unary call: JSON in, JSON out (grpcurl transcodes)
$ grpcurl -plaintext -d '{"user_id":"42"}' user-svc:50051 user.UserService/GetUser
{
  "id": "42",
  "name": "Ada Lovelace",
  "email": "ada@example.com"
}

# With TLS and an auth header (metadata)
$ grpcurl -H "authorization: Bearer <token>" \
    -d '{"user_id":"42"}' user-svc:50051 user.UserService/GetUser

# Against a server without reflection: supply the proto file directly
$ grpcurl -proto user_service.proto -plaintext \
    -d '{"user_id":"42"}' user-svc:50051 user.UserService/GetUser

# Stream all orders for customer 7 (server streaming)
$ grpcurl -plaintext -d '{"customer_id":"7"}' \
    order-svc:50052 order.OrderService/ListOrders
{"id":"ord-1","status":"shipped"}
{"id":"ord-2","status":"pending"}
{"id":"ord-3","status":"delivered"}

Reading the raw frames with Wireshark / tcpdump

# Capture on loopback (plaintext gRPC for dev); port 50051
# lo0 is the loopback interface name on macOS; on Linux use lo
$ tcpdump -i lo0 -w grpc-capture.pcap port 50051

# In Wireshark: filter "grpc"; it auto-decodes gRPC frames if you
# supply the .proto file under Preferences → Protocols → Protobuf
# Look for: DATA frames with 5-byte prefix, HEADERS frames, HEADERS with END_STREAM (trailers)

Symptom → cause → fix

| Symptom | Likely cause | Fix |
|---|---|---|
| grpc-status: 12 UNIMPLEMENTED | Method name in the URL doesn't match the proto (case-sensitive, must be /package.Service/Method) | Check exact package + service + method name from the .proto; regenerate stubs |
| grpc-status: 16 UNAUTHENTICATED | Missing or invalid authorization metadata; or server expects TLS and client sent plaintext | Add -H "authorization: Bearer ..." to grpcurl; confirm TLS vs plaintext flags match |
| grpc-status: 14 UNAVAILABLE on first call then OK | Connection pool not yet warmed; or load balancer health check not passed | Implement a keepalive ping or wait for the LB health; retry with exponential backoff |
| grpcurl shows garbage / "frame too large" | Connecting to a non-gRPC port (e.g. REST on 8080) or a grpc-web endpoint not a native gRPC one | Confirm the port; use the correct -plaintext or TLS flag |
| Browser gets 502 / connection reset on grpc-web calls | Envoy proxy not configured with grpc_web filter; missing te: trailers header passthrough | Add the Envoy grpc_web HTTP filter; ensure te header is not stripped |
| Silent data corruption: wrong field values | A field number was reused after removing an old field | Mark removed fields as reserved in the .proto; regenerate all stubs |
| grpc-status: 4 DEADLINE_EXCEEDED | Server processing time exceeded the client-set deadline in grpc-timeout | Tune deadline on the call; check server latency with the grpc_server_handling_seconds histogram metric |

Debug checklist:

  1. Confirm server reflection is enabled (grpcurl list should return your service name).
  2. Use grpcurl describe to verify the method signature matches your client code.
  3. Check the grpc-status in the trailing metadata — it is always there, not in HTTP status.
  4. Look at the grpc-message trailer — it often contains a human-readable error description.
  5. For browser-to-service failures, confirm the Envoy proxy has the grpc_web filter and is not stripping te: trailers.
  6. Enable gRPC access logging on the server to see per-method latency, error rate, and message sizes.

⚠️ gRPC always returns HTTP 200 — do not look there for errors

This trips up monitoring setups and proxy configurations. A gRPC call that fails returns HTTP 200 with a non-zero grpc-status trailer. Load balancers and APM tools that count HTTP 5xx errors as failures will report 0% error rate on a gRPC service that is 100% returning NOT_FOUND or UNAUTHENTICATED. You must instrument at the gRPC layer — look at grpc-status != 0 in your metrics and trace spans, not at HTTP status codes.
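
The gap between the two views is easy to demonstrate. In the Go sketch below, callRecord and errorRates are hypothetical names standing in for whatever your metrics pipeline records per call; the sample data is invented.

```go
package main

import "fmt"

// callRecord is a hypothetical log entry for one finished RPC.
type callRecord struct {
	HTTPStatus int // what an HTTP-only monitor sees
	GRPCStatus int // what the grpc-status trailer said
}

// errorRates returns the failure ratio as seen at the HTTP layer
// (status >= 500) and at the gRPC layer (grpc-status != 0).
func errorRates(calls []callRecord) (httpRate, grpcRate float64) {
	if len(calls) == 0 {
		return 0, 0
	}
	var httpErrs, grpcErrs int
	for _, c := range calls {
		if c.HTTPStatus >= 500 {
			httpErrs++
		}
		if c.GRPCStatus != 0 {
			grpcErrs++
		}
	}
	n := float64(len(calls))
	return float64(httpErrs) / n, float64(grpcErrs) / n
}

func main() {
	calls := []callRecord{
		{HTTPStatus: 200, GRPCStatus: 0},  // OK
		{HTTPStatus: 200, GRPCStatus: 5},  // NOT_FOUND
		{HTTPStatus: 200, GRPCStatus: 16}, // UNAUTHENTICATED
		{HTTPStatus: 200, GRPCStatus: 14}, // UNAVAILABLE
	}
	httpRate, grpcRate := errorRates(calls)
	// prints: HTTP view: 0% errors, gRPC view: 75% errors
	fmt.Printf("HTTP view: %.0f%% errors, gRPC view: %.0f%% errors\n", httpRate*100, grpcRate*100)
}
```

Whether codes such as NOT_FOUND should count as failures for alerting is a policy choice; the point is that the decision has to be made on grpc-status, since the HTTP status carries no signal.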

🧠 Quick check

1. What is the primary role of a .proto file in gRPC?

The .proto file is the contract: it defines message types and service methods in the Protocol Buffers interface definition language (IDL). The protoc compiler turns it into type-safe stubs for both client and server in any supported language.

2. Which gRPC streaming mode best fits "upload a large video file in chunks, receive a single confirmation when done"?

Client streaming: the client sends a stream of messages (chunks) and the server replies with one response at the end (confirmation). This is exactly the upload-then-confirm pattern.

3. Why can't a standard browser use gRPC directly without grpc-web?

Browsers expose high-level HTTP APIs that abstract away framing. gRPC needs to control HTTP/2 trailers and stream frames directly — capabilities the browser sandbox doesn't expose. grpc-web is a compatible adaptation layer that proxies the translation.

✍️ Exercise: write a proto service for an order system (try before opening)

Define a proto service for an order management system. It needs: (a) fetch a single order by id; (b) stream all orders for a given customer (the list may be large); (c) create an order. Include message types for each.

Model answer:

syntax = "proto3";

package order;  // matches the order.OrderService name used with grpcurl

message Order {
  string id          = 1;
  string customer_id = 2;
  double total       = 3;
  string status      = 4;
}

message GetOrderRequest       { string order_id    = 1; }
message ListOrdersRequest     { string customer_id = 1; }
message CreateOrderRequest    { string customer_id = 1; double total = 2; }

service OrderService {
  rpc GetOrder    (GetOrderRequest)    returns (Order);
  rpc ListOrders  (ListOrdersRequest)  returns (stream Order);
  rpc CreateOrder (CreateOrderRequest) returns (Order);
}

Rubric: ✓ unary for GetOrder ✓ server streaming for ListOrders (stream keyword on return) ✓ unary for CreateOrder ✓ messages have field numbers starting at 1 ✓ no verb in service name (noun "OrderService").

Key takeaways

Sources & further reading