Architectural Styles · Lesson 05
The gRPC framework
gRPC treats a remote call like a local function call — but compiles the contract into type-safe client and server code, then sends everything as compact binary over HTTP/2. The result is a style purpose-built for internal service meshes that need speed and strict contracts.
By the end you'll be able to
- Explain what RPC means and how gRPC builds on the concept with Protocol Buffers and HTTP/2.
- Name and describe the four gRPC streaming modes.
- State gRPC's browser limitations and when grpc-web is the solution.
RPC — the original idea
Remote Procedure Call (RPC) is the idea of calling a function on a different machine as if it were local. Your code writes payments.charge(amount, card); the network layer transparently serialises the arguments, sends them to the payments service, executes the function there, and returns the result. The network is meant to be invisible.
The concept dates to the 1970s. Earlier RPC stacks such as CORBA and SOAP were notoriously heavyweight, and XML-RPC, while simpler, was verbose and loosely typed. gRPC, open-sourced by Google in 2015 with its 1.0 release in 2016, took the core idea and rebuilt it with modern primitives: Protocol Buffers for the contract, HTTP/2 for the transport, and code generation for every popular language.
Protocol Buffers — the contract language
Everything in gRPC starts with a .proto file. It defines messages (typed data structures, like database rows with field numbers) and services (groups of RPC methods). The proto compiler (protoc) reads the file and generates client stubs and server skeletons in your language of choice.
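// user_service.proto
syntax = "proto3";
// The package becomes part of the request path: /user.UserService/GetUser
package user;
// Placeholder Go import path; protoc-gen-go needs one to generate Go code
option go_package = "example.com/demo/userpb";
// Message definitions
message GetUserRequest {
  string user_id = 1;
}
// Empty for now; filters or paging fields can be added later
message ListUsersRequest {}
message User {
  string id = 1;
  string name = 2;
  string email = 3;
}
// Service definition
service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User);
}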
The field numbers (1, 2, 3), not the field names, identify each field on the wire. This is why renaming a field does not change the binary encoding (it does change the generated accessor names and the JSON mapping), and why you must never reuse a field number.
HTTP/2 — the transport advantage
gRPC runs over HTTP/2, not HTTP/1.1. The key difference: HTTP/2 supports multiplexing, so many concurrent request/response streams share one TCP connection without queuing behind each other at the HTTP layer. This removes HTTP/1.1's request-level head-of-line blocking, although a lost TCP packet can still stall every stream on that connection. Combined with binary framing and header compression (HPACK), this typically gives gRPC lower latency and higher throughput than HTTP/1.1 REST for high-frequency internal traffic.
The four streaming modes
RPC doesn't have to be a single call-and-response. gRPC's streaming modes map to different communication patterns:
| Mode | Pattern | Typical use-case |
|---|---|---|
| Unary | 1 request → 1 response | Fetch a single resource, create a record |
| Server streaming | 1 request → N responses | Live log tail, real-time price feed, large dataset download |
| Client streaming | N requests → 1 response | Chunked file upload, batch ingest |
| Bidirectional | N requests ↔ M responses | Chat, real-time collaboration, sensor telemetry |
Code generation — strong typing across languages
The proto file is the single source of truth. Run protoc against it and you get fully typed client stubs and server interfaces in Go, Java, Python, TypeScript, C++, and more. If a field or method is renamed, removed, or changes type, code regenerated from the new proto stops compiling in statically typed languages until every caller is updated. (Adding a field is backward compatible: proto3 has no required fields, and old clients simply ignore fields they do not know.) This catches whole classes of integration bugs at compile time rather than in production.
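# Generate Go code from the proto file
protoc --go_out=. --go-grpc_out=. user_service.proto
// Generated client usage (Go). Credentials stay elided; errors are checked instead of discarded.
conn, err := grpc.Dial("user-svc:50051", grpc.WithTransportCredentials(...))
if err != nil {
    log.Fatalf("dial: %v", err)
}
defer conn.Close()
client := pb.NewUserServiceClient(conn)
user, err := client.GetUser(ctx, &pb.GetUserRequest{UserId: "42"})
if err != nil {
    log.Fatalf("GetUser: %v", err)
}
fmt.Println(user.Name) // field access is type-checked at compile time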
Browser limitations — where grpc-web fits in
Browsers cannot speak native gRPC directly to a gRPC server. This is a hard constraint: the browser's Fetch/XHR APIs don't expose the low-level HTTP/2 stream primitives (notably trailers) that gRPC requires. The solution is grpc-web, a lightweight adaptation layer. The browser talks to an Envoy proxy (or similar) over HTTP/1.1 or HTTP/2 using the grpc-web wire format; the proxy translates to native gRPC on the back end.
This adds operational complexity (the proxy is another moving part). For this reason, many teams use REST or GraphQL at the browser edge and gRPC only for internal service-to-service calls.
"When would you use gRPC instead of REST?" Key points: (1) internal service mesh where you control both client and server; (2) high-throughput, low-latency calls where binary encoding and HTTP/2 multiplexing matter; (3) strong typing across multiple languages — generated stubs prevent contract drift. Counter-point: (1) browser clients are awkward without grpc-web; (2) binary messages aren't human-readable for debugging.
With REST you can curl any endpoint and read the JSON response. gRPC messages are binary protobuf — curl shows garbage. You need grpcurl or a gRPC-aware tool, and you need the .proto file available at debug time. Teams that deploy gRPC without reflection enabled (which exposes the schema) or without a toolchain investment often struggle to diagnose production issues. Plan your observability story before committing to gRPC.
Do enable server reflection in development so grpcurl can discover your services without a separate proto file. Don't reuse field numbers in your proto messages when removing fields — old serialised data with that number will be silently misinterpreted as the new field type, causing subtle corruption bugs. Mark removed fields as reserved instead.
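To make the field-number warning concrete, here is a minimal sketch of why reuse corrupts data silently. It assumes a stored message written when field 3 meant email, and a hypothetical later schema that reuses number 3 for phone; decodeStringFields is an illustrative helper written for this lesson, not part of any protobuf library, and it handles only length-delimited fields.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeStringFields walks a protobuf-encoded buffer and returns the payload of
// every length-delimited field (wire type 2), keyed by field number.
// Teaching sketch: other wire types are rejected rather than skipped.
func decodeStringFields(buf []byte) (map[int]string, error) {
	fields := make(map[int]string)
	for len(buf) > 0 {
		tag, n := binary.Uvarint(buf)
		if n <= 0 {
			return nil, fmt.Errorf("bad tag varint")
		}
		buf = buf[n:]
		fieldNum, wireType := int(tag>>3), int(tag&7)
		if wireType != 2 {
			return nil, fmt.Errorf("field %d: unsupported wire type %d", fieldNum, wireType)
		}
		size, n := binary.Uvarint(buf)
		if n <= 0 || uint64(len(buf)-n) < size {
			return nil, fmt.Errorf("field %d: truncated value", fieldNum)
		}
		fields[fieldNum] = string(buf[n : n+int(size)])
		buf = buf[n+int(size):]
	}
	return fields, nil
}

func main() {
	// Bytes written while field 3 still meant "email" (tag 0x1a = field 3, wire type 2).
	stored := []byte("\x1a\x0aada@ex.org")
	fields, err := decodeStringFields(stored)
	if err != nil {
		panic(err)
	}
	oldSchema := map[int]string{3: "email"}
	newSchema := map[int]string{3: "phone"} // hypothetical reuse of number 3
	fmt.Printf("old reader: %s = %q\n", oldSchema[3], fields[3])
	fmt.Printf("new reader: %s = %q\n", newSchema[3], fields[3])
}
```

Nothing in the bytes says "email", so the new reader happily reports an email address as a phone number. Reserving the number makes protoc reject any later attempt to reuse it.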
Under the hood: how a gRPC call travels on the wire
A gRPC call is not magic — it is a precisely structured sequence of HTTP/2 frames carrying a Protobuf-encoded message, with the gRPC status carried back in HTTP/2 trailers rather than headers. Here is what actually happens, byte by byte.
Step 1 — Protobuf encoding
The request message is serialised into Protobuf's binary wire format, a sequence of tag-value records. Each field starts with a tag, a varint that combines its field number (from the .proto file) with a wire type, followed by the value; length-delimited values such as strings carry a varint length first. A string field (user_id = "42") with field number 1 encodes like this:
// GetUserRequest { user_id = "42" }
// Field 1, wire type 2 (length-delimited) → tag byte = (1 << 3) | 2 = 0x0a
// Value: 0x0a 0x02 0x34 0x32
//        ^tag ^len '4'  '2'
encoded bytes (hex): 0a 02 34 32 (4 bytes total)
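Contrast with the JSON {"user_id":"42"} at 16 bytes. The gap is widest for messages with many numeric or repeated fields, because field names are never sent and integers are packed as varints. Payloads several times smaller than the equivalent JSON are common, though the exact ratio depends on the data (string-heavy messages shrink far less).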
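The encoding above can be reproduced in a few lines. This is a minimal sketch for a single string field using only Go's standard library; appendStringField is an illustrative name, and real code would use the generated types and the protobuf runtime instead.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendStringField appends one protobuf string field to buf:
// a tag varint ((fieldNum << 3) | wire type 2), a length varint, then the raw bytes.
func appendStringField(buf []byte, fieldNum int, value string) []byte {
	const wireTypeLen = 2 // length-delimited
	buf = binary.AppendUvarint(buf, uint64(fieldNum)<<3|wireTypeLen)
	buf = binary.AppendUvarint(buf, uint64(len(value)))
	return append(buf, value...)
}

func main() {
	msg := appendStringField(nil, 1, "42") // GetUserRequest{user_id: "42"}
	fmt.Printf("% x (%d bytes)\n", msg, len(msg))
	fmt.Println(len(`{"user_id":"42"}`), "bytes as JSON")
}
```

Running it prints 0a 02 34 32 (4 bytes) followed by 16 bytes as JSON, matching the hand-worked example.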
Step 2 — gRPC message framing (the 5-byte prefix)
Before the Protobuf bytes are handed to HTTP/2, gRPC wraps them in a 5-byte length-prefix frame:
// gRPC message frame layout
Byte 0:     Compression flag (0 = not compressed, 1 = compressed)
Bytes 1–4:  Big-endian uint32: length of the Protobuf message that follows
Bytes 5…N:  The Protobuf-encoded message bytes
// For GetUserRequest { user_id="42" } (4 bytes of protobuf):
00 00 00 00 04 0a 02 34 32
^  ^---------^ ^---------^
flag length=4  protobuf payload
This framing allows multiple messages to be sent on the same HTTP/2 stream (as in streaming RPCs) — the receiver reads 5 bytes, parses the length, reads exactly that many bytes as one message, then reads the next 5-byte header, and so on. There is no delimiter character to scan for, just a length to count.
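The read loop just described can be sketched as follows. frameMessage and readMessages are illustrative names, the sketch uses only the standard library, and it assumes uncompressed messages; a real gRPC library also enforces a maximum message size before allocating.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// frameMessage prepends the 5-byte gRPC prefix (flag 0 = uncompressed,
// then a big-endian uint32 length) to an already-encoded message.
func frameMessage(msg []byte) []byte {
	out := make([]byte, 5, 5+len(msg))
	binary.BigEndian.PutUint32(out[1:5], uint32(len(msg)))
	return append(out, msg...)
}

// readMessages reads length-prefixed messages until the stream ends cleanly.
func readMessages(r io.Reader) ([][]byte, error) {
	var msgs [][]byte
	for {
		var prefix [5]byte
		if _, err := io.ReadFull(r, prefix[:]); err == io.EOF {
			return msgs, nil // clean end between messages
		} else if err != nil {
			return nil, err // includes a prefix cut off part-way
		}
		if prefix[0] != 0 {
			return nil, fmt.Errorf("compressed messages are not handled in this sketch")
		}
		msg := make([]byte, binary.BigEndian.Uint32(prefix[1:]))
		if _, err := io.ReadFull(r, msg); err != nil {
			return nil, err
		}
		msgs = append(msgs, msg)
	}
}

func main() {
	one := frameMessage([]byte{0x0a, 0x02, 0x34, 0x32})
	fmt.Printf("% x\n", one)

	// Two messages back to back, as on a streaming RPC.
	stream := append(append([]byte{}, one...), frameMessage([]byte{0x0a, 0x01, 0x37})...)
	msgs, err := readMessages(bytes.NewReader(stream))
	fmt.Println(len(msgs), err)
}
```

The first line of output is 00 00 00 00 04 0a 02 34 32, the same nine bytes shown in the layout above.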
Step 3 — HTTP/2 HEADERS frame
The gRPC client opens an HTTP/2 stream and sends a HEADERS frame with these pseudo-headers and custom headers:
// HTTP/2 HEADERS frame for a unary gRPC call
:method = POST
:scheme = https
:path = /user.UserService/GetUser ← always /{package}.{Service}/{Method}
:authority = user-svc:50051
content-type = application/grpc ← must be present; +proto or +json variant
te = trailers ← mandatory; tells HTTP/2 proxies to preserve trailers
grpc-timeout = 500m ← optional; 500 milliseconds deadline
authorization = Bearer <token> ← standard auth metadata
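Two of these headers follow small, mechanical formats. The sketch below builds the :path value and renders a deadline as a grpc-timeout value (an integer of at most 8 digits plus one of the units H, M, S, m, u, n). The function names are illustrative, and choosing the coarsest unit that represents the duration exactly is an assumption made for readability; libraries are free to pick a different unit.

```go
package main

import (
	"fmt"
	"time"
)

// methodPath builds the :path pseudo-header: /{package}.{Service}/{Method}.
func methodPath(pkg, service, method string) string {
	return fmt.Sprintf("/%s.%s/%s", pkg, service, method)
}

// grpcTimeout renders a positive duration as a grpc-timeout header value.
func grpcTimeout(d time.Duration) string {
	units := []struct {
		size   time.Duration
		suffix string
	}{
		{time.Hour, "H"}, {time.Minute, "M"}, {time.Second, "S"},
		{time.Millisecond, "m"}, {time.Microsecond, "u"}, {time.Nanosecond, "n"},
	}
	const maxValue = 99999999 // the value may have at most 8 digits

	// Prefer the coarsest unit that represents d exactly.
	for _, u := range units {
		if d%u.size == 0 && d/u.size <= maxValue {
			return fmt.Sprintf("%d%s", int64(d/u.size), u.suffix)
		}
	}
	// Otherwise round up in the finest unit that still fits in 8 digits.
	for i := len(units) - 1; i >= 0; i-- {
		if v := (d + units[i].size - 1) / units[i].size; v <= maxValue {
			return fmt.Sprintf("%d%s", int64(v), units[i].suffix)
		}
	}
	return "99999999H" // longer than the format can express
}

func main() {
	fmt.Println(methodPath("user", "UserService", "GetUser"))
	fmt.Println(grpcTimeout(500 * time.Millisecond))
}
```

This prints /user.UserService/GetUser and 500m, the same values as in the HEADERS frame above.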
Step 4 — HTTP/2 DATA frame
Immediately after (or in the same write), the client sends a DATA frame carrying the 5-byte-prefixed Protobuf message. For a unary call the DATA frame has the END_STREAM flag set — no more data from the client on this stream:
// HTTP/2 DATA frame carrying the gRPC message
Stream ID: 3 (odd = client-initiated; each new call gets the next odd number)
Flags: END_STREAM=1 (unary: client is done after one message)
Payload: 00 00 00 00 04 0a 02 34 32 (framing + protobuf)
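For readers who want to see the actual bytes, here is a minimal sketch of how that DATA frame is laid out. Every HTTP/2 frame starts with a 9-byte header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by the 31-bit stream ID. dataFrame is an illustrative helper; in practice the HTTP/2 library writes frames for you, and the sketch assumes the payload fits within the peer's maximum frame size.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	frameTypeData = 0x0 // HTTP/2 DATA frame type
	flagEndStream = 0x1 // END_STREAM flag
)

// dataFrame serialises an HTTP/2 DATA frame: the 9-byte frame header
// followed by the payload.
func dataFrame(streamID uint32, endStream bool, payload []byte) []byte {
	var flags byte
	if endStream {
		flags |= flagEndStream
	}
	n := len(payload)
	header := []byte{byte(n >> 16), byte(n >> 8), byte(n), frameTypeData, flags, 0, 0, 0, 0}
	binary.BigEndian.PutUint32(header[5:], streamID&0x7fffffff) // top bit is reserved
	return append(header, payload...)
}

func main() {
	// The 5-byte gRPC prefix plus the 4-byte protobuf message from the earlier steps.
	grpcMsg := []byte{0x00, 0x00, 0x00, 0x00, 0x04, 0x0a, 0x02, 0x34, 0x32}
	fmt.Printf("% x\n", dataFrame(3, true, grpcMsg))
}
```

The output is 00 00 09 00 01 00 00 00 03 followed by the nine payload bytes: length 9, type DATA, END_STREAM set, stream 3.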
Step 5 — server response HEADERS + DATA
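The server sends its response HEADERS frame with HTTP status 200, followed by a DATA frame with the response message. Note that a gRPC server answers with HTTP 200 even when the call fails; the real outcome is in the trailers, and a non-200 status normally comes from a proxy or load balancer in between: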
// Server response HEADERS frame
:status = 200
content-type = application/grpc+proto
// Server response DATA frame — the User message serialised
00 <4-byte big-endian length> <protobuf bytes for User{id="42",name="Ada",email="..."}>
Step 6 — gRPC status in HTTP/2 TRAILERS
After the DATA frame, the server sends an HTTP/2 HEADERS frame with the END_STREAM flag — this is the trailers frame. It carries the definitive gRPC outcome:
// Trailers frame (HEADERS with END_STREAM set)
grpc-status = 0 ← 0 = OK; non-zero = error
grpc-message = ← empty on success; URL-percent-encoded error description on failure
// Example error trailer — user not found
grpc-status = 5 ← NOT_FOUND
grpc-message = user%2042%20does%20not%20exist
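This trailer-based status is why gRPC needs HTTP/2-aware infrastructure end to end: every proxy on the path must support and forward HTTP/2 trailers. It is also why curl --http2 alone won't work for gRPC: curl would still have to add the 5-byte message prefix, encode the Protobuf body, and read the status from the trailers, all of which are gRPC conventions layered on top of HTTP/2 rather than part of it.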
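Client libraries turn these two trailers into an error value for the caller. A minimal sketch of that step is below; rpcOutcome is an illustrative name, the trailers are assumed to arrive as an already-parsed map of lower-case names to values, and the code table covers only a few of the statuses.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// A subset of gRPC status codes, enough for this example.
var codeNames = map[int]string{
	0: "OK", 4: "DEADLINE_EXCEEDED", 5: "NOT_FOUND", 14: "UNAVAILABLE", 16: "UNAUTHENTICATED",
}

// rpcOutcome interprets the grpc-status and grpc-message trailers.
// It returns nil for status 0 and a descriptive error otherwise.
func rpcOutcome(trailers map[string]string) error {
	raw, ok := trailers["grpc-status"]
	if !ok {
		return fmt.Errorf("missing grpc-status trailer")
	}
	code, err := strconv.Atoi(raw)
	if err != nil {
		return fmt.Errorf("malformed grpc-status %q", raw)
	}
	if code == 0 {
		return nil
	}
	// grpc-message is percent-encoded; fall back to the raw text if decoding fails.
	msg, err := url.PathUnescape(trailers["grpc-message"])
	if err != nil {
		msg = trailers["grpc-message"]
	}
	name, known := codeNames[code]
	if !known {
		name = "code " + raw
	}
	return fmt.Errorf("%s: %s", name, msg)
}

func main() {
	err := rpcOutcome(map[string]string{
		"grpc-status":  "5",
		"grpc-message": "user%2042%20does%20not%20exist",
	})
	fmt.Println(err)
}
```

For the error trailer shown above this prints NOT_FOUND: user 42 does not exist.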
The four call types — what changes on the wire
The headers and trailers are the same for every call type. What changes is how many length-prefixed messages travel in each direction and when the client half-closes the stream with END_STREAM.
| Call type | Client sends | Server sends |
|---|---|---|
| Unary | HEADERS, 1 message, END_STREAM | HEADERS, 1 message, trailers |
| Server streaming | HEADERS, 1 message, END_STREAM | HEADERS, N messages, trailers |
| Client streaming | HEADERS, N messages, END_STREAM | HEADERS, 1 message, trailers |
| Bidirectional | HEADERS, N messages interleaved with responses, END_STREAM | HEADERS, M messages, trailers |
The common gRPC status codes
| grpc-status | Name | Typical cause |
|---|---|---|
| 0 | OK | Success |
| 1 | CANCELLED | The caller cancelled the RPC (explicit cancel, or the client went away) |
| 2 | UNKNOWN | Unrecognised error on the server, often a panic or unmapped exception |
| 3 | INVALID_ARGUMENT | Client sent a malformed or out-of-range value |
| 4 | DEADLINE_EXCEEDED | Deadline expired before the RPC completed |
| 5 | NOT_FOUND | Requested entity does not exist |
| 7 | PERMISSION_DENIED | Authenticated user lacks the required permission |
| 8 | RESOURCE_EXHAUSTED | Quota or rate limit exceeded |
| 12 | UNIMPLEMENTED | The method or service is not implemented or not registered on the server |
| 14 | UNAVAILABLE | Server unreachable or not ready; usually transient, so retry with backoff (blind retries are only safe for idempotent calls) |
| 16 | UNAUTHENTICATED | Missing or invalid credentials; check headers/metadata |
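The "retry with backoff" advice for UNAVAILABLE can be sketched as a small policy. Treating only status 14 as retryable is a deliberately conservative assumption, both function names are illustrative, and the schedule omits the random jitter that production retry policies normally add.

```go
package main

import (
	"fmt"
	"time"
)

// retryable reports whether a grpc-status code usually signals a transient failure.
// Only UNAVAILABLE (14) is treated as retryable in this sketch.
func retryable(code int) bool {
	return code == 14
}

// backoff returns the delay before retry number attempt (0-based):
// base doubled once per earlier attempt, capped at limit.
func backoff(attempt int, base, limit time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < limit; i++ {
		d *= 2
	}
	if d > limit {
		d = limit
	}
	return d
}

func main() {
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println(attempt, backoff(attempt, 100*time.Millisecond, time.Second))
	}
	fmt.Println(retryable(14), retryable(5))
}
```

With a 100 ms base and a 1 s cap the delays are 100ms, 200ms, 400ms, 800ms, and 1s. NOT_FOUND (5) is not retried, because repeating the call cannot change the answer.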
Why a browser needs grpc-web — the framing constraint
Browsers expose HTTP as a high-level fetch/XHR API and deliberately hide the HTTP/2 framing layer. JavaScript cannot send HTTP/2 trailers, cannot read them, and cannot control the stream-level END_STREAM flag, yet gRPC's entire status model depends on trailers. The grpc-web protocol works around this by moving the trailers into the body of a regular HTTP response: they travel as a final frame with the usual 5-byte prefix, marked by setting the high bit of the first (flag) byte to 1. A proxy such as Envoy in front of the service translates the grpc-web frames into real gRPC over HTTP/2 toward the upstream service.
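A sketch of how a grpc-web client might split such a response body is below. It assumes the binary (not base64 text) variant of the format, with the trailers frame carrying "name: value" lines separated by CRLF; splitGRPCWebBody is an illustrative name and the error handling is minimal.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
)

// splitGRPCWebBody separates a grpc-web response body into data messages and trailers.
// Each frame has the usual 5-byte prefix; bit 0x80 of the first byte marks the trailers frame.
func splitGRPCWebBody(body []byte) (msgs [][]byte, trailers map[string]string, err error) {
	trailers = make(map[string]string)
	for len(body) > 0 {
		if len(body) < 5 {
			return nil, nil, fmt.Errorf("truncated frame prefix")
		}
		flag, size := body[0], binary.BigEndian.Uint32(body[1:5])
		if uint64(len(body)-5) < uint64(size) {
			return nil, nil, fmt.Errorf("truncated frame payload")
		}
		payload := body[5 : 5+int(size)]
		body = body[5+int(size):]
		if flag&0x80 == 0 {
			msgs = append(msgs, payload) // ordinary data message
			continue
		}
		// Trailers frame: parse each "name: value" line.
		for _, line := range strings.Split(string(payload), "\r\n") {
			if name, value, ok := strings.Cut(line, ":"); ok {
				trailers[strings.ToLower(strings.TrimSpace(name))] = strings.TrimSpace(value)
			}
		}
	}
	return msgs, trailers, nil
}

func main() {
	trailerBlock := "grpc-status: 0\r\n"
	body := []byte{0x00, 0, 0, 0, 4, 0x0a, 0x02, 0x34, 0x32}    // one data frame
	body = append(body, 0x80, 0, 0, 0, byte(len(trailerBlock))) // trailers frame prefix
	body = append(body, trailerBlock...)
	msgs, trailers, err := splitGRPCWebBody(body)
	fmt.Println(len(msgs), trailers["grpc-status"], err)
}
```

The client reads the status from the body instead of from real HTTP/2 trailers, which is exactly the part the browser cannot reach.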
How to debug & inspect it
gRPC's binary encoding means you cannot casually curl it — but with the right tools you can inspect calls as easily as REST.
grpcurl — the curl for gRPC
grpcurl is a command-line client that speaks native gRPC. With server reflection enabled (or a .proto file supplied), grpcurl list enumerates the services on a server and grpcurl describe prints a service or method signature. To make a call you supply the request as JSON and grpcurl prints the response as JSON, handling the Protobuf encoding and message framing for you.
Reading the raw frames with Wireshark / tcpdump
Wireshark can dissect HTTP/2 and gRPC from a live capture or a tcpdump file, showing the HEADERS, DATA, and trailer frames described above. TLS traffic has to be decrypted first, and message bodies are decoded into named fields only when Wireshark is given the matching .proto files.
Symptom → cause → fix
| Symptom | Likely cause | Fix |
|---|---|---|
| grpc-status: 12 UNIMPLEMENTED | Method name in the URL doesn't match the proto (case-sensitive, must be /package.Service/Method) | Check exact package + service + method name from the .proto; regenerate stubs |
| grpc-status: 16 UNAUTHENTICATED | Missing or invalid authorization metadata; or server expects TLS and client sent plaintext | Add -H "authorization: Bearer ..." to grpcurl; confirm TLS vs plaintext flags match |
| grpc-status: 14 UNAVAILABLE on first call then OK | Connection pool not yet warmed; or load balancer health check not passed | Implement a keepalive ping or wait for the LB health; retry with exponential backoff |
| grpcurl shows garbage / "frame too large" | Connecting to a non-gRPC port (e.g. REST on 8080) or a grpc-web endpoint not a native gRPC one | Confirm the port; use the correct -plaintext or TLS flag |
| Browser gets 502 / connection reset on grpc-web calls | Envoy proxy not configured with grpc_web filter; missing te: trailers header passthrough | Add the Envoy grpc_web HTTP filter; ensure te header is not stripped |
| Silent data corruption — wrong field values | A field number was reused after removing an old field | Mark removed fields as reserved in the .proto; regenerate all stubs |
| grpc-status: 4 DEADLINE_EXCEEDED | Server processing time exceeded the client-set deadline in grpc-timeout | Tune deadline on the call; check server latency with the grpc_server_handling_seconds histogram metric |
Debug checklist:
- Confirm server reflection is enabled (grpcurl list should return your service name).
- Use grpcurl describe to verify the method signature matches your client code.
- Check the grpc-status in the trailing metadata: it is always there, not in the HTTP status.
- Look at the grpc-message trailer, which often contains a human-readable error description.
- For browser-to-service failures, confirm the Envoy proxy has the grpc_web filter and is not stripping te: trailers.
- Enable gRPC access logging on the server to see per-method latency, error rate, and message sizes.
HTTP 200 on failed calls trips up monitoring setups and proxy configurations. A gRPC call that fails returns HTTP 200 with a non-zero grpc-status trailer, so load balancers and APM tools that count HTTP 5xx responses as failures will report a 0% error rate on a gRPC service that is returning NOT_FOUND or UNAUTHENTICATED for 100% of calls. You must instrument at the gRPC layer: look at grpc-status != 0 in your metrics and trace spans, not at HTTP status codes.
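The difference between the two views is easy to show with a handful of call records. In this sketch the call type and errorRates function are hypothetical, standing in for whatever your access log or interceptor records per RPC.

```go
package main

import "fmt"

// call is a hypothetical record of one finished RPC, as an access log
// or interceptor might capture it.
type call struct {
	httpStatus int
	grpcStatus int
}

// errorRates returns the share of failed calls judged by HTTP status (>= 500)
// and by grpc-status (!= 0).
func errorRates(calls []call) (byHTTP, byGRPC float64) {
	if len(calls) == 0 {
		return 0, 0
	}
	var httpErrs, grpcErrs int
	for _, c := range calls {
		if c.httpStatus >= 500 {
			httpErrs++
		}
		if c.grpcStatus != 0 {
			grpcErrs++
		}
	}
	n := float64(len(calls))
	return float64(httpErrs) / n, float64(grpcErrs) / n
}

func main() {
	// Every call returned HTTP 200, but three of four failed at the gRPC layer.
	calls := []call{{200, 5}, {200, 16}, {200, 0}, {200, 5}}
	h, g := errorRates(calls)
	fmt.Printf("HTTP view: %.0f%% errors, gRPC view: %.0f%% errors\n", h*100, g*100)
}
```

This prints HTTP view: 0% errors, gRPC view: 75% errors, which is the blind spot described above.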
🧠 Quick check
1. What is the primary role of a .proto file in gRPC?
The .proto file is the contract: it defines message types and service methods in the Protocol Buffers interface definition language. The protoc compiler turns it into type-safe stubs for both client and server in any supported language.
2. Which gRPC streaming mode best fits "upload a large video file in chunks, receive a single confirmation when done"?
Client streaming: the client sends a stream of messages (chunks) and the server replies with one response at the end (confirmation). This is exactly the upload-then-confirm pattern.
3. Why can't a standard browser use gRPC directly without grpc-web?
Browsers expose high-level HTTP APIs that abstract away framing. gRPC needs to control HTTP/2 trailers and stream frames directly — capabilities the browser sandbox doesn't expose. grpc-web is a compatible adaptation layer that proxies the translation.
✍️ Exercise: write a proto service for an order system (try it before reading the model answer)
Define a proto service for an order management system. It needs: (a) fetch a single order by id; (b) stream all orders for a given customer (the list may be large); (c) create an order. Include message types for each.
Model answer:
syntax = "proto3";
message Order {
  string id = 1;
  string customer_id = 2;
  double total = 3;
  string status = 4;
}
message GetOrderRequest { string order_id = 1; }
message ListOrdersRequest { string customer_id = 1; }
message CreateOrderRequest { string customer_id = 1; double total = 2; }
service OrderService {
  rpc GetOrder (GetOrderRequest) returns (Order);
  rpc ListOrders (ListOrdersRequest) returns (stream Order);
  rpc CreateOrder (CreateOrderRequest) returns (Order);
}
Rubric: ✓ unary for GetOrder ✓ server streaming for ListOrders (stream keyword on return) ✓ unary for CreateOrder ✓ messages have field numbers starting at 1 ✓ no verb in service name (noun "OrderService").
Key takeaways
- gRPC = RPC + Protocol Buffers (typed binary schema) + HTTP/2 (multiplexed, low-latency transport).
- The .proto file is the contract; code generation ensures type-safe stubs in every language.
- Four streaming modes: unary, server-stream, client-stream, bidirectional — choose based on who sends many messages.
- Excellent for internal microservices; browser clients need the grpc-web proxy adapter.
- Binary encoding makes gRPC fast but not human-readable, so invest in grpcurl and server reflection for debuggability.