Foundations · Lesson 08
Remote procedure calls (RPC)
RPC is a style of API with one goal: make calling a function on another machine feel exactly like calling one in your own code — user = getUser(42), no URLs in sight. It's a powerful illusion. Knowing where the illusion leaks is what separates a working system from a fragile one.
By the end you'll be able to
- Explain RPC and the role of stubs and (un)marshalling.
- Name the ways a remote call differs from a local one — the leaks in the abstraction.
- Say what gRPC adds and when RPC beats a REST-style API.
The illusion: a local call that isn't
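In ordinary code you call add(2, 3) and get 5. RPC lets you write what looks like that, but the function runs on a server somewhere else. The machinery that sells the illusion is a pair of stubs, small auto-generated shims on each side. The client stub marshals the arguments and sends them over the network; the server stub unmarshals them, calls the real function, and sends the result back the same way.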
Marshalling (a.k.a. serialization) is turning in-memory values into a byte stream the network can carry; unmarshalling rebuilds them on the other side. The caller writes a normal-looking function call and never touches sockets — that's the whole appeal.
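The stub-and-marshalling flow described above can be sketched in a few lines. This is only an illustration: JSON stands in for the wire format, an ordinary function call stands in for the network, and the names ClientStub, server_stub, and transport are hypothetical rather than taken from any RPC framework.

```python
import json

# --- server side ---
def add(a, b):
    """The real function; it exists only on the server."""
    return a + b

HANDLERS = {"add": add}

def server_stub(request_bytes: bytes) -> bytes:
    """Unmarshal the request, call the real function, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = HANDLERS[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

# --- the "network" (assumption: a direct call replaces a real socket) ---
def transport(request_bytes: bytes) -> bytes:
    """Only bytes cross this boundary, as they would on a real connection."""
    return server_stub(request_bytes)

# --- client side ---
class ClientStub:
    """Looks like a local object, but each method call is marshalled and sent."""

    def add(self, a, b):
        request_bytes = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply_bytes = transport(request_bytes)
        return json.loads(reply_bytes.decode("utf-8"))["result"]

remote = ClientStub()
print(remote.add(2, 3))  # 5
```

The caller only ever sees remote.add(2, 3); everything between the two stubs is bytes.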
Where the illusion leaks
A local function call is fast, always returns, and can't half-happen. A remote one can't promise any of that. These are the leaks every RPC system must face:
| Local call | Remote call (the leak) |
|---|---|
| Nanoseconds | Milliseconds — the network round trip dominates (Lesson 03) |
| Always returns | Can time out — and you may not know if it ran |
| Can't partially fail | Request can arrive but the reply get lost → did it happen? |
| Same memory & types | Different machines/languages → must agree on a wire format |
A common mistake is treating remote calls as if they were local: sprinkling them inside loops or trusting that they "just return." A loop of 1,000 RPCs is 1,000 network round trips. And because a timeout leaves you unsure whether the call ran, blind retries can double-charge a card or double-send a message. The fix is idempotency (its own lesson), which makes a retry safe. Design for it from the start.
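One way to picture the lost-reply case and the idempotency fix is the simulation below. It is a sketch under stated assumptions: PaymentServer, charge_with_retry, and the idempotency-key scheme are hypothetical names chosen for illustration, and the "lost reply" is simulated by raising TimeoutError after the charge has already been applied.

```python
import uuid

class PaymentServer:
    """Hypothetical server that remembers which request keys it already applied."""

    def __init__(self):
        self.charges = []            # every charge actually applied
        self.seen = {}               # idempotency key -> stored result
        self.drop_next_reply = True  # simulate exactly one lost reply

    def charge(self, key: str, amount_cents: int) -> str:
        if key in self.seen:               # a retry of a call that already ran
            return self.seen[key]
        self.charges.append(amount_cents)  # the side effect happens here
        result = f"charged {amount_cents}"
        self.seen[key] = result
        if self.drop_next_reply:
            self.drop_next_reply = False
            # The work is done, but the caller never hears about it.
            raise TimeoutError("reply lost in transit")
        return result

def charge_with_retry(server, amount_cents, attempts=3):
    key = str(uuid.uuid4())  # one key per logical operation, reused on every retry
    for _ in range(attempts):
        try:
            return server.charge(key, amount_cents)
        except TimeoutError:
            continue  # the caller cannot tell whether it ran; the key makes retrying safe
    raise TimeoutError("gave up after retries")

server = PaymentServer()
print(charge_with_retry(server, 500))  # charged 500
print(len(server.charges))             # 1
```

Without the key check at the top of charge, the same retry would append a second charge.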
This is the famous "fallacies of distributed computing" territory. If you can name even three leaks — the network isn't instant, calls can fail in the middle, and you can't tell a lost request from a lost reply — and then say "so I'd make the operation idempotent and set explicit timeouts," you sound like someone who's actually run services in production.
gRPC: the modern, fast RPC
The most common RPC framework today is gRPC. Its two big ideas:
- Protocol Buffers (protobuf): you define your messages and methods once in a .proto schema; tooling generates the stubs in many languages and serializes to a compact binary format (smaller and faster to parse than JSON text).
- Runs over HTTP/2: so it inherits multiplexed streams (Lesson 02), enabling efficient streaming RPCs, not just one-shot request/response.
// a tiny .proto contract — the source of truth for both sides
service Users {
  rpc GetUser (GetUserRequest) returns (User);
}
message GetUserRequest { int64 id = 1; }
message User { int64 id = 1; string name = 2; }
RPC/gRPC shines inside your system — service-to-service calls where you control both ends, want low latency, compact payloads, and streaming (e.g. a checkout service calling an inventory service). REST (next module) shines at the edge — public, browser-friendly, cacheable APIs where ubiquity and human-readability matter more than raw speed. Many real systems use both: gRPC between internal services, REST/JSON to the outside world.
Under the hood: how it actually works
Two mechanisms make gRPC work: protobuf wire encoding (how arguments become bytes) and HTTP/2 framing (how those bytes travel and return).
Protobuf wire encoding
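Protobuf uses a compact binary format. Every field in a message is identified on the wire by a small integer field number, not by its name, which is why field numbers must never change once a schema is in use. Each encoded field starts with a varint key that packs the field number and the wire type, (field_number << 3) | wire_type, followed by the value. For the two messages defined earlier: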
message GetUserRequest { int64 id = 1; }
// Encoding GetUserRequest { id: 42 }:
// field 1, wire type 0 (varint) → tag byte = (1 << 3) | 0 = 0x08
// value 42 encoded as varint → 0x2A
// total: 2 bytes [0x08, 0x2A]
message User { int64 id = 1; string name = 2; }
// Encoding User { id: 42, name: "Ada" }:
// field 1 varint 42 → 0x08 0x2A
// field 2 length-delimited "Ada" → tag = (2<<3)|2 = 0x12,
// length = 0x03, bytes = 0x41 0x64 0x61
// total: 7 bytes vs ~22 bytes for {"id":42,"name":"Ada"} in JSON
The field name is never in the wire format, only the number. This makes the binary compact and fast to parse (no string comparisons), but it means a schema (.proto file) is required to interpret arbitrary messages: without it, a decoder sees only field numbers and wire types, not names or exact types.
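The byte counts in the worked example can be checked with a small hand-rolled encoder. This is a minimal sketch covering only the two wire types used here (0 for varint, 2 for length-delimited) and non-negative integers; the helper names are hypothetical, and real code would use the classes generated by the protobuf tooling instead.

```python
def encode_varint(n: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set on every byte but the last."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # final byte
            return bytes(out)

def encode_key(field_number: int, wire_type: int) -> bytes:
    """The key in front of every field: (field_number << 3) | wire_type."""
    return encode_varint((field_number << 3) | wire_type)

def encode_int_field(field_number: int, value: int) -> bytes:
    return encode_key(field_number, 0) + encode_varint(value)

def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_key(field_number, 2) + encode_varint(len(data)) + data

request = encode_int_field(1, 42)                                # GetUserRequest { id: 42 }
user = encode_int_field(1, 42) + encode_string_field(2, "Ada")   # User { id: 42, name: "Ada" }

print(request.hex(" "))  # 08 2a
print(user.hex(" "))     # 08 2a 12 03 41 64 61
print(len(user))         # 7
```

Values above 127 spill into a second byte: encode_varint(300) yields ac 02.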
How a gRPC call maps onto HTTP/2 frames
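A unary gRPC call (GetUser) runs entirely inside a single HTTP/2 stream. The client sends a HEADERS frame (a POST to a path of the form /Users/GetUser with content-type application/grpc), then a DATA frame carrying the request message, and ends its side of the stream. The server replies with its own HEADERS frame (:status 200), a DATA frame carrying the response message, and a final HEADERS frame holding the trailers, which closes the stream.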
The gRPC-specific layer sits inside HTTP/2 DATA frames: a 5-byte framing prefix ([1-byte compressed flag][4-byte message length]) wraps each protobuf message. The grpc-status code arrives in HTTP/2 trailers (headers sent after the body), not in the initial response headers — this is why the client must read the full stream to know if the call succeeded.
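The 5-byte prefix described above is simple enough to build by hand. The sketch below frames and unframes a single uncompressed message; frame_message and unframe_message are hypothetical helper names, and the HTTP/2 layer itself is left out entirely.

```python
import struct

def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with [1-byte compressed flag][4-byte length]."""
    # ">BI" = big-endian, one unsigned byte, one unsigned 32-bit integer, no padding
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload

def unframe_message(data: bytes):
    """Split one framed message back into (compressed flag, payload)."""
    flag, length = struct.unpack(">BI", data[:5])
    payload = data[5:5 + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return bool(flag), payload

# The 2-byte GetUserRequest { id: 42 } from the encoding example
framed = frame_message(bytes([0x08, 0x2A]))
print(framed.hex(" "))          # 00 00 00 00 02 08 2a
print(unframe_message(framed))  # (False, b'\x08*')
```

The 2-byte request therefore occupies 7 bytes inside the DATA frame.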
How to debug & inspect it
gRPC's binary format can't be read with plain curl, but grpcurl (a curl-equivalent for gRPC) makes it easy — especially with server reflection enabled.
| Symptom / grpc-status | Code meaning | Likely cause | Fix |
|---|---|---|---|
| grpc-status: 0 | OK | n/a | n/a |
| grpc-status: 1 CANCELLED | The call was cancelled, usually by the caller | Explicit client cancel, or an upstream caller gave up and the cancellation propagated | Find out who cancelled and why; check upstream deadlines |
| grpc-status: 2 UNKNOWN | Server threw an unhandled exception | Unhandled panic/exception in handler; see server logs | Add proper error handling; return a typed gRPC status |
| grpc-status: 4 DEADLINE_EXCEEDED | Server didn't respond within the deadline | Slow handler, downstream dependency slow, network partition | Check server logs for slow queries; add tracing; increase the deadline if the latency is legitimate |
| grpc-status: 5 NOT_FOUND | Entity not found | Resource doesn't exist | Validate input before calling; handle gracefully |
| grpc-status: 7 PERMISSION_DENIED | AuthZ check failed | Caller is authenticated but lacks permission (missing or invalid credentials produce 16 UNAUTHENTICATED instead) | Check token scopes; verify interceptor/auth middleware |
| grpc-status: 12 UNIMPLEMENTED | Method not implemented on server | Method name mismatch, server not deployed yet, wrong port | Confirm service name + method name exactly match the .proto; check the correct server is running |
| grpc-status: 14 UNAVAILABLE | Server is down or unreachable | Connection refused, network partition, server crashed | Check server health; implement retry with backoff for transient failures |
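The "retry with backoff" advice for UNAVAILABLE can be sketched as follows. This deliberately avoids any real gRPC client API: the status codes are plain integers matching the table, call_with_backoff is a hypothetical helper, and the decision to retry only UNAVAILABLE is an assumption for illustration. A production version would also add jitter and would retry only idempotent operations.

```python
import time

# Numeric codes as listed in the table above
OK, CANCELLED, UNKNOWN = 0, 1, 2
DEADLINE_EXCEEDED, NOT_FOUND, PERMISSION_DENIED = 4, 5, 7
UNIMPLEMENTED, UNAVAILABLE = 12, 14

RETRYABLE = {UNAVAILABLE}  # assumption: only the clearly transient code is retried

def call_with_backoff(call, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run call() -> (status, value), retrying retryable statuses with exponential backoff."""
    for attempt in range(max_attempts):
        status, value = call()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, value              # success, permanent error, or out of attempts
        sleep(base_delay * (2 ** attempt))    # 0.1s, 0.2s, 0.4s, ...

# Usage: a fake call that fails twice, then succeeds; sleeps are recorded, not slept
replies = iter([(UNAVAILABLE, None), (UNAVAILABLE, None), (OK, "Ada")])
delays = []
print(call_with_backoff(lambda: next(replies), sleep=delays.append))  # (0, 'Ada')
print(delays)                                                         # [0.1, 0.2]
```

Codes such as NOT_FOUND or PERMISSION_DENIED return immediately, since repeating the same request would produce the same answer.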
Debug checklist:
- Enable server reflection in development: it lets grpcurl list and describe work without needing the .proto file.
- Read grpc-status in trailers first: it's the equivalent of the HTTP status code for gRPC calls.
- Use grpcurl -v to see raw headers and trailers, including grpc-message (human-readable error detail).
- If you can't connect at all, check that the server is listening on the correct port and speaking gRPC over HTTP/2 (not plain HTTP/1.1).
- For "works locally, fails in prod", check TLS: production gRPC servers typically require a valid cert; run grpcurl without -plaintext so it connects over TLS.
Because protobuf uses field numbers (not names) on the wire, renaming a field is safe for the binary format. Changing a field's number, or deleting a field and later reusing its number for something else, silently corrupts data: old clients and new servers will interpret the same bytes as different fields. Always mark deleted field numbers reserved and never reuse them.
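The difference between renaming and renumbering shows up clearly with a toy decoder. In this sketch a "schema" is just a dict from field number to name, only wire types 0 and 2 are handled, and the field names display_name and email are hypothetical; it illustrates the principle and is not how the protobuf library is implemented.

```python
def read_varint(buf: bytes, pos: int):
    """Read one base-128 varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7

def decode(buf: bytes, schema: dict) -> dict:
    """Decode fields, naming them by looking up the field number in the schema."""
    out, pos = {}, 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                       # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:                     # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError("wire type not handled in this sketch")
        name = schema.get(number)
        if name is not None:                     # unknown field numbers are skipped
            out[name] = value
    return out

# Bytes written by an old client: User { id: 42, name: "Ada" }
old_bytes = bytes([0x08, 0x2A, 0x12, 0x03, 0x41, 0x64, 0x61])

original = {1: "id", 2: "name"}
renamed = {1: "id", 2: "display_name"}  # same number, new name: still decodes correctly
reused = {1: "id", 2: "email"}          # number 2 reused for a different field

print(decode(old_bytes, original))  # {'id': 42, 'name': b'Ada'}
print(decode(old_bytes, renamed))   # {'id': 42, 'display_name': b'Ada'}
print(decode(old_bytes, reused))    # {'id': 42, 'email': b'Ada'}
```

In the last case nothing fails: the old name simply arrives labelled as an email, which is the silent corruption that reserving the number prevents.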
🧠 Quick check
1. In RPC, "marshalling" refers to:
Marshalling/serialization packs arguments into a byte stream; the receiver unmarshals them back into values. Encryption and load balancing are separate concerns.
2. Why is blindly retrying a timed-out RPC dangerous?
A timeout doesn't tell you whether the call ran. Without idempotency, retrying can charge a card twice. Idempotency makes the retry safe.
3. Two reasons gRPC is fast for internal service calls:
protobuf is smaller and quicker to parse than JSON text, and HTTP/2 lets many calls share one connection with streaming. Being tied to a single language is not one of the reasons: gRPC is explicitly multi-language.
✍️ Drill: pick a protocol for two call sites
(a) A public mobile app fetches a user's profile. (b) Your order service calls your inventory service 50× per checkout. Which style for each, and why? Decide first.
Model answer: (a) REST/JSON over HTTPS — public, diverse clients, cacheable, human-debuggable; ubiquity matters more than microseconds. (b) gRPC — internal, you own both ends, high call volume, want compact binary payloads and low latency; streaming helps if inventory checks batch. Bonus: make the inventory deduction idempotent so a retried call can't double-decrement stock.
Rubric: ✓ REST at the public edge ✓ gRPC internally ✓ justifies by control/volume/latency ✓ remembers idempotency on the retried internal call.
Key takeaways
- RPC makes a remote call look local via auto-generated stubs that marshal/unmarshal arguments.
- The illusion leaks: remote calls are slow, can time out, can fail mid-flight, and need an agreed wire format.
- A timeout ≠ failure — design for idempotency so retries are safe.
- gRPC = protobuf (compact binary) + HTTP/2 (multiplexed streaming); great inside a system, REST for the public edge.