Foundations · Lesson 08

Remote procedure calls (RPC)

RPC is a style of API with one goal: make calling a function on another machine feel exactly like calling one in your own code — user = getUser(42), no URLs in sight. It's a powerful illusion. Knowing where the illusion leaks is what separates a working system from a fragile one.

⏱ 11 min · Difficulty: core · Prereq: Lessons 02, 04

By the end you'll be able to

Explain RPC and the role of stubs and (un)marshalling.
Name the ways a remote call differs from a local one — the leaks in the abstraction.
Say what gRPC adds and when RPC beats a REST-style API.

The illusion: a local call that isn't

In ordinary code you call add(2, 3) and get 5. RPC lets you write what looks like that, but the function runs on a server somewhere else. The machinery that sells the illusion is a pair of stubs — small auto-generated shims on each side:

The client stub packs the arguments into bytes (marshalling) and ships them over the network. The server stub unpacks them, runs the real function, and sends the result back the same way.

Marshalling (a.k.a. serialization) is turning in-memory values into a byte stream the network can carry; unmarshalling rebuilds them on the other side. The caller writes a normal-looking function call and never touches sockets — that's the whole appeal.
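To make the stub idea concrete, here is a minimal sketch in Python. It assumes JSON as the wire format and replaces the network with a plain function call; the names client_add, server_stub, send_over_network and HANDLERS are hypothetical, and real RPC frameworks generate this code for you.

```python
import json

# --- server side ---------------------------------------------------------
def add(a, b):
    """The real function; it only ever runs on the 'server'."""
    return a + b

HANDLERS = {"add": add}  # hypothetical method registry

def server_stub(request_bytes: bytes) -> bytes:
    """Unmarshal the request, run the real function, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = HANDLERS[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

# --- transport (assumption: a function call stands in for a socket) ------
def send_over_network(request_bytes: bytes) -> bytes:
    """Bytes go out, bytes come back."""
    return server_stub(request_bytes)

# --- client side ---------------------------------------------------------
def client_add(a, b):
    """Client stub: looks like a local function, but only moves bytes."""
    request_bytes = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
    reply_bytes = send_over_network(request_bytes)
    return json.loads(reply_bytes.decode("utf-8"))["result"]

print(client_add(2, 3))  # 5
```

The caller only ever sees client_add(2, 3). Everything that can go wrong lives inside send_over_network, which is exactly the part the illusion hides.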

Where the illusion leaks

A local function call is fast, always returns, and can't half-happen. A remote one can't promise any of that. These are the leaks every RPC system must face:

Local call	Remote call (the leak)
Nanoseconds	Milliseconds — the network round trip dominates (Lesson 03)
Always returns	Can time out — and you may not know if it ran
Can't partially fail	Request can arrive but the reply can get lost → did it happen?
Same memory & types	Different machines/languages → must agree on a wire format
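The "did it happen?" row is easier to see in a small simulation. The sketch below is illustrative only: Server, call and the drop parameter are hypothetical names, and message loss is faked by raising an exception rather than using a real network.

```python
class Timeout(Exception):
    """Raised by the client when no reply arrives in time."""

class Server:
    def __init__(self):
        self.charges = []  # side effects that actually happened

    def charge(self, amount):
        self.charges.append(amount)
        return "ok"

def call(server, amount, drop=None):
    """Simulated RPC. `drop` is 'request', 'reply', or None."""
    if drop == "request":
        raise Timeout()            # the server never saw the call
    reply = server.charge(amount)  # the server ran the function
    if drop == "reply":
        raise Timeout()            # the result was lost on the way back
    return reply

def observe(drop):
    """Return (what the client saw, how many charges the server made)."""
    server = Server()
    try:
        outcome = call(server, 10, drop)
    except Timeout:
        outcome = "timeout"
    return outcome, len(server.charges)

print(observe("request"))  # ('timeout', 0)
print(observe("reply"))    # ('timeout', 1)
```

In both cases the client sees the same timeout, yet the server's state differs. The client cannot tell the two apart from its side of the wire.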

⚠️ Common trap

Treating remote calls as if they were local — sprinkling them inside loops or trusting they "just return." A loop of 1,000 RPCs is 1,000 network round trips. And because a timeout leaves you unsure whether the call ran, blind retries can double-charge a card or double-send a message. The fix is idempotency (its own lesson) so a retry is safe — design for it from the start.
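One common way to make a retry safe is an idempotency key: the client attaches a unique key to the operation, and the server remembers the result for each key it has already processed. The sketch below assumes an in-memory dictionary and a client whose first reply is always lost; PaymentServer, call_with_retry and the key format are hypothetical.

```python
class PaymentServer:
    def __init__(self):
        self.total_charged = 0
        self._seen = {}  # idempotency key -> stored result

    def charge(self, amount, idempotency_key=None):
        # A repeated key means "this is a retry": return the stored result
        # instead of performing the side effect again.
        if idempotency_key is not None and idempotency_key in self._seen:
            return self._seen[idempotency_key]
        self.total_charged += amount
        result = {"status": "charged", "amount": amount}
        if idempotency_key is not None:
            self._seen[idempotency_key] = result
        return result

def call_with_retry(server, amount, idempotency_key=None, attempts=2):
    """Simulate a client that never sees the first reply, so it retries."""
    result = None
    for _ in range(attempts):
        result = server.charge(amount, idempotency_key)
    return result

blind = PaymentServer()
call_with_retry(blind, 10)                          # blind retry
safe = PaymentServer()
call_with_retry(safe, 10, idempotency_key="order-1001")

print(blind.total_charged)  # 20
print(safe.total_charged)   # 10
```

A production version would also need to persist the keys and expire them eventually; this sketch ignores both.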

🎯 Interview angle

This is the famous "fallacies of distributed computing" territory. If you can name even three leaks — the network isn't instant, calls can fail in the middle, and you can't tell a lost request from a lost reply — and then say "so I'd make the operation idempotent and set explicit timeouts," you sound like someone who's actually run services in production.

gRPC: the modern, fast RPC

The most common RPC framework today is gRPC. Its two big ideas:

Protocol Buffers (protobuf) — you define your messages and methods once in a .proto schema; tooling generates the stubs in many languages and serializes to a compact binary format (smaller and faster to parse than JSON text).
Runs over HTTP/2 — so it inherits multiplexed streams (Lesson 02), enabling efficient streaming RPCs, not just one-shot request/response.

// a tiny .proto contract — the source of truth for both sides
service Users {
  rpc GetUser (GetUserRequest) returns (User);
}
message GetUserRequest { int64 id = 1; }
message User { int64 id = 1; string name = 2; }

✅ When to reach for RPC vs REST

RPC/gRPC shines inside your system — service-to-service calls where you control both ends, want low latency, compact payloads, and streaming (e.g. a checkout service calling an inventory service). REST (next module) shines at the edge — public, browser-friendly, cacheable APIs where ubiquity and human-readability matter more than raw speed. Many real systems use both: gRPC between internal services, REST/JSON to the outside world.

Under the hood: how it actually works

Two mechanisms make gRPC work: protobuf wire encoding (how arguments become bytes) and HTTP/2 framing (how those bytes travel and return).

Protobuf wire encoding

Protobuf uses a compact binary format. Every field in a message is identified on the wire by its small integer field number, not by its name, which is why field numbers must never change. Each encoded field is a tag (a varint that combines the field number and a wire type) followed by the value. For the proto definitions used earlier:

message GetUserRequest { int64 id = 1; }
// Encoding GetUserRequest { id: 42 }:
//   field 1, wire type 0 (varint) → tag byte = (1 << 3) | 0 = 0x08
//   value 42 encoded as varint → 0x2A
//   total: 2 bytes  [0x08, 0x2A]

message User { int64 id = 1; string name = 2; }
// Encoding User { id: 42, name: "Ada" }:
//   field 1 varint 42   → 0x08 0x2A
//   field 2 length-delimited "Ada" → tag = (2<<3)|2 = 0x12,
//                                     length = 0x03, bytes = 0x41 0x64 0x61
//   total: 7 bytes vs ~22 bytes for {"id":42,"name":"Ada"} in JSON

The field name is never in the wire format — only the number. This makes the binary compact and fast to parse (no string comparisons), but it means a schema (.proto file) is required to decode arbitrary messages.
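The byte counts above can be reproduced with a few lines of Python. This is a minimal sketch of the encoding rules just described, not a substitute for a real protobuf library: it handles only non-negative varints and strings, and the helper names are hypothetical.

```python
VARINT, LEN = 0, 2  # the two wire types used in the examples above

def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def tag(field_number: int, wire_type: int) -> bytes:
    return encode_varint((field_number << 3) | wire_type)

def encode_int_field(field_number: int, value: int) -> bytes:
    return tag(field_number, VARINT) + encode_varint(value)

def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return tag(field_number, LEN) + encode_varint(len(data)) + data

request = encode_int_field(1, 42)                              # GetUserRequest{id: 42}
user = encode_int_field(1, 42) + encode_string_field(2, "Ada")  # User{id: 42, name: "Ada"}

print(request.hex(" "))  # 08 2a
print(user.hex(" "))     # 08 2a 12 03 41 64 61
```

Note that neither "id" nor "name" appears anywhere in the output; only the field numbers 1 and 2 are encoded, inside the tag bytes 0x08 and 0x12.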

How a gRPC call maps onto HTTP/2 frames

A unary gRPC call (GetUser) runs entirely inside a single HTTP/2 stream. Here is the frame sequence, simplified (connection-level frames such as SETTINGS are omitted):

Client                                                      Server
──────────────────────────────────────────────────────────────────
HEADERS frame  stream_id=1
  :method = POST
  :path = /users.Users/GetUser
  :scheme = https
  content-type = application/grpc
  te = trailers
      ↓
DATA frame  stream_id=1  (5-byte grpc framing + protobuf body)
  [compressed_flag=0][message_length=2][0x08 0x2A]
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
                              HEADERS frame  stream_id=1
                                :status = 200
                                content-type = application/grpc
                              DATA frame  stream_id=1
                                [0][7 bytes = User{id:42,name:"Ada"}]
                              HEADERS frame (trailers)  END_STREAM
                                grpc-status = 0      ← 0 = OK
                                grpc-message = ""

The gRPC-specific layer sits inside HTTP/2 DATA frames: a 5-byte framing prefix ([1-byte compressed flag][4-byte message length]) wraps each protobuf message. The grpc-status code arrives in HTTP/2 trailers (headers sent after the body), not in the initial response headers — this is why the client must read the full stream to know if the call succeeded.

A unary gRPC call is one HTTP/2 stream. The client sends HEADERS + DATA; the server responds with HEADERS + DATA then closes the stream with trailers carrying grpc-status.
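The 5-byte gRPC prefix is simple enough to build by hand. The sketch below uses Python's struct module and assumes an uncompressed message; grpc_frame and grpc_unframe are hypothetical helper names, since a real gRPC library does this framing internally.

```python
import struct

def grpc_frame(message: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with [1-byte flag][4-byte big-endian length]."""
    return struct.pack(">BI", int(compressed), len(message)) + message

def grpc_unframe(data: bytes):
    """Split one framed message back into (compressed flag, message bytes)."""
    flag, length = struct.unpack(">BI", data[:5])
    return bool(flag), data[5:5 + length]

body = bytes([0x08, 0x2A])  # GetUserRequest{id: 42} from the encoding section
framed = grpc_frame(body)

print(framed.hex(" "))       # 00 00 00 00 02 08 2a
print(grpc_unframe(framed))  # (False, b'\x08*')
```

The 2-byte request therefore occupies 7 bytes inside the HTTP/2 DATA frame: 5 bytes of gRPC framing plus the protobuf body.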

How to debug & inspect it

gRPC's binary format can't be read with plain curl, but grpcurl (a curl-equivalent for gRPC) makes it easy — especially with server reflection enabled.

# List all services exposed by the server (requires server reflection)
$ grpcurl -plaintext localhost:50051 list
users.Users
grpc.reflection.v1alpha.ServerReflection

# Describe a service's methods and message types
$ grpcurl -plaintext localhost:50051 describe users.Users
users.Users is a service:
service Users {
  rpc GetUser ( .users.GetUserRequest ) returns ( .users.User );
}

# Make a call (pass the request as JSON; grpcurl handles proto encoding)
$ grpcurl -plaintext -d '{"id": 42}' localhost:50051 users.Users/GetUser
{
  "id": "42",
  "name": "Ada Lovelace"
}

# With TLS (production servers)
$ grpcurl -d '{"id": 42}' api.example.com:443 users.Users/GetUser

# If server reflection is disabled, provide the .proto file directly
$ grpcurl -plaintext -proto users.proto -d '{"id": 42}' localhost:50051 users.Users/GetUser

# Verbose mode shows request/response headers and gRPC trailers
$ grpcurl -plaintext -v -d '{"id": 42}' localhost:50051 users.Users/GetUser
Resolved method descriptor: …
Response headers received:
content-type: application/grpc
Response trailers received:
grpc-status: 0
Sent 1 request and received 1 response

Symptom / grpc-status	Code meaning	Likely cause	Fix
`grpc-status: 0`	OK	—	—
`grpc-status: 1` CANCELLED	Call was cancelled, typically by the caller	Explicit client cancel, or the caller's own parent request was abandoned	Find where the client cancels; an elapsed deadline normally surfaces as DEADLINE_EXCEEDED (4) instead
`grpc-status: 2` UNKNOWN	Server threw an unhandled exception	Unhandled panic/exception in handler; see server logs	Add proper error handling; return a typed gRPC status
`grpc-status: 4` DEADLINE_EXCEEDED	Server didn't respond within the deadline	Slow handler, downstream dependency slow, network partition	Check server logs for slow queries; add tracing
`grpc-status: 5` NOT_FOUND	Entity not found	Resource doesn't exist	Validate input before calling; handle gracefully
`grpc-status: 7` PERMISSION_DENIED	AuthZ check failed	Caller is identified but lacks permission (missing or invalid credentials normally yield 16 UNAUTHENTICATED)	Check token scopes; verify interceptor/auth middleware
`grpc-status: 12` UNIMPLEMENTED	Method not implemented on server	Method name mismatch, server not deployed yet, wrong port	Confirm service name + method name exactly match the .proto; check the correct server is running
`grpc-status: 14` UNAVAILABLE	Server is down or unreachable	Connection refused, network partition, server crashed	Check server health; implement retry with backoff for transient failures
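The table's "retry with backoff" advice can be sketched as a small policy. Everything here is an illustrative assumption rather than a gRPC API: the GrpcStatus enum just mirrors the codes in the table (real Python clients would use the status enum from their gRPC library), and which codes count as retryable is a design choice you should make per operation.

```python
from enum import IntEnum

class GrpcStatus(IntEnum):
    """Only the codes listed in the table above."""
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    PERMISSION_DENIED = 7
    UNIMPLEMENTED = 12
    UNAVAILABLE = 14

def should_retry(status: int, idempotent: bool) -> bool:
    """Assumed policy: retry transient failures, and ambiguous ones only if safe."""
    status = GrpcStatus(status)
    if status is GrpcStatus.UNAVAILABLE:
        return True  # per the table; still prefer idempotent operations
    # A deadline can expire after the server already did the work,
    # so retry only when repeating the operation is harmless.
    return status is GrpcStatus.DEADLINE_EXCEEDED and idempotent

def backoff_delays(attempts: int, base: float = 0.1, cap: float = 2.0):
    """Exponential backoff delays in seconds (no jitter, to keep the sketch short)."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]

print(should_retry(14, idempotent=False))  # True
print(should_retry(4, idempotent=False))   # False
print(should_retry(5, idempotent=True))    # False
print(backoff_delays(4))                   # [0.1, 0.2, 0.4, 0.8]
```

Real clients usually add random jitter to each delay so that many callers do not retry in lockstep.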

Debug checklist:

Enable server reflection in development — it lets grpcurl list and describe work without needing the .proto file.
Read grpc-status in trailers first — it's the equivalent of the HTTP status code for gRPC calls.
Use grpcurl -v to see raw headers and trailers including grpc-message (human-readable error detail).
If you can't connect at all, check that the server is listening on the expected port and speaking gRPC over HTTP/2 (not plain HTTP/1.1).
For "works locally, fails in prod", check TLS: production gRPC servers typically require TLS with a valid certificate, so run grpcurl without -plaintext.

⚠️ Never change field numbers in a .proto

Because protobuf puts field numbers (not names) on the wire, renaming a field is safe for the binary format. Changing a field's number, or deleting a field and later reusing its number for something else, silently corrupts data: peers built from the old schema will read the bytes as the wrong field. Always mark deleted field numbers as reserved and never reuse them.
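A toy decoder shows why the number is what matters. The sketch below decodes the 7-byte User message from the encoding section against three hypothetical schemas, each written as a plain dictionary from field number to name; it handles only the varint and length-delimited wire types and is not a real protobuf parser.

```python
def decode_varint(data: bytes, pos: int):
    """Read one base-128 varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode(data: bytes, schema: dict) -> dict:
    """Decode fields by number, then look names up in the given schema."""
    fields, pos = {}, 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                      # varint
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:                    # length-delimited (string here)
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields[schema.get(number, f"<unknown field {number}>")] = value
    return fields

wire = bytes([0x08, 0x2A, 0x12, 0x03, 0x41, 0x64, 0x61])  # User{id: 42, name: "Ada"}

original = {1: "id", 2: "name"}
renamed = {1: "user_id", 2: "name"}  # rename only: numbers unchanged
reused = {1: "age", 2: "name"}       # hypothetical: id deleted, number 1 reused

print(decode(wire, original))  # {'id': 42, 'name': 'Ada'}
print(decode(wire, renamed))   # {'user_id': 42, 'name': 'Ada'}
print(decode(wire, reused))    # {'age': 42, 'name': 'Ada'}
```

The rename still yields the right value under a new name, while the reused number turns a user ID into an age without any error being raised.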

🧠 Quick check

1. In RPC, "marshalling" refers to:

Marshalling/serialization packs arguments into a byte stream; the receiver unmarshals them back into values. Encryption and load balancing are separate concerns.

2. Why is blindly retrying a timed-out RPC dangerous?

A timeout doesn't tell you whether the call ran. Without idempotency, retrying can charge a card twice. Idempotency makes the retry safe.

3. Two reasons gRPC is fast for internal service calls:

Protobuf is smaller and quicker to parse than JSON text, and HTTP/2 lets many calls share one connection with streaming. gRPC is explicitly multi-language.

✍️ Drill: pick a protocol for two call sites

(a) A public mobile app fetches a user's profile. (b) Your order service calls your inventory service 50× per checkout. Which style for each, and why? Decide first.

Model answer: (a) REST/JSON over HTTPS — public, diverse clients, cacheable, human-debuggable; ubiquity matters more than microseconds. (b) gRPC — internal, you own both ends, high call volume, want compact binary payloads and low latency; streaming helps if inventory checks batch. Bonus: make the inventory deduction idempotent so a retried call can't double-decrement stock.

Rubric: ✓ REST at the public edge ✓ gRPC internally ✓ justifies by control/volume/latency ✓ remembers idempotency on the retried internal call.

Key takeaways

RPC makes a remote call look local via auto-generated stubs that marshal/unmarshal arguments.
The illusion leaks: remote calls are slow, can time out, can fail mid-flight, and need an agreed wire format.
A timeout ≠ failure — design for idempotency so retries are safe.
gRPC = protobuf (compact binary) + HTTP/2 (multiplexed streaming); great inside a system, REST for the public edge.