Reliability & Scale · Lesson 04
API Gateway Deep Dive
Microservices multiply the surfaces clients must talk to. An API gateway collapses them into a single front door — and that front door does far more than just route traffic. Understanding what lives in the gateway (and what doesn't) is one of the clearest ways to distinguish junior from senior API thinking.
By the end you'll be able to
- Name and explain every core responsibility of an API gateway and what would happen if each was removed.
- Precisely distinguish a gateway from a load balancer, a reverse proxy, and a service mesh — including where they overlap.
- Describe the Backend-for-Frontend (BFF) pattern and the gateway failure modes that must be designed against in production.
The single front door
Imagine a large hospital. Hundreds of departments — radiology, oncology, emergency, pharmacy — each with their own internal procedures, staff, and locations. The hospital's main reception is where all visitors arrive first. The receptionist checks your ID, confirms your appointment, routes you to the right department, translates your request ("I need an MRI" → "proceed to Building C, 3rd floor, room 312"), and keeps a log of everyone who came in. If reception closed, you'd have to know the internal layout, carry credentials for each department separately, and the hospital would have no central audit trail.
An API gateway is that reception desk for your services. Clients talk to one address. Everything behind it — identity verification, routing, quota enforcement, protocol translation, logging — lives in the gateway so it doesn't have to be duplicated in every service.
The formal definition: an API gateway is a single-entry-point reverse proxy that sits in front of a collection of backend services and implements shared cross-cutting concerns on behalf of those services.
Gateway responsibilities, one by one
1. Routing and dispatch
The gateway inspects the incoming request — the URL path, HTTP method, host header, or custom headers — and decides which backend service should handle it. A rule like "any request to /v1/orders/* goes to the orders service; /v1/users/* goes to the users service" is the fundamental routing table. More sophisticated gateways support traffic splitting (send 5% of /v1/search traffic to a new service version for canary deployments) and request mirroring (duplicate traffic to a shadow environment).
# Conceptual gateway routing table (Kong / NGINX style)
route users_service:
  match:
    path: "/v1/users"
    method: ["GET", "POST", "PATCH"]
  upstream: "http://users-service:8080"
  strip_path: false
route orders_service:
  match:
    path: "/v1/orders"
    method: ["GET", "POST"]
  upstream: "http://orders-service:8080"
route canary_search:
  match:
    path: "/v1/search"
  upstream:
    - target: "http://search-v1:8080"
      weight: 95
    - target: "http://search-v2:8080"
      weight: 5   # canary
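To make the dispatch logic concrete, here is a minimal Python sketch of prefix matching plus weighted canary selection. It is an illustration under assumptions, not how Kong or NGINX implement routing: the `Route` class, `match_route`, and `pick_upstream` are hypothetical names, and restricting the search route to GET is an assumption (the conceptual table above lists no method for it).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str         # path prefix to match
    methods: frozenset  # allowed HTTP methods
    upstreams: tuple    # ((target_url, weight), ...)


# Hypothetical in-memory copy of the conceptual routing table.
ROUTES = [
    Route("/v1/users", frozenset({"GET", "POST", "PATCH"}),
          (("http://users-service:8080", 100),)),
    Route("/v1/orders", frozenset({"GET", "POST"}),
          (("http://orders-service:8080", 100),)),
    Route("/v1/search", frozenset({"GET"}),  # GET-only is an assumption
          (("http://search-v1:8080", 95), ("http://search-v2:8080", 5))),
]


def match_route(method, path):
    """Return the longest matching route for this method and path, or None."""
    candidates = [
        r for r in ROUTES
        if method in r.methods
        and (path == r.prefix or path.startswith(r.prefix + "/"))
    ]
    return max(candidates, key=lambda r: len(r.prefix), default=None)


def pick_upstream(route, rng=random):
    """Choose one upstream target, weighted for canary traffic splitting."""
    targets = [target for target, _ in route.upstreams]
    weights = [weight for _, weight in route.upstreams]
    return rng.choices(targets, weights=weights, k=1)[0]
```

A request for `/v1/orders/99` resolves to the orders route, while `/v1/usersx` matches nothing because the prefix must end at a path-segment boundary. A `None` result is what the gateway would turn into its own 404.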
2. Authentication and authorization offload
Instead of each service re-implementing "is this JWT valid? what permissions does this token grant?", the gateway verifies credentials once on every inbound request and either rejects the request early (401 Unauthorized, 403 Forbidden) or stamps the request with a verified identity header for downstream services to trust.
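A common pattern: the gateway validates the Bearer JWT against a public key or introspects the token with an auth service, extracts the user ID and scopes, then adds trusted internal headers like X-User-Id: 42 and X-User-Scopes: read:orders write:orders. Downstream services read these headers without re-validating the token; they trust the gateway. That trust is only sound if the gateway strips or overwrites any client-supplied copies of those headers and the services cannot be reached except through the gateway. Otherwise a caller could forge an identity simply by sending X-User-Id directly.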
# Gateway auth middleware (pseudo-code)
function auth_middleware(request):
    # Drop client-supplied identity headers so they cannot be spoofed
    request.headers.remove('X-User-Id')
    request.headers.remove('X-User-Scopes')

    token = extract_bearer(request.headers['Authorization'])
    if not token:
        return Response(401, { "error": "missing token" })

    # Verify signature and expiry against keys fetched from the JWKS endpoint
    claims = verify_jwt(token, jwks_uri=JWKS_URI)
    if not claims:
        return Response(401, { "error": "invalid token" })

    # Look up the scope by matched route, not raw path, so /v1/orders/99 resolves
    required_scope = ROUTE_SCOPE_MAP[request.matched_route]
    if required_scope not in claims.scopes:
        return Response(403, { "error": "insufficient scope" })

    # Stamp trusted identity headers for downstream
    request.headers['X-User-Id'] = claims.sub
    request.headers['X-User-Scopes'] = join(claims.scopes, " ")
    request.headers.remove('Authorization')   # strip raw token
    return FORWARD(request)
3. Rate limiting
Covered in depth in the previous lesson, but the gateway is the canonical enforcement point: it sits before every backend service, can enforce per-key quotas using a shared Redis store, and can return HTTP 429 with Retry-After headers without any backend service being involved.
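As a rough illustration of gateway-side enforcement, the sketch below implements a fixed-window counter per API key and produces the 429 plus Retry-After response without calling any backend. It rests on assumptions: a plain dict stands in for the shared Redis store, `FixedWindowLimiter` is a hypothetical name, and the algorithm covered in the previous lesson may well be a different one (token bucket, sliding window).

```python
import math
from dataclasses import dataclass, field


@dataclass
class FixedWindowLimiter:
    limit: int            # allowed requests per window
    window_seconds: int   # window length in seconds
    counters: dict = field(default_factory=dict)  # stand-in for Redis

    def check(self, api_key, now):
        """Return (status, headers) for one request arriving at time `now`."""
        window = int(now // self.window_seconds)
        key = (api_key, window)
        count = self.counters.get(key, 0) + 1
        self.counters[key] = count
        if count > self.limit:
            # Seconds until the next window opens, rounded up.
            window_end = (window + 1) * self.window_seconds
            retry_after = max(math.ceil(window_end - now), 1)
            return 429, {"Retry-After": str(retry_after)}
        return 200, {"X-RateLimit-Remaining": str(self.limit - count)}
```

With a real Redis store the counter key would carry an expiry equal to the window length, so old windows disappear on their own; the dict here simply grows.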
4. TLS termination
Clients connect to the gateway over HTTPS (TLS). The gateway terminates the TLS session (decrypts the traffic), then forwards the request to backend services over plain HTTP on the private internal network. This means backend services don't need to manage TLS certificates, and the gateway can apply TLS policies (minimum TLS version, cipher suite enforcement) in one place. For high-security environments where even internal traffic must be encrypted, the gateway can re-encrypt on the internal leg (often called TLS re-encryption or TLS bridging). This is different from TLS passthrough, where the gateway forwards the encrypted stream untouched and therefore cannot inspect or transform the request.
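The "one place for TLS policy" idea can be sketched with Python's standard `ssl` module. This is only an illustration of setting a minimum version and cipher policy on a server-side context; `build_gateway_tls_context` is a hypothetical helper, the cipher string is an example choice rather than a recommendation, and a real gateway would also load its certificate and key.

```python
import ssl


def build_gateway_tls_context(min_version=ssl.TLSVersion.TLSv1_2):
    """Create a server-side TLS context that enforces a minimum version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = min_version
    # Restrict TLS 1.2 cipher suites to ECDHE key exchange with AES-GCM.
    ctx.set_ciphers("ECDHE+AESGCM")
    # A real deployment would call ctx.load_cert_chain(certfile, keyfile)
    # here with the gateway's own certificate.
    return ctx
```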
5. Request and response transformation
The gateway can modify requests and responses in flight: add or remove headers, rewrite paths, reshape JSON bodies, convert query parameters to headers. This is used for API versioning (the gateway rewrites /v2/users to /v1/users?v=2 before the backend sees it), for hiding internal implementation details from the public API surface, and for injecting standard fields (request IDs, correlation headers).
# Request transformation examples

# 1. Inject a correlation ID on every inbound request
request.headers['X-Request-Id'] = generate_uuid()

# Remember the public path before any rewrite changes it
public_path = request.path

# 2. Path rewrite: hide internal paths from the public API
#    Public:   GET /v1/products/123
#    Internal: GET /api/catalog/product?id=123
if request.path.startswith('/v1/products/'):
    product_id = request.path.split('/')[-1]
    request.path = "/api/catalog/product"
    request.query['id'] = product_id

# 3. Response transformation: add deprecation warning
#    (checked against the public path, since step 2 may have rewritten request.path)
if public_path.startswith('/v1/'):
    response.headers['Deprecation'] = "true"
    response.headers['Sunset'] = "Thu, 31 Dec 2026 23:59:59 GMT"   # Sunset takes an HTTP-date (RFC 8594)
6. Aggregation and composition
A single client request may need data from multiple backend services. Instead of forcing the client to make three separate calls (and incur three round-trips over a mobile connection), the gateway can fan-out the three requests in parallel, merge the responses, and return one combined payload. This is sometimes called the "API composition" or "aggregator" pattern and is closely related to the Backend-for-Frontend pattern described below.
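A minimal sketch of the fan-out-and-merge step, using `asyncio.gather` to run the three calls concurrently. The three `fetch_*` coroutines are hypothetical stand-ins that sleep instead of making real HTTP calls, and the payload shapes are invented for illustration.

```python
import asyncio


async def fetch_user(user_id):
    await asyncio.sleep(0.01)  # stands in for a call to the users service
    return {"id": user_id, "name": "Ada"}


async def fetch_orders(user_id):
    await asyncio.sleep(0.01)  # stands in for a call to the orders service
    return [{"order_id": 99, "user_id": user_id}]


async def fetch_recommendations(user_id):
    await asyncio.sleep(0.01)  # stands in for a third backend service
    return ["sku-1", "sku-2"]


async def get_dashboard(user_id):
    """Fan out to three services in parallel and merge into one payload."""
    user, orders, recommendations = await asyncio.gather(
        fetch_user(user_id),
        fetch_orders(user_id),
        fetch_recommendations(user_id),
    )
    return {"user": user, "orders": orders, "recommendations": recommendations}
```

Because the calls overlap, the client-visible latency is roughly that of the slowest backend rather than the sum of all three. A production aggregator would also need per-call timeouts and a policy for partial failure, which this sketch omits.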
7. Response caching
The gateway can cache upstream responses for cacheable resources (GET requests with appropriate Cache-Control headers). Subsequent identical requests are served from the gateway's cache without hitting the backend, reducing latency and backend load. This is safe only for idempotent, read-only operations and requires careful cache key design (include auth headers if responses are user-specific).
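Cache key design is where most gateway caching bugs originate, so here is a small sketch of one possible key function. It assumes the gateway has already authenticated the caller and knows a `user_id`; the function name and the key layout are hypothetical.

```python
import hashlib


def cache_key(method, path, query, user_id=None):
    """Build a cache key, or return None when the request must not be cached."""
    if method != "GET":
        return None  # only cache safe, read-only requests
    # Sort query parameters so ?a=1&b=2 and ?b=2&a=1 share one entry.
    canonical_query = "&".join(f"{k}={v}" for k, v in sorted(query.items()))
    # Include the caller identity so user-specific responses never leak.
    identity = user_id if user_id is not None else "public"
    raw = f"{method}|{path}|{canonical_query}|{identity}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Leaving the identity out of the key for a user-specific endpoint would serve one user's response to another, which is the failure this design guards against.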
8. Protocol translation
Clients might speak REST over HTTP/1.1 while backend services use gRPC (HTTP/2 + Protocol Buffers) or WebSockets. The gateway translates between protocols. For example, AWS API Gateway can expose a REST endpoint that internally invokes a Lambda function — the gateway handles the HTTP→Lambda invocation translation completely transparently to the client.
9. Observability: logging, tracing, and metrics
Because every request passes through the gateway, it's the ideal place to emit a single structured log line per request, start a distributed trace span, and increment request-count, latency, and error-rate metrics — for every service simultaneously. Services don't need their own logging middleware for the standard fields (path, method, status, latency, client ID).
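As an illustration of "one structured log line per request", the sketch below builds a JSON access-log entry with the standard fields listed above. The field names and the `access_log_line` helper are assumptions; real gateways each have their own log schema.

```python
import json
import uuid


def access_log_line(method, path, status, started, finished, client_id,
                    request_id=None):
    """Build one structured (JSON) access-log line for a proxied request."""
    entry = {
        "request_id": request_id or str(uuid.uuid4()),
        "method": method,
        "path": path,
        "status": status,
        "latency_ms": round((finished - started) * 1000, 3),
        "client_id": client_id,
    }
    return json.dumps(entry, sort_keys=True)
```

Emitting JSON rather than free text means the same line can feed log search, latency metrics, and error-rate dashboards without per-service parsing rules.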
The Backend-for-Frontend (BFF) pattern
A general-purpose API gateway serves all clients — web, mobile, partner integrations. But different clients have very different needs: a mobile app wants compact responses with minimal fields to save bandwidth; a desktop web app wants richer payloads with nested objects; a third-party partner wants a different versioning and auth model. Serving all of them from one generic API means every response is a compromise.
The Backend-for-Frontend (BFF) pattern solves this by creating one gateway per major client type. Each BFF is a thin aggregation and translation layer tailored to exactly one consumer. The "backend services" (users, orders, inventory) remain generic and unchanged; the BFF composes, filters, and reshapes their responses for its specific client.
A BFF is typically owned by the frontend team it serves: the same team that owns the mobile app owns the mobile BFF, which means they can change the API contract without coordinating with every other team. The pattern introduces a maintenance burden (N gateway codebases) but unlocks independent evolution per client.
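The reshaping a BFF performs can be sketched in a few lines. The example assumes a generic order payload from an unchanged backend service; the field names and the `shape_for_mobile` / `shape_for_web` helpers are hypothetical.

```python
# Fields the (hypothetical) mobile client actually renders.
MOBILE_FIELDS = ("id", "status", "total")


def shape_for_mobile(order):
    """Mobile BFF: keep only the compact set of fields the app uses."""
    return {key: order[key] for key in MOBILE_FIELDS if key in order}


def shape_for_web(order):
    """Web BFF: pass the full payload through unchanged."""
    return dict(order)
```

Both functions consume the same upstream response; only the BFF layer differs, so the backend service never learns which client is asking.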
Gateway vs. load balancer vs. reverse proxy vs. service mesh
These four components often appear together in architecture diagrams and are frequently confused in interviews. They overlap in capability, but each has a distinct primary job and reason for existence.
Comparison table
| Component | Primary job | Operates at | Knows about | Real examples | Does NOT typically do |
|---|---|---|---|---|---|
| Reverse proxy | TLS termination, DDoS mitigation, static caching | L7 HTTP (north-south) | HTTP requests, hostnames, URLs | NGINX, Caddy, Cloudflare | Business auth logic, API quotas per user |
| API Gateway | Single entry point: auth, rate limiting, routing, transform | L7 HTTP/gRPC (north-south) | API keys, JWT claims, routes, quotas, versions | AWS API Gateway, Kong, Apigee, Traefik | Health-based instance selection, TCP load distribution |
| Load balancer | Distribute connections across healthy instances | L4 TCP or L7 HTTP | Server health, connection counts, response times | AWS ALB/NLB, HAProxy, GCP Cloud LB | Auth, quotas, response transformation |
| Service mesh | Secure, observable east-west traffic between microservices | L4/L7 (east-west) | Service identity (mTLS), circuit breakers, retries, traces | Envoy, Istio, Linkerd, Consul Connect | Client-facing auth, API versioning, external routing |
A Layer 7 load balancer (like AWS ALB) can inspect HTTP and do path-based routing, which looks like a gateway. But a load balancer's job is instance selection: among the healthy instances of a service, which one gets this connection? It typically has no concept of API keys, per-consumer quotas, or response transformation, and offers at most basic built-in authentication. An API gateway's job is cross-cutting policy: is this caller allowed? how many calls have they made? what format does the response need to be in? They often appear stacked: a load balancer in front for availability and instance distribution, the gateway behind it for API policy.
Reverse proxy = "protect my server from the raw internet." Load balancer = "spread connections across healthy instances." API gateway = "enforce policy for all my APIs in one place." Service mesh = "secure and observe how my services talk to each other." These four jobs rarely compete; they stack.
Real gateway products
| Product | Type | Key characteristic | Best for |
|---|---|---|---|
| AWS API Gateway | Managed cloud | Native integration with Lambda, IAM, Cognito; pay-per-call pricing | Serverless APIs on AWS |
| Kong | Open source + enterprise | Plugin architecture (auth, rate limit, transform as first-class plugins); runs on Kubernetes | On-premise or multi-cloud; teams that need custom plugins |
| NGINX | Reverse proxy / gateway | High performance; Lua/njs scripting for custom logic; battle-tested at high load | High-throughput deployments; replacing Apache |
| Envoy | Proxy / service mesh data plane | Dynamic xDS config, first-class observability, HTTP/2 and gRPC native | Service mesh data plane (Istio); edge proxy for gRPC-heavy stacks |
| Traefik | Cloud-native reverse proxy | Auto-discovers routes from Kubernetes Ingress/CRDs; Let's Encrypt ACME built in | Kubernetes-native teams who want zero-config routing |
Gateway failure modes
Single point of failure
Because the gateway is the single front door, if it goes down, every service it fronts becomes unreachable — even services that are perfectly healthy. This is the fundamental availability tax of the pattern. The mitigation: run the gateway in a highly available (HA) cluster. Most managed gateways (AWS API Gateway, Cloudflare) handle this for you. Self-hosted gateways (Kong, NGINX) require you to run multiple instances behind a load balancer with health checks, automatic instance replacement, and a rolling upgrade strategy.
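# Minimal HA gateway topology (Kubernetes example)
# Kong running as a Deployment with ≥2 replicas
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kong-gateway
spec:
  replicas: 3                # min 2 for HA; 3 for maintenance safety
  selector:
    matchLabels:
      app: kong-gateway      # label name is illustrative
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0      # zero-downtime rolling upgrade
      maxSurge: 1            # the surge pod needs a spare node (see anti-affinity)
  template:
    metadata:
      labels:
        app: kong-gateway
    spec:
      affinity:
        podAntiAffinity:     # spread across nodes; don't co-locate replicas
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: kong-gateway
              topologyKey: kubernetes.io/hostname
      # containers: (Kong image, ports, probes) omitted for brevity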
Added latency
Every request through the gateway adds one extra network hop: client → gateway → service instead of client → service directly. Inside a data center, that extra hop typically costs 0.2–1 ms, which is negligible. For edge deployments where the gateway is geographically distributed, the added latency can be near zero if the gateway is co-located with the point where client traffic already enters the network. The concern is when the gateway itself is slow: a sluggish auth plugin, an overloaded rate-limit Redis, or an expensive transformation can add 10–50 ms to every request. Monitor gateway-added latency (total latency minus upstream latency) as a dedicated metric and alert on it.
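The gateway-added latency metric is simple arithmetic, sketched below with a nearest-rank percentile so the alerting threshold can be evaluated on a batch of samples. The function names, the sample data, and the use of nearest-rank (rather than whatever your metrics system computes) are all assumptions.

```python
def gateway_added_ms(total_ms, upstream_ms):
    """Latency attributable to the gateway itself, never negative."""
    return max(total_ms - upstream_ms, 0.0)


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def should_alert(samples, pct, budget_ms):
    """samples: iterable of (total_ms, upstream_ms) pairs from access logs."""
    added = [gateway_added_ms(total, upstream) for total, upstream in samples]
    return percentile(added, pct) > budget_ms
```

Alerting on a high percentile rather than the mean matters here: a slow auth plugin that only affects a few percent of requests would be invisible in an average.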
Configuration drift
A gateway's power comes from its routing table and policy configuration. As services evolve, old routes that point to deprecated or renamed services can linger in the gateway config, causing mysterious 404s or routing to the wrong service. Treat gateway configuration as code: version it in git, review changes, and run integration tests against the routing table on every deployment.
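One way to catch drift in CI is a check that every route's upstream still names a live service. The sketch below assumes the routing table and the list of deployed services are both available as plain Python data; `find_stale_routes` and the example names are hypothetical.

```python
def find_stale_routes(routes, live_services):
    """Return route names whose upstream is not a currently deployed service.

    routes: mapping of route name -> upstream service name
    live_services: set of service names known to be deployed
    """
    return sorted(
        name for name, upstream in routes.items()
        if upstream not in live_services
    )


# Example data: the catalog service was renamed, but its route was not updated.
EXAMPLE_ROUTES = {
    "orders-route": "orders-service",
    "users-route": "users-service",
    "products-route": "catalog-service-old",
}
EXAMPLE_LIVE = {"orders-service", "users-service", "catalog-service"}
```

Failing the deployment when this list is non-empty turns a mysterious production 404 into a review-time error.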
"What does a gateway do?" and "gateway vs. load balancer" are two of the most common questions in system design and infrastructure interviews. For the first, don't just say "routing." Walk through all 9 responsibilities and explain the unifying theme: the gateway centralizes cross-cutting concerns so services don't each have to implement them. For the second, the mental model is everything: the load balancer selects which instance handles a request; the gateway decides whether and how the request is permitted and transformed before it ever reaches an instance. They solve different problems and are typically both present, stacked.
Under the hood: one request through every gateway stage
Describing what a gateway does is easy. Understanding in what order, with what data structure, and which errors originate where is what lets you debug a production incident. Here is a single HTTPS GET /v1/orders/99 traced through every stage of a Kong-style gateway. The numbers on the left are approximate wall-clock microseconds from connection accept.
Two details that matter for debugging: (a) the gateway strips the raw Authorization token before forwarding — the upstream service never sees the original credential; it trusts only the gateway-stamped X-User-Id and X-User-Scopes headers. (b) the X-Request-Id injected at step ⑤ is the correlation handle that lets you join the gateway access log to the upstream service log to a distributed trace span — you must log it consistently in every service.
# Minimal Kong-style route + plugin config (declarative YAML)
services:
  - name: orders-svc
    url: http://orders-service:8080
routes:
  - name: orders-route
    service: orders-svc
    paths: ["/v1/orders"]
    methods: [GET, POST, PATCH]
plugins:   # listed in execution order; Kong orders plugins by built-in priority, not list position
  - name: jwt                   # ② AuthN: reject 401 if token invalid
    config:
      key_claim_name: sub
      claims_to_verify: [exp]
  - name: rate-limiting         # ③ Rate limit: reject 429 if exceeded
    config:
      second: 50
      minute: 1000
      policy: redis
  - name: request-transformer   # ⑤ Req transform
    config:
      add:
        # Illustrative placeholder; Kong's correlation-id plugin is the usual way to generate this header
        headers: ["X-Request-Id:$(uuid)"]
      remove:
        headers: [Authorization]
How to debug & inspect it
Gateway errors split cleanly into two buckets: gateway-generated (the gateway itself produced the error before touching any upstream) and upstream-proxied (the upstream returned a non-2xx and the gateway forwarded it, possibly translated). Mixing them up wastes hours. The fastest separator is the response body and the presence of upstream-specific headers.
Use the X-Request-Id (or whatever correlation header your gateway injects) to pivot from the gateway access log to the upstream log:
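A minimal sketch of that pivot, assuming both the gateway and the upstream write JSON log lines containing a `request_id` field (the field name, the helper names, and the sample log lines are all hypothetical):

```python
import json


def find_by_request_id(log_lines, request_id):
    """Return parsed JSON log entries whose request_id matches."""
    matches = []
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as stack traces
        if isinstance(entry, dict) and entry.get("request_id") == request_id:
            matches.append(entry)
    return matches


def upstream_was_reached(gateway_log, upstream_log, request_id):
    """True if the request appears in both logs, False if gateway-only."""
    seen_by_gateway = find_by_request_id(gateway_log, request_id)
    seen_by_upstream = find_by_request_id(upstream_log, request_id)
    return bool(seen_by_gateway) and bool(seen_by_upstream)


GATEWAY_LOG = [
    '{"request_id": "req-1", "status": 401, "path": "/v1/orders/99"}',
    '{"request_id": "req-2", "status": 500, "path": "/v1/orders/7"}',
]
UPSTREAM_LOG = [
    "Traceback (most recent call last): ...",
    '{"request_id": "req-2", "status": 500, "error": "db timeout"}',
]
```

In this sample, `req-1` never reached the upstream (a gateway-side 401), while `req-2` did, so its root cause is in the upstream log.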
Distinguishing a gateway-generated 502/504 from an upstream error:
| Symptom | Cause | Fix |
|---|---|---|
| 502 with gateway-branded body, no upstream log entry | Gateway could not connect to the upstream at all (DNS failure, upstream down, port wrong) | Check the upstream host/port in the gateway route config; verify the upstream service is running; check network policy |
| 502 with gateway-branded body, upstream log shows a crash or 5xx | Upstream crashed or returned malformed HTTP (e.g. missing status line) | Fix the upstream application bug; the gateway is just reporting faithfully |
| 504 with gateway-branded body | Upstream did not respond within the gateway's read timeout | Profile the upstream endpoint; increase the gateway timeout if justified; add caching in front of the slow query |
| 401 with gateway-branded body, no upstream log | Token failed JWT validation in the gateway; the upstream was never called | Decode the token (jwt decode $TOKEN), check exp, check the algorithm, check the JWKS endpoint the gateway uses |
| 429 with gateway-branded body, no upstream log | Rate-limit quota exhausted in gateway (Redis counter hit ceiling) | Inspect X-RateLimit-Remaining and Retry-After headers; check the rate-limit plugin config; tune quotas or add burst allowance |
| 404 with gateway-branded body (not upstream 404) | No route matched in the gateway routing table | List the configured routes (for Kong, GET /routes on the Admin API, or the equivalent for your gateway); verify the path prefix and method match exactly; check for trailing-slash mismatches |
Debug checklist for gateway incidents:
- Capture the X-Request-Id (or traceparent) from the failing response; this is your pivot key.
- Check the gateway access log for that ID: note upstream latency vs. total latency. If upstream latency is absent or zero, the error was gateway-side (auth, rate limit, no route).
- If the upstream was called, search the upstream service log for the same request ID and read the upstream error.
- For 502/504: test direct connectivity from the gateway pod to the upstream with curl or nc; this rules out DNS/network issues independent of the gateway config.
- For auth 401: decode the JWT header and payload one segment at a time (for example, echo $token | cut -d. -f2 | base64 -d; segments are base64url without padding, so a JWT-aware decoder is more reliable); check that exp, alg, and issuer match what the gateway is configured to accept.
- For 429: confirm which rate-limit counter is exhausted (per-user vs. per-IP vs. global) and whether the Retry-After header is being honored.
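For the JWT step in the checklist, a small Python helper avoids the base64url padding pitfalls of shell one-liners. This sketch only decodes; it deliberately does not verify the signature, so it is for inspection during debugging and never for authorization decisions. The function name is hypothetical.

```python
import base64
import json


def decode_jwt_unverified(token):
    """Decode a JWT's header and payload WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature = token.split(".")

    def decode_segment(segment):
        # JWT segments are base64url with padding stripped; restore it.
        padded = segment + "=" * (-len(segment) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))

    return decode_segment(header_b64), decode_segment(payload_b64)
```

Compare the decoded `alg`, `exp`, and `iss` values against what the gateway is configured to accept; a mismatch in any of them explains a gateway-side 401.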
In production: how leading APIs do it
The gateway landscape splits into two camps: managed cloud gateways that handle availability, scaling, and certificate management for you, and self-hosted gateways that give you full control at the cost of operational burden. Every major architecture at scale has converged on centralising the same cross-cutting concerns — the differences are in deployment model and extension mechanism.
| System | Type | What it handles |
|---|---|---|
| AWS API Gateway | Managed cloud | Routing; token-bucket throttling with configurable burst and rate; usage plans + API keys for per-consumer quotas; authorizers (Lambda custom, Cognito user pools, or native JWT); request/response mapping templates (Velocity); stages + canary deployments (traffic split by percentage); response caching with configurable TTL; WAF integration for IP-based rules and managed rule groups. |
| Kong | Self-hosted (NGINX/OpenResty core) | Plugin model where every cross-cutting concern — authentication, rate limiting, request/response transformation, logging, CORS — is a first-class plugin applied per route or globally. Declarative configuration via YAML (deck); Kubernetes-native via the Kong Ingress Controller. Enterprise edition adds OIDC, RBAC, and a developer portal. |
| Envoy | L7 proxy / service-mesh data plane | Dynamic xDS API for configuration (no restart required); routing with retries, circuit breaking, and outlier detection; first-class HTTP/2 and gRPC support; rich observability via stats and tracing sinks. Used as the data-plane sidecar in Istio and as a standalone edge proxy. Does not come with a management UI — typically configured by a control plane. |
| Netflix Zuul / Spring Cloud Gateway | JVM edge routing + filter chain | Netflix open-sourced Zuul as a filter-chain edge router that handled auth, dynamic routing, and resilience for hundreds of services. Spring Cloud Gateway is the Spring-ecosystem successor, using a predicate/filter model. Both illustrate the pattern of edge routing + filters at JVM scale, and Netflix's tech blog documents the architectural decisions in detail. |
| Cloudflare / Apigee | Managed edge | Cloudflare Workers and API Shield operate at the CDN edge: DDoS mitigation, rate limiting, bot management, and JWT validation happen before traffic reaches your origin. Apigee (Google Cloud) adds a full developer portal, analytics, and monetization layer targeted at enterprise API programs. |
The common thread. Every system in this table — regardless of vendor, deployment model, or underlying technology — implements the same architectural insight: a gateway centralises cross-cutting concerns so individual services do not have to reimplement them. Authentication, rate limiting, TLS termination, and request routing appear in every gateway because these concerns affect every API call and have no business living in individual services. When a payment service also validates JWTs and enforces quotas, you have N copies of that logic to keep in sync, N places for a security misconfiguration, and N deployment targets every time a policy changes. Moving those concerns to the gateway reduces them to one. The managed vs. self-hosted distinction changes who operates the gateway — it does not change the architectural pattern.
AWS API Gateway's usage-plan documentation, Kong's plugin hub, Envoy's architecture overview, and the Netflix tech blog on Zuul all describe the same decomposition from different angles. Each is worth reading once — the vocabulary differences are superficial; the structural decisions are identical.
🧠 Quick check
1. Your company runs 12 microservices. Each service currently validates JWTs independently using the same shared library. A new security requirement mandates key rotation every 24 hours. Which approach best solves this?
Auth offload to the gateway means the key rotation logic lives in exactly one place. Services don't need to be redeployed; only the gateway configuration changes. This is precisely the "centralize cross-cutting concerns" benefit of the gateway pattern.
2. An AWS Application Load Balancer (ALB) can do path-based routing — so it can route /v1/users to the users service and /v1/orders to the orders service. Does that make it an API gateway?
The ALB's primary job is distributing connections across healthy instances. Path-based routing is a convenience feature for instance selection, not a policy enforcement mechanism. An API gateway owns auth, quotas, transformation, and rate limiting. ALB covers only a narrow slice of that natively (for example, OIDC or Cognito login on a listener rule) and has nothing for per-consumer quotas or payload transformation.
3. You run a self-hosted Kong gateway as a single instance. What is the first reliability concern to address?
A single gateway instance means the entire API surface depends on one process. A crash, OOM, or bad deployment takes down every service simultaneously — even perfectly healthy ones. Run at minimum 2 instances behind a load balancer with health checks. The latency added by a gateway is typically sub-millisecond on the internal network.
4. A mobile team complains that the public REST API returns too much data (they only use 3 of 40 fields) and forces them to make 4 separate calls to render one screen. Which pattern directly addresses this?
The BFF pattern creates a gateway layer tailored to the mobile client. The mobile BFF makes 4 parallel calls to the underlying services, merges them, strips the 37 unused fields, and returns a single compact response. The mobile team owns the BFF and can evolve it independently without touching any backend service.
🏗️ Exercise 1 — Design a gateway architecture for a multi-platform product
Questions to answer:
- Should you use a single shared gateway or the BFF pattern? Justify your choice.
- List 5 responsibilities the gateway should own that are currently duplicated across the 8 services.
- What is the single biggest risk introduced by adding a gateway layer, and how do you mitigate it?
- The mobile app needs responses with ≤5 fields; the web dashboard needs the same endpoint to return 30+ fields. How does your gateway design handle this?
Model answer:
- BFF pattern. Three distinct client types with fundamentally different auth models (partner REST uses API keys + OAuth; mobile uses device tokens; web uses session cookies), different payload requirements, and owned by different teams — BFF is the right call. A single shared gateway would become a compromise layer that serves no client well and creates cross-team coordination overhead.
- JWT/API key validation; rate limiting; request/response logging; TLS termination; correlation ID injection. These five are currently copied across all 8 services.
- Single point of failure. Mitigation: run each BFF as a multi-instance deployment (≥2 replicas) behind a load balancer, with health checks and automated restart. Use a managed gateway where possible to offload availability guarantees.
- The mobile BFF fetches the full response from the relevant service and strips it to the 5 required fields before returning. The web BFF fetches the same endpoint and returns the full payload. Two BFFs, same upstream service, different response shapes — neither service changes.
Rubric: ✓ BFF vs. single gateway decision with justification ✓ At least 4 cross-cutting concerns named ✓ SPOF identified + HA mitigation ✓ Response shaping per client explained at the BFF layer. Hitting all 4 = strong answer.
🔍 Exercise 2 — Gateway vs. load balancer vs. service mesh distinction
Questions:
- Identify at least two redundancies or misassignments of responsibility in this stack.
- Redraw (in words) a leaner version that eliminates the redundancy.
- What does Istio provide that the other layers cannot?
Model answer:
- Redundancy 1: Both Envoy and NGINX can do TLS termination; having both in sequence means TLS is terminated and re-established unnecessarily, adding latency and complexity. Redundancy 2: Both Envoy and ALB can do L7 path-based routing; running both in sequence doubles the routing config surface and adds another hop. NGINX as a dedicated reverse proxy between Envoy and ALB adds a third hop with no added value.
- Leaner stack: Cloudflare/CDN (DDoS mitigation, anycast) → Envoy as API gateway (TLS termination, auth, rate limiting, path routing) → backend services directly (Envoy routes to service instances; Envoy supports health-based upstream selection). Inside the cluster, Istio sidecars handle east-west mTLS. This removes NGINX and ALB entirely, leaving 2 instead of 4 network hops for north-south traffic.
- Istio (running Envoy sidecars as a service mesh) handles east-west mTLS — encrypted, authenticated communication between services inside the cluster. None of the other layers (CDN, API gateway, load balancer) operate in the east-west path. Istio also provides circuit breaking, retry policies, distributed tracing, and traffic shifting for service-to-service calls without modifying service code.
Rubric: ✓ Both redundancies identified ✓ Leaner architecture removes at least one hop ✓ Envoy/Istio east-west vs. north-south distinction correct. Hitting all 3 = strong answer.
Key takeaways
- An API gateway is a single front door that implements shared cross-cutting concerns (auth, rate limiting, routing, TLS, logging, transformation) so individual services don't have to.
- The 9 core responsibilities are: routing, auth offload, rate limiting, TLS termination, request/response transformation, aggregation, response caching, protocol translation, and observability.
- The Backend-for-Frontend (BFF) pattern creates one gateway per major client type, allowing each to evolve independently with a payload shape tailored to its consumer.
- A load balancer selects which instance handles a connection; an API gateway enforces whether and how the request is processed. They solve different problems and typically stack.
- A service mesh handles east-west (service-to-service) traffic inside the cluster; an API gateway handles north-south (client-to-service) traffic from outside.
- The gateway's biggest risk is being a single point of failure. Mitigate with HA deployment (≥2 instances, rolling upgrades, health checks).
- The gateway adds a network hop — monitor gateway-added latency as a dedicated metric; alert if it exceeds your SLO budget.
- Real examples: AWS API Gateway (managed, serverless-native), Kong (plugin-extensible, K8s-native), NGINX (high-performance reverse proxy), Envoy (gRPC/HTTP2-native, service mesh data plane).
Sources & further reading
- AWS API Gateway — Developer Guide
- Kong Gateway — Official documentation
- NGINX — HTTP proxy module (reverse proxy configuration)
- Envoy Proxy — What is Envoy?
- microservices.io — API Gateway pattern (Chris Richardson)
- microservices.io — Backend-for-Frontend (BFF) pattern
- Google Cloud — gRPC, OpenAPI and REST: understanding protocol translation