API Design

Security · Lesson 02

Transport Layer Security (TLS)

Every byte your API sends travels through infrastructure you don't own: ISPs, cloud backbones, corporate proxies. TLS is the sealed, tamper-evident envelope that ensures those bytes arrive exactly as sent and are readable only by the intended recipient.

⏱ 12 min Difficulty: core Prereq: CIA triad (sec-01)

By the end you'll be able to

What problem does TLS actually solve?

Imagine you slide a note under a colleague's door, but the hallway has twenty people who can pick it up, read it, change it, or swap it entirely before it arrives. That is plain HTTP. TLS (Transport Layer Security) seals the note inside an opaque, numbered envelope. Anyone in the hallway can see a sealed envelope travelling — they cannot read it, they cannot change it without the seal breaking, and the envelope carries proof of who sealed it.

Those three guarantees map directly to the CIA triad:

TLS guarantee | CIA property | How TLS achieves it
Confidentiality | Confidentiality | Symmetric encryption (AES-GCM): the payload is unreadable without the session key
Integrity | Integrity | Message Authentication Code (MAC) on every record: tampered bytes are detected and the connection is dropped
Server authentication | Integrity + Confidentiality | The server presents a certificate signed by a trusted Certificate Authority, so the client knows it is talking to the real server, not an impostor

Note what TLS does not provide by default: it does not authenticate the client (mutual TLS, mTLS, adds that), and it says nothing about what the server does with your data once it arrives.
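As a minimal sketch of what "by default" means in client code, Python's standard ssl module builds a context that already enforces server authentication, while client authentication stays opt-in. The certificate file names in the comment are hypothetical.

```python
import ssl

# Default client-side context: loads the system trust store and enables
# both certificate-chain validation and hostname checking.
ctx = ssl.create_default_context()

server_cert_required = ctx.verify_mode == ssl.CERT_REQUIRED
hostname_checked = ctx.check_hostname

print("server certificate must validate:", server_cert_required)
print("hostname must match the certificate:", hostname_checked)

# Client authentication (mTLS) is opt-in: the client only presents a
# certificate if one is loaded explicitly. Hypothetical file names:
# ctx.load_cert_chain(certfile="client.pem", keyfile="client.key")
```

Both flags are on without any extra configuration; it takes a deliberate change to turn them off.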

The TLS handshake — a simplified view

Before any application data flows, the client and server run a handshake to agree on a shared secret and verify the server's identity. TLS 1.3 (the current standard) does this in one round-trip. Here is the essential story:

  1. ClientHello — the client announces which cipher suites and TLS versions it supports, plus a random value it generated.
  2. ServerHello + Certificate — the server picks a cipher suite, sends its own random value, and presents its certificate (its identity card).
  3. Key exchange — both sides derive the same shared session key from the two random values and an ephemeral key exchange (Diffie-Hellman). The session key never travels on the wire.
  4. Finished — both sides send a verification message encrypted with the new session key, proving the handshake was not tampered with.
  5. Application data — from this point, all payload is encrypted with the session key.
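Figure: TLS handshake sequence between Client and Server
  Client → Server: ClientHello (supported ciphers, client random)
  Server → Client: ServerHello (chosen cipher, server random)
  Server → Client: Certificate (server identity, signed by CA; cert signed by Intermediate CA → Root CA)
  Client → Server: Key exchange (client DH share)
  Server → Client: Server Finished (encrypted)
  Client → Server: Client Finished (encrypted)
  🔒 Both directions: Application data (encrypted with session key)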
TLS 1.3 completes the handshake in one round-trip. The session key is derived, never transmitted. After "Finished," all bytes are encrypted and MAC-protected.
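The claim that the session key is derived rather than transmitted can be illustrated with a toy Diffie-Hellman exchange. This is a sketch under loud assumptions: the prime, the generator, and the SHA-256 mixing step are chosen for readability and are not what TLS 1.3 uses (it relies on vetted groups such as X25519 and an HKDF-based key schedule).

```python
import hashlib
import secrets

# Toy finite-field Diffie-Hellman parameters, for illustration only.
P = 2**127 - 1   # a Mersenne prime, far too small for real use
G = 3

def keypair():
    """Return (private, public); only the public half is ever sent."""
    private = secrets.randbelow(P - 3) + 2   # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

client_priv, client_pub = keypair()
server_priv, server_pub = keypair()

# The two random values from ClientHello and ServerHello.
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)

def session_key(my_private, their_public):
    """Combine the DH shared secret with both randoms (simplified)."""
    shared = pow(their_public, my_private, P)
    material = shared.to_bytes(16, "big") + client_random + server_random
    return hashlib.sha256(material).digest()

client_key = session_key(client_priv, server_pub)
server_key = session_key(server_priv, client_pub)
print("keys match:", client_key == server_key)
```

Only the public values and the two randoms cross the wire; an eavesdropper who records them still lacks either private value and so cannot recompute the key.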

Certificates and the chain of trust

How does the client know the certificate is genuine and not forged? It checks the certificate chain:

  1. The server's certificate is digitally signed by an Intermediate CA (Certificate Authority).
  2. The Intermediate CA's own certificate is signed by a Root CA.
  3. The Root CA's certificate is pre-installed in the client's operating system or browser trust store.
  4. The client verifies each signature up the chain. If everything checks out and the root is trusted, the server's identity is confirmed.

A certificate contains the domain name it's valid for, an expiry date, and the public key the server will use. If the domain doesn't match or the cert is expired, the client refuses to continue.
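To make those fields concrete, here is a small sketch that flattens the nested dictionary Python's `SSLSocket.getpeercert()` returns into the three things named above. The sample certificate is hand-written in that format for illustration, not captured from a real server, and `summarize_cert` is a hypothetical helper name.

```python
def summarize_cert(cert: dict) -> dict:
    """Pull subject, issuer, expiry, and DNS names out of a getpeercert() dict."""
    def flatten(name):
        # A name is a tuple of RDNs; each RDN is a tuple of (key, value) pairs.
        return {key: value for rdn in name for key, value in rdn}

    return {
        "subject_cn": flatten(cert["subject"]).get("commonName"),
        "issuer_cn": flatten(cert["issuer"]).get("commonName"),
        "not_after": cert["notAfter"],
        "dns_names": [value for kind, value in cert.get("subjectAltName", ())
                      if kind == "DNS"],
    }

# Hand-written sample in getpeercert() format (illustrative values only).
sample = {
    "subject": ((("commonName", "api.example.com"),),),
    "issuer": (
        (("countryName", "US"),),
        (("organizationName", "Let's Encrypt"),),
        (("commonName", "R11"),),
    ),
    "notAfter": "Apr  1 00:00:00 2025 GMT",
    "subjectAltName": (("DNS", "api.example.com"),),
}

summary = summarize_cert(sample)
print(summary)
```

In a live client the dictionary would come from a connected TLS socket; the flattening logic stays the same.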

# Inspect a certificate with openssl
openssl s_client -connect api.example.com:443 -servername api.example.com \
  </dev/null 2>/dev/null | openssl x509 -noout -text

# Key fields to check (illustrative output):
#   Subject: CN = api.example.com
#   Issuer:  CN = R11, O = Let's Encrypt, C = US
#   Not Before: 2025-01-01
#   Not After : 2025-04-01   # ← valid for 90 days (Let's Encrypt cadence)

# A curl that skips certificate validation (never do this in prod):
curl --insecure https://api.example.com/v1/health   # ← defeats the whole point

HTTPS = HTTP inside TLS

HTTPS is not a different protocol: it is plain HTTP carried inside a TLS tunnel. Every header, URL, body, and status code is identical; TLS just encrypts the bytes before they are handed to the TCP connection. This means all the HTTP mechanics you know work unchanged; you just get the three guarantees for free.
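A rough sketch of that layering using only Python's standard library: the request is ordinary HTTP text, and the single HTTPS-specific step is wrapping the TCP socket in TLS. The helper names `build_request` and `fetch_https` are hypothetical, and `fetch_https` is defined but not called here because it needs network access.

```python
import socket
import ssl

def build_request(host: str, path: str) -> bytes:
    """Ordinary HTTP/1.1 request text; nothing in it is TLS-specific."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

def fetch_https(host: str, path: str, port: int = 443) -> bytes:
    """Send the same HTTP bytes, but through a TLS-wrapped socket."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        # The only HTTPS-specific step: wrap the TCP socket in TLS.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)

request_bytes = build_request("api.example.com", "/v1/health")
print(request_bytes.decode("ascii"))
```

Dropping the `wrap_socket` call and connecting to port 80 would send exactly the same bytes as plain HTTP.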

TLS termination at a gateway

In production, a load balancer or API gateway typically handles the TLS handshake on behalf of your service. The encrypted connection ends (terminates) at the gateway, which then forwards the request to your service over your internal network — usually plain HTTP or a separate internal TLS connection.

Scenario: Internet client → [TLS] → Gateway (terminates TLS) → [plain HTTP or mTLS] → Service A

This is intentional: it centralises certificate management, lets the gateway inspect and route requests, and offloads crypto from your application. The security question is: what happens on the internal hop?

⚠️ Common trap

Two pitfalls that will end your on-call sleep:

1. Expired certificates. A cert that expires at 3 AM on a Sunday will take your entire API offline — curl and every SDK will refuse to connect. Automate renewal (Let's Encrypt + certbot, or your cloud provider's managed certificates) and alert when a cert has less than 30 days remaining.

2. TLS termination then plaintext internally. If your internal network is shared with untrusted workloads (a multi-tenant cloud segment, a compromised container), forwarding unencrypted traffic internally gives up all three TLS guarantees on that hop: anyone on the segment can read, alter, or impersonate. Use mutual TLS (mTLS) on internal hops in zero-trust architectures, or at minimum ensure the internal network is strictly isolated.
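For the second pitfall, the server side of mTLS comes down to one setting: require a client certificate. The sketch below uses Python's ssl module; the function name and the file paths in the comments are hypothetical, and a real service would load its own key pair and internal CA before accepting connections.

```python
import ssl

def make_mtls_server_context() -> ssl.SSLContext:
    """Server-side context that refuses clients without a valid certificate."""
    # Purpose.CLIENT_AUTH builds a context for the server end of a connection.
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Without this line the server never asks the client for a certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx

ctx = make_mtls_server_context()

# Still needed before serving traffic (hypothetical paths):
# ctx.load_cert_chain("service-a.pem", "service-a.key")   # this service's identity
# ctx.load_verify_locations(cafile="internal-ca.pem")     # CA that signs client certs
print("client certificate required:", ctx.verify_mode == ssl.CERT_REQUIRED)
```

Trusting only an internal CA, rather than the public trust store, is what limits access to workloads you issued certificates to.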

🎯 Interview angle

If asked "what does HTTPS give you?" go beyond "encryption." Say: "Three things — confidentiality via symmetric encryption, integrity via MAC on every record, and server authentication via the certificate chain. It does not authenticate the client by default — for that you add mTLS or application-layer tokens." That answer shows you understand the security model, not just the buzzword.

✅ Do this, not that

Do use managed certificate services (AWS ACM, GCP-managed certs, Let's Encrypt with auto-renewal), and for any certificate you still renew by hand, set a calendar reminder 30 days before it expires. Don't use --insecure / verify=False in any code that touches production: with validation off, any machine on the path can impersonate your server, which silently removes all three TLS guarantees and turns HTTPS into plain HTTP with extra steps.
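The 30-day rule is easy to automate. A minimal sketch, assuming the expiry timestamp arrives in the `notAfter` format that Python's ssl module uses (for example from `getpeercert()`); the helper names and the threshold constant are illustrative.

```python
import ssl
import time
from typing import Optional

ALERT_THRESHOLD_DAYS = 30  # mirrors the 30-day reminder above

def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between `now` (epoch seconds) and the certificate's notAfter."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400

def needs_renewal(not_after: str, now: Optional[float] = None) -> bool:
    return days_until_expiry(not_after, now) < ALERT_THRESHOLD_DAYS

# Fixed clock values so the example is reproducible.
not_after = "Apr  1 00:00:00 2025 GMT"
mid_march = ssl.cert_time_to_seconds("Mar 15 00:00:00 2025 GMT")
new_year = ssl.cert_time_to_seconds("Jan  1 00:00:00 2025 GMT")

print(days_until_expiry(not_after, mid_march), needs_renewal(not_after, mid_march))
print(days_until_expiry(not_after, new_year), needs_renewal(not_after, new_year))
```

Run on a schedule against every public endpoint, a check like this turns a 3 AM outage into a routine ticket.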

Under the hood: how it actually works

TLS 1.3 completes the entire handshake in one round trip (1-RTT): the client sends its first message, the server replies with everything needed to derive keys and authenticate itself, and encrypted application data can flow immediately after — no extra back-and-forth. Here is every message, in order.

# ── Round trip 1: Client sends ──────────────────────────────────────────
[Client → Server] ClientHello
    TLS version:    1.3
    Cipher suites:  TLS_AES_128_GCM_SHA256
                    TLS_AES_256_GCM_SHA384
                    TLS_CHACHA20_POLY1305_SHA256
    client_random:  32 bytes of entropy
    key_share:      client ephemeral ECDH public key (curve X25519)

# ── Server replies in same round trip ───────────────────────────────────
[Server → Client] ServerHello
    Chosen suite:   TLS_AES_256_GCM_SHA384
    server_random:  32 bytes of entropy
    key_share:      server ephemeral ECDH public key (curve X25519)

# ── Both sides independently compute the shared secret ──────────────────
shared_secret = ECDH(client_private, server_public)
              = ECDH(server_private, client_public)   # commutative, same result
handshake_traffic_secret = HKDF-SHA384(shared_secret, handshake transcript so far)
                           # the transcript includes client_random and server_random
application_traffic_secret: derived in a later HKDF step (the master secret)

[Server → Client] Certificate
    Contents: domain, server public key, CA signature, notBefore, notAfter
    Client checks:
        signature chain validates up to a trusted Root CA
        Subject Alternative Name matches the SNI hostname
        notAfter > now()

[Server → Client] CertificateVerify
    Server signs the full handshake transcript with its long-term private key.
    Proves: the server holds the private key paired with the cert's public key.

[Server → Client] Finished
    HMAC over complete handshake transcript, keyed by handshake_traffic_secret.
    Client verifies: confirms the handshake was not tampered with in transit.

[Client → Server] Finished
    Same HMAC from the client side. Server verifies.

# ── Application data begins ──────────────────────────────────────────────
[Both] Derive application traffic keys from master secret.
       All subsequent data encrypted with AES-256-GCM.
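The HKDF step in the trace can be sketched with the standard library alone. This follows the generic extract-then-expand construction from RFC 5869 with SHA-384; it is not the exact TLS 1.3 key schedule, which wraps HKDF in its own labelled expansion. The all-zero shared secret and the `info` labels below are placeholders invented for the example.

```python
import hashlib
import hmac

def hkdf_extract(salt: bytes, ikm: bytes, hash_name: str = "sha384") -> bytes:
    """Condense input keying material into a fixed-size pseudorandom key."""
    if not salt:
        salt = bytes(hashlib.new(hash_name).digest_size)
    return hmac.new(salt, ikm, hash_name).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int,
                hash_name: str = "sha384") -> bytes:
    """Stretch the pseudorandom key into `length` bytes bound to `info`."""
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]

# Placeholder standing in for the ECDH output.
shared_secret = bytes(32)
prk = hkdf_extract(b"", shared_secret)

# Different labels yield independent keys from the same secret.
client_key = hkdf_expand(prk, b"example client traffic", 32)
server_key = hkdf_expand(prk, b"example server traffic", 32)
print(client_key.hex())
print(server_key.hex())
```

Because each derived key is bound to its label, one shared secret can safely feed separate keys for each direction and each phase of the connection.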

Why ephemeral ECDH matters — Forward Secrecy. Each handshake generates a brand-new key pair that is discarded the moment the session ends. Even if an attacker records all ciphertext today and later steals the server's long-term private key, they cannot decrypt past sessions — the ephemeral keys that produced those session secrets no longer exist anywhere.

Certificate chain validation in detail. Your OS and browser ship with a pre-installed set of Root CA certificates (the trust store). The server must send its leaf certificate plus any intermediate CA certificates needed to bridge from the leaf to a Root CA. The client verifies: leaf signed by intermediate, intermediate signed by a root in the trust store. It also checks the Subject Alternative Name (SAN) extension — a list of hostnames the cert is valid for — against the hostname in the request. A mismatch here is a hard failure, even if the signature chain is otherwise valid.
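The SAN comparison can be sketched as a small matcher. This is a simplified illustration, not a replacement for the checks a TLS library performs: it handles exact names and a single left-most wildcard label only, and `hostname_matches` is a hypothetical helper name.

```python
from typing import Sequence

def hostname_matches(hostname: str, san_dns_names: Sequence[str]) -> bool:
    """Return True if hostname is covered by any DNS entry in the SAN list."""
    host_labels = hostname.lower().rstrip(".").split(".")
    for pattern in san_dns_names:
        pattern_labels = pattern.lower().rstrip(".").split(".")
        if len(pattern_labels) != len(host_labels):
            continue  # a wildcard never spans more than one label
        if pattern_labels[0] == "*" and len(pattern_labels) > 2:
            # Wildcard stands in for exactly one left-most label.
            if pattern_labels[1:] == host_labels[1:]:
                return True
        elif pattern_labels == host_labels:
            return True
    return False

print(hostname_matches("api.example.com", ["*.example.com"]))
print(hostname_matches("a.b.example.com", ["*.example.com"]))
print(hostname_matches("example.com", ["*.example.com"]))
```

The second and third calls show two common surprises: a wildcard does not cover nested subdomains, and it does not cover the bare domain either.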

How to debug & inspect it

When TLS misbehaves it usually manifests as one of a handful of error codes. These commands let you inspect the raw TLS handshake and certificate without writing any code.

# Connect and dump the full TLS handshake details
$ openssl s_client -connect api.example.com:443 -servername api.example.com -tls1_3
# Look for these in the output:
#   SSL-Session: Protocol: TLSv1.3
#   Cipher: TLS_AES_256_GCM_SHA384
#   Server certificate: ... Verify return code: 0 (ok)

# View the certificate: subject, issuer, dates, SAN
$ openssl s_client -connect api.example.com:443 -servername api.example.com </dev/null 2>/dev/null \
    | openssl x509 -noout -text | grep -A3 "Subject\|Issuer\|Not Before\|Not After\|DNS:"

# Print the expiry date (compare it with today to get the days remaining)
$ openssl s_client -connect api.example.com:443 </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate

# Verify the full chain (prints the subject and issuer of each cert the server sends)
$ openssl s_client -connect api.example.com:443 -showcerts </dev/null 2>/dev/null \
    | grep -E "^ *[0-9]+ s:|^ +i:"
Symptom | Cause | Fix
SSL_ERROR_RX_RECORD_TOO_LONG / connection refused on port 443 | Server is running plain HTTP on port 443 (TLS not configured); an outright refused connection means nothing is listening on the port at all | Configure TLS on the server; check that the port is listening for TLS, not HTTP
"certificate has expired" (browser) / verify error:num=10:certificate has expired (openssl) | The leaf cert's notAfter date has passed | Renew cert; set up automated renewal (certbot, ACM); alert 30 days before expiry
"hostname mismatch" / verify error:num=62 | The cert's CN/SAN does not include the hostname being called | Reissue the cert with the correct SANs; check if you're calling the wrong hostname (alias, internal vs external)
"unable to get local issuer certificate" / verify error:num=20 | Intermediate CA cert missing from the server's cert chain, so the client can't build a chain to the root | Configure the server to send the full chain (leaf + all intermediates); test with openssl s_client -showcerts
"no shared cipher" / SSL_CTX_set_cipher_list errors | Client and server have no cipher suite in common, often an old server rejecting modern clients or vice versa | Update server TLS config to include TLS 1.3 suites; check ssl_protocols and ssl_ciphers in nginx/Apache config
verify=False / --insecure used in code | Dev shortcut that disables certificate validation entirely, so all three TLS guarantees are gone | Fix the underlying cert issue; never disable verification in production

Checklist for debugging a TLS failure:

  1. Run openssl s_client -connect host:443 and read "Verify return code": 0 means ok, and any other code points to the root cause.
  2. Check notBefore / notAfter for expiry.
  3. Check Subject Alternative Names match the hostname you are calling.
  4. Run with -showcerts to verify the full chain is served (leaf + all intermediates).
  5. Check server TLS config accepts TLS 1.2/1.3 and a modern cipher suite.
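In Python, the same verify code surfaces on the exception, so a small lookup can turn it into a next step. A sketch, assuming the code meanings listed in the symptom table above; `explain_verify_code` and `explain_error` are hypothetical helpers, and the hint text is paraphrased from this lesson rather than taken from OpenSSL.

```python
import ssl

# Verify codes and fixes as described in the troubleshooting table.
VERIFY_CODE_HINTS = {
    0: "ok: the chain validated",
    10: "certificate has expired: renew it and automate renewal",
    20: "unable to get local issuer certificate: server is likely missing an intermediate",
    62: "hostname mismatch: the cert's SANs do not cover the name you called",
}

def explain_verify_code(code: int) -> str:
    """Map a verify return code to a hint, with a generic fallback."""
    return VERIFY_CODE_HINTS.get(
        code, f"verify code {code}: look it up in the OpenSSL verify documentation"
    )

def explain_error(exc: ssl.SSLCertVerificationError) -> str:
    """Combine the library's own message with the hint for its code."""
    return f"{exc.verify_message} -> {explain_verify_code(exc.verify_code)}"

for code in (0, 10, 20, 62, 99):
    print(code, explain_verify_code(code))
```

Catching `ssl.SSLCertVerificationError` around a failing call and logging `explain_error(exc)` gives the on-call engineer the checklist step to start from.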

By the numbers

TLS handshakes add latency to every new connection. The governing formula is simple:

handshake_cost_ms = RTT × handshake_RTTs
  TLS 1.2:                     handshake_RTTs = 2 → cost = 2 × RTT
  TLS 1.3:                     handshake_RTTs = 1 → cost = 1 × RTT
  TLS 1.3 resumption (0-RTT):  handshake_RTTs ≈ 0 → cost ≈ 0 (data piggybacked on first packet)

Scenario: a mobile API client at RTT = 80 ms to the server (typical cross-region). Each fresh connection pays this overhead before any application byte flows.

Protocol | Handshake RTTs | Cost at RTT = 80 ms | Notes
TLS 1.2 (new connection) | 2 RTTs | 160 ms | Additional TCP SYN adds another 80 ms → total cold-start overhead = 240 ms
TLS 1.3 (new connection) | 1 RTT | 80 ms | Full handshake in one round trip; saves 80 ms vs 1.2 per new connection
TLS 1.3 session resumption (0-RTT) | ≈ 0 RTT | ≈ 0 ms | Client reuses a pre-shared session ticket; first request data sends immediately; replay-attack caveat applies to non-idempotent requests
Keep-alive (existing connection) | 0 RTT | 0 ms | No handshake at all, since the connection is already established; the dominant case on HTTP/1.1 keep-alive and HTTP/2
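The per-connection figures can be reproduced with a few lines of arithmetic. A sketch under the same assumptions as the table: one RTT for the TCP handshake, plus one or two RTTs for TLS depending on the version; the function and constant names are illustrative.

```python
# Round trips spent in the TLS handshake on a brand-new connection.
TLS_HANDSHAKE_RTTS = {"1.2": 2, "1.3": 1}

def cold_start_ms(rtt_ms: int, tls_version: str, include_tcp: bool = True) -> int:
    """Latency before the first application byte on a new connection."""
    tcp_rtts = 1 if include_tcp else 0
    return rtt_ms * (tcp_rtts + TLS_HANDSHAKE_RTTS[tls_version])

for version in ("1.2", "1.3"):
    tls_only = cold_start_ms(80, version, include_tcp=False)
    with_tcp = cold_start_ms(80, version)
    print(f"TLS {version}: {tls_only} ms TLS only, {with_tcp} ms including TCP")
```

Applying the same TCP assumption to TLS 1.3 gives 160 ms of cold-start overhead, against 240 ms for TLS 1.2.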

Fleet-scale trace — 1 M new connections/day at RTT = 80 ms:

connections_per_day = 1_000_000
RTT = 80 ms = 0.080 s

# TLS 1.2: 2 RTTs per connection
total_handshake_seconds_12 = 1_000_000 × 2 × 0.080 = 160,000 s ≈ 1.85 days of client wait

# TLS 1.3: 1 RTT per connection
total_handshake_seconds_13 = 1_000_000 × 1 × 0.080 = 80,000 s ≈ 0.93 days of client wait

# Saving from 1.2 → 1.3:
saved_seconds_per_day = 160,000 - 80,000 = 80,000 s/day
saved_ms_per_conn = (2 - 1) × 80 ms = 80 ms per connection

# TLS 1.3 resumption: near-zero handshake
# Resumption rate of 80% of connections (typical returning users):
resumable = 0.80 × 1_000_000 = 800,000 connections at ~0 ms
cold = 0.20 × 1_000_000 = 200,000 connections at 80 ms
total_wait_with_resumption = 200,000 × 0.080 = 16,000 s/day   ← 10× better than TLS 1.2
Strategy | Connections/day | Total client handshake wait / day | Saving vs TLS 1.2
TLS 1.2 (new every time) | 1 M | 160,000 s (~1.85 days) | baseline
TLS 1.3 (new every time) | 1 M | 80,000 s (~0.93 days) | –80,000 s/day
TLS 1.3 + 80% resumption | 1 M | 16,000 s (~4.4 hours) | –144,000 s/day (90% reduction)
TLS 1.3 + keep-alive (HTTP/2) | New conns only (much fewer) | Minimal | Dominant production strategy
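The fleet-scale rows follow from one function. A minimal sketch using the scenario's numbers (1 M connections/day, RTT = 80 ms) and treating a resumed connection as costing zero handshake wait, as the table does; the function name is illustrative.

```python
def daily_handshake_wait_s(connections: int, rtt_ms: int, handshake_rtts: int,
                           resumption_rate: float = 0.0) -> float:
    """Total seconds clients spend in TLS handshakes per day."""
    resumed = round(connections * resumption_rate)  # assumed to cost ~0 ms
    cold = connections - resumed
    return cold * handshake_rtts * rtt_ms / 1000

CONNECTIONS = 1_000_000
RTT_MS = 80

tls12 = daily_handshake_wait_s(CONNECTIONS, RTT_MS, handshake_rtts=2)
tls13 = daily_handshake_wait_s(CONNECTIONS, RTT_MS, handshake_rtts=1)
tls13_resumed = daily_handshake_wait_s(CONNECTIONS, RTT_MS, handshake_rtts=1,
                                       resumption_rate=0.80)

print(f"TLS 1.2:               {tls12:>9,.0f} s/day")
print(f"TLS 1.3:               {tls13:>9,.0f} s/day")
print(f"TLS 1.3 + resumption:  {tls13_resumed:>9,.0f} s/day")
print(f"reduction vs TLS 1.2:  {1 - tls13_resumed / tls12:.0%}")
```

Swapping in your own connection count, RTT, and measured resumption rate shows quickly whether resumption or keep-alive is the larger lever for your traffic.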

Decision math: each additional new connection costs exactly 1 RTT (TLS 1.3) or 2 RTTs (TLS 1.2) of unavoidable latency. Whether session resumption is worth the investment comes down to how much user wait it removes per day, weighed against the cost of implementing it:

daily_saving_s = new_connections_per_day × RTT × resumption_rate

At 1M connections/day, RTT = 80 ms, resumption rate 80%:
  daily_saving_s = 1,000,000 × 0.080 × 0.80 = 64,000 s/day of user wait (80,000 s → 16,000 s on TLS 1.3)
  → 80 ms saved on every resumed connection; at a p50 user RTT of 50 ms, that is 50 ms per resumed connection

Decision: TLS 1.3 + session resumption + HTTP/2 keep-alive is the stack. Keep-alive eliminates the handshake entirely for subsequent requests on the same connection, so it dominates both resumption and protocol version for sustained traffic patterns.

Sources: RFC 8446 — TLS 1.3; Cloudflare — 0-RTT resumption; High Performance Browser Networking — TLS chapter (Grigorik).

🧠 Quick check

1. A developer adds requests.get(url, verify=False) to bypass a certificate error in their Python service. What security properties are now missing?

Without certificate validation, a man-in-the-middle can present a fake certificate. Once they control the TLS session, they can decrypt and modify all traffic — confidentiality, integrity, and authentication are all gone.

2. In TLS, the session key used to encrypt application data is:

The Diffie-Hellman key exchange lets both sides compute the same shared secret from their respective contributions. The secret itself is never transmitted, so intercepting the handshake does not give an eavesdropper the key.

3. Your API gateway terminates TLS and forwards requests to a microservice over HTTP on the internal network. The main security concern is:

TLS termination at the gateway is standard practice, but the internal plaintext hop is a risk in zero-trust or multi-tenant environments. Use mTLS internally or ensure the internal network is fully isolated and trusted.

✍️ Exercise: diagnose a TLS failure (try before opening)

A CI pipeline starts failing with SSL: CERTIFICATE_VERIFY_FAILED when calling your staging API. List three possible root causes and the first command you'd run to investigate each.

Model answer:

Root cause: Certificate expired
First command / check: openssl s_client -connect staging.example.com:443 </dev/null 2>/dev/null | openssl x509 -noout -dates

Root cause: Self-signed cert not in trust store
First command / check: openssl s_client -connect ... 2>&1 | grep "Verify return code" (look for code 18, self-signed, or 21, unable to verify)

Root cause: Hostname mismatch (cert issued for wrong domain)
First command / check: openssl x509 -noout -text ... | grep -A2 "Subject Alternative Name" (compare to the hostname being called)

Rubric: ✓ three distinct categories of TLS failure ✓ each investigation uses an observable signal (openssl, curl -v) ✓ no "disable verification" as a fix.

Key takeaways

Sources & further reading