Foundations · Lesson 06

How the Web works

"What happens when you type a URL and press Enter?" is the most-asked warm-up in tech interviews — because answering it well touches every foundation at once: addressing, DNS, the client-server model, connections, and HTTP. Let's trace the whole journey.

⏱ 11 min · Difficulty: core · Prereq: Lessons 03–05

By the end you'll be able to

Dissect a URL into its parts and say what each does.
Explain the client-server model and the role of DNS.
Narrate the full lifecycle of a web request end to end.

The client-server model

The Web runs on a simple deal: clients (your browser, a mobile app) request resources, and servers store and serve them. Clients start every conversation; servers wait and respond (the exact limitation WebSockets later work around, Lesson 09). A resource is anything addressable — an HTML page, an image, or a JSON response from your API.
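To make the "clients start, servers wait" deal concrete, here is a minimal sketch that runs both roles on your own machine. The handler name `HelloHandler`, the `demo` function, and the JSON body are illustrative assumptions, not part of any real API.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: does nothing until a client sends a request."""

    def do_GET(self):
        body = b'{"id": 42}'  # hypothetical resource
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def demo():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: it opens the connection and speaks first.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/v1/users/42")
        resp = conn.getresponse()
        status, body = resp.status, resp.read()
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the status and body the client received. Note that the server thread never sends anything on its own; it only answers the request the client initiated.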

Anatomy of a URL

A URL (Uniform Resource Locator) is the address of a resource. Every part has a job:

Scheme (how), host (which machine), port (which program), path (which resource), query (refinements). Recall host:port = the socket from Lesson 05.
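The parts listed above can be pulled out of a URL with Python's standard `urllib.parse`. This is a small sketch: the `dissect` helper, the `DEFAULT_PORTS` table, and the `fields=name` query string are assumptions added for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# When a URL omits the port, the scheme implies it.
DEFAULT_PORTS = {"http": 80, "https": 443}


def dissect(url):
    """Split a URL into scheme, host, port, path, and query."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,                      # how
        "host": parts.hostname,                      # which machine
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),  # which program
        "path": parts.path,                          # which resource
        "query": parse_qs(parts.query),              # refinements
    }


info = dissect("https://api.example.com/v1/users/42?fields=name")
```

Here `info["port"]` is 443 even though the URL never mentions it, because `https` implies that port. An explicit port such as `https://api.example.com:8443/` would override the default.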

DNS: turning a name into an address

Machines are reached by IP address (Lesson 03), but humans use names like api.example.com. DNS (Domain Name System) is the internet's phone book that translates one to the other. Before your browser can connect, it asks DNS "what's the IP for this host?" — and the answer is cached aggressively (in your OS, your browser, and resolvers along the way) because that lookup is on the critical path of every first request.
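A rough sketch of why caching helps: the class below wraps a lookup function with a time-limited cache, so only the first call pays for the lookup. `CachingResolver` is a hypothetical name, and the fixed 300-second TTL is an assumption; real resolvers use the TTL carried by each DNS record, which `socket.getaddrinfo` does not expose.

```python
import socket
import time


class CachingResolver:
    """Name -> IP lookups with a simple time-to-live cache (illustrative only)."""

    def __init__(self, lookup=None, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = lookup or self._system_lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # host -> (expires_at, ip)
        self.misses = 0

    @staticmethod
    def _system_lookup(host):
        # Ask the OS resolver and take the first address it returns.
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
        return infos[0][4][0]

    def resolve(self, host):
        now = self._clock()
        entry = self._cache.get(host)
        if entry and entry[0] > now:
            return entry[1]  # cache hit: no lookup on the critical path
        self.misses += 1
        ip = self._lookup(host)
        self._cache[host] = (now + self._ttl, ip)
        return ip
```

With the defaults, `CachingResolver().resolve("localhost")` goes to the OS resolver once and answers repeat calls from memory until the entry expires.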

The full lifecycle of a request

Now assemble everything. Typing https://api.example.com/v1/users/42 and pressing Enter kicks off this chain:

The famous "what happens when you type a URL" answer, in six steps. Each maps to a foundation you've already learned.

DNS lookup — resolve api.example.com to an IP (or hit the cache).
TCP handshake — open a connection to that IP on port 443 (Lesson 05).
TLS handshake — negotiate encryption so the data is private (full TLS lesson later).
HTTP request — send GET /v1/users/42 with headers (Lesson 07 next).
Server work — auth, look up data, build the response.
HTTP response + render — status, headers, body come back; the client uses them.
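The six steps above can be sketched with nothing but Python's `socket` and `ssl` modules. This is a teaching sketch under simplifying assumptions: it speaks HTTP/1.1 only, asks the server to close the connection so it can read to end-of-stream, and does not decode chunked bodies. The helper names `build_request` and `fetch` are made up for this example.

```python
import socket
import ssl


def build_request(host, path):
    """Step 4: the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: application/json",
        "Connection: close",  # ask the server to close so we can read to EOF
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host, path, port=443, timeout=5.0):
    """Walk steps 1-6 by hand and return (status_line, raw_body)."""
    # Step 1: DNS lookup (the OS may answer from its cache).
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    raw = socket.socket(family, socktype, proto)
    raw.settimeout(timeout)
    try:
        # Step 2: the TCP handshake happens inside connect().
        raw.connect(addr)
        # Step 3: TLS handshake; the default context verifies the certificate.
        context = ssl.create_default_context()
        with context.wrap_socket(raw, server_hostname=host) as tls:
            # Step 4: send the request through the encrypted channel.
            tls.sendall(build_request(host, path))
            # Step 5 happens on the server. Step 6: read what comes back.
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    finally:
        raw.close()
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("iso-8859-1")
    return status_line, body
```

A call such as `fetch("example.com", "/")` needs network access, so it is left for you to try. In practice you would use an HTTP client library, which performs the same steps and adds connection reuse.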

🎯 Interview angle

This question is a breadth test. The strongest answers don't just list steps — they show the connections: "DNS is cached because it's on the critical path", "steps 1–3 are one-time setup, so I'd reuse connections", "the network round trips dominate the latency (Lesson 04), not the server's 5 ms of work." Weaving the foundations together is what earns the nod.

⚠️ Common trap

Forgetting DNS and connection setup and jumping straight to "the server returns HTML." Those early steps are exactly where real latency and real failures live (DNS outages have taken down huge chunks of the internet). Skipping them signals you've never debugged a slow first request.

✅ Web vs API, same machinery

An API call and a web page load follow the identical lifecycle. The only differences are that step 6's body is JSON instead of HTML, and that an API client parses the body rather than rendering it. Everything you know about "how the Web works" transfers directly to "how my API call works." That's the whole point of building on the Web's standards.

Under the hood: how it actually works

Each step of the lifecycle is a real protocol exchange you can observe with tools. Here is the full trace for https://api.example.com/v1/users/42.

Step 1 — DNS resolution with `dig`

Before any connection, the OS resolves the hostname. It checks its own cache first, then asks the local resolver (usually your router), which in turn asks the ISP's recursive resolver. On a full cache miss, that recursive resolver walks the DNS hierarchy: root → TLD (.com) → authoritative nameserver for example.com. The result is cached at every layer for the record's TTL.

    $ dig api.example.com +short
    93.184.216.34

    $ dig api.example.com +noall +answer
    api.example.com.  300  IN  A  93.184.216.34
    ; 300 = TTL in seconds: the OS cache holds this result for 5 minutes
    ; IN A = Internet, Address record (IPv4). AAAA = IPv6.

    $ dig api.example.com +noall +answer +ttlid
    ; Re-run immediately: TTL counts down, which shows you're reading a cache hit

Step 2 — TCP handshake with `curl -v`

With the IP in hand, the OS opens a TCP connection. TCP requires a three-way handshake before any data flows: SYN → SYN-ACK → ACK. This costs one full network round trip.

    $ curl -v https://api.example.com/v1/users/42 2>&1 | head -30
    *   Trying 93.184.216.34:443...
    * Connected to api.example.com (93.184.216.34) port 443
    * ALPN: curl offers h2,http/1.1
    ; ^ TCP connected; now TLS starts

Step 3 — TLS handshake with `openssl s_client`

After TCP, the client and server negotiate TLS. For TLS 1.3 this takes one additional round trip (down from two in TLS 1.2): the client sends ClientHello (with supported ciphers and a key share); the server responds with ServerHello + its certificate + a key share; both sides derive the session keys and the client sends the first encrypted request immediately.

    $ openssl s_client -connect api.example.com:443 -brief
    Protocol version: TLSv1.3
    Ciphersuite: TLS_AES_256_GCM_SHA384
    Peer certificate: CN=api.example.com
    Hash used: SHA256
    Signature type: ECDSA
    Verification: OK
    ; Check CN and Verification: OK to confirm certificate is valid for this host
    ; "depth=2" lines above it show the chain: leaf → intermediate → root CA
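The same inspection can be done from code. The sketch below uses Python's `ssl` module to build a context that verifies the certificate and refuses anything older than TLS 1.3, then reports what was negotiated. The function names are hypothetical, and requiring TLS 1.3 is a choice made for this example rather than something every client should do.

```python
import socket
import ssl


def strict_tls13_context():
    """A client context that checks the chain and hostname and needs TLS 1.3."""
    context = ssl.create_default_context()  # certificate + hostname checks on
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


def describe_handshake(host, port=443, timeout=5.0):
    """Connect, complete the TLS handshake, and summarize the result."""
    context = strict_tls13_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # The handshake runs here; a bad certificate raises ssl.SSLError.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "not_after": cert.get("notAfter"),
            }
```

`describe_handshake("example.com")` needs network access. An expired or mismatched certificate surfaces as an `ssl.SSLError` raised from `wrap_socket`, which is the code-level counterpart of a failed `Verification` line.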

Step 4 — HTTP request/response with `curl -v`

Once the TLS tunnel is up, the actual HTTP message is sent inside it. curl -v prints the request and response headers; --http2 asks curl to negotiate HTTP/2, falling back to HTTP/1.1 if the server doesn't support it.

    $ curl -v --http2 https://api.example.com/v1/users/42 \
        -H "Authorization: Bearer token123" 2>&1
    > GET /v1/users/42 HTTP/2
    > Host: api.example.com
    > Authorization: Bearer token123
    >
    < HTTP/2 200
    < content-type: application/json
    < cache-control: max-age=60
    <
    {"id":42,"name":"Ada Lovelace"}

The complete sequence in one picture:

Approximate latency breakdown on a typical 20 ms RTT connection. DNS, TCP, and TLS setup are one-time costs (DNS per host, TCP and TLS per connection); connection reuse eliminates them for subsequent requests.
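As a back-of-the-envelope check on that breakdown, the arithmetic can be written out. The numbers here are assumptions for illustration, not measurements: a 20 ms RTT, one round trip each for an uncached DNS lookup, the TCP handshake, and the TLS 1.3 handshake, one more for the HTTP exchange, and the 5 ms of server work mentioned earlier.

```python
RTT_MS = 20.0          # assumed round-trip time
SERVER_WORK_MS = 5.0   # assumed time the server spends on the request

# Round trips spent before the first HTTP byte can be sent.
# The DNS figure assumes a single uncached query to a nearby resolver.
SETUP_ROUND_TRIPS = {"dns": 1, "tcp": 1, "tls13": 1}
REQUEST_ROUND_TRIPS = 1  # request out, response back


def first_request_ms(rtt=RTT_MS, server=SERVER_WORK_MS):
    """Cold start: pay for setup, then the request itself."""
    setup = sum(SETUP_ROUND_TRIPS.values()) * rtt
    return setup + REQUEST_ROUND_TRIPS * rtt + server


def reused_connection_ms(rtt=RTT_MS, server=SERVER_WORK_MS):
    """Warm connection: setup is already done."""
    return REQUEST_ROUND_TRIPS * rtt + server


cold = first_request_ms()       # 3 * 20 + 20 + 5 = 85.0
warm = reused_connection_ms()   # 20 + 5 = 25.0
```

Under these assumptions the cold request takes 85 ms and a request on a reused connection takes 25 ms, and in both cases network round trips outweigh the server's 5 ms.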

How to debug & inspect it

When a URL doesn't load, map the failure to the step that failed. The tools below let you isolate each layer independently.

    # 1. Does DNS resolve?
    $ dig api.example.com +short
    93.184.216.34                                   # success
    ; (empty output or NXDOMAIN) → hostname doesn't exist or DNS is broken

    # 2. Can you reach the server on TCP? (-i so "failed" matches in any case)
    $ curl -v --max-time 5 https://api.example.com 2>&1 | grep -iE "trying|connected|failed"
    *   Trying 93.184.216.34:443...
    * Connected to api.example.com (93.184.216.34) port 443                 # success
    * connect to 93.184.216.34 port 443 failed: Connection refused          # port closed

    # 3. Is TLS valid? (</dev/null so s_client exits instead of waiting for input)
    $ openssl s_client -connect api.example.com:443 -brief </dev/null 2>&1 | grep Verif
    Verification: OK                                # success
    Verification error: certificate has expired     # failure

    # 4. Does the HTTP request succeed?
    $ curl -o /dev/null -s -w "%{http_code} time=%{time_total}s\n" https://api.example.com/v1/users/42
    200 time=0.182s                                 # success
    503 time=0.012s                                 # server returned an error

Symptom	Which step failed	Likely cause	Fix / check
`dig` returns `NXDOMAIN` or empty	DNS (step 1)	Hostname wrong, DNS record deleted, or outage at nameserver	Check for typos; verify `NS` records; try `dig @8.8.8.8` to bypass local resolver
Connection refused / timed out	TCP (step 2)	Server isn't listening on that port, firewall rule blocking, server down	`curl -v` → "Connection refused" vs "timed out" (refused = fast reject; timeout = firewall drop)
"certificate has expired" or "SSL handshake failed"	TLS (step 3)	Cert expired, wrong CN/SAN, self-signed without trust, clock skew	`openssl s_client -connect host:443` → read the cert dates and CN; check server clock
`4xx` HTTP status	HTTP app layer (step 4)	404 = path wrong; 401 = missing/bad auth; 400 = malformed request; 429 = rate-limited	Add `-v` to curl; read response body for error message
`5xx` HTTP status	Server work (step 5)	Backend crash, database down, unhandled exception	Check server logs; not a client issue — escalate to the API owner
First request slow, rest fast	DNS/TCP/TLS overhead (steps 1–3)	Cold start; one-time setup cost	Expected — connection reuse fixes it; investigate if every request is slow (keep-alive disabled?)

Debug checklist:

dig hostname +short — does it resolve? If empty/NXDOMAIN, the problem is DNS.
curl -v --max-time 5 https://hostname — does TCP connect? Watch for "Trying … Connected" vs "refused/timeout".
openssl s_client -connect hostname:443 -brief — is the TLS certificate valid? Check Verification and expiry date.
curl -o /dev/null -s -w "%{http_code}\n" URL — what HTTP status does the server return?
If it works in curl but not the browser, check CORS headers (Lesson sec-04). If curl also fails, the problem is in steps 1–5 above.
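The same symptom-to-step mapping can be applied in code when a request fails. The sketch below classifies exceptions raised by Python's standard library; `classify_failure` and its return strings are made up for this example, and the mapping is deliberately coarse (a timeout during the TLS handshake, for instance, would be reported as TCP).

```python
import socket
import ssl
import urllib.error


def classify_failure(exc):
    """Map an exception to the lifecycle step that most likely failed."""
    # HTTPError is a URLError subclass, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        if 400 <= exc.code < 500:
            return "step 4: HTTP request (4xx)"
        if exc.code >= 500:
            return "step 5: server work (5xx)"
        return "unknown"
    # urllib wraps lower-level errors in URLError.reason; unwrap them.
    if isinstance(exc, urllib.error.URLError) and isinstance(
        exc.reason, BaseException
    ):
        exc = exc.reason
    if isinstance(exc, socket.gaierror):
        return "step 1: DNS"
    if isinstance(exc, ssl.SSLError):
        return "step 3: TLS"
    if isinstance(exc, (ConnectionRefusedError, TimeoutError, socket.timeout)):
        return "step 2: TCP"
    return "unknown"
```

A typical use is to wrap `urllib.request.urlopen(url, timeout=5)` in `try`/`except Exception as exc` and log `classify_failure(exc)` next to the error, so the first thing you see is which layer to inspect.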

🧠 Quick check

1. In the client-server model, who initiates the conversation?

Clients initiate; servers wait and respond. That one-directional default is exactly what WebSockets (Lesson 09) loosen for real-time use.

2. DNS exists to:

DNS is the internet's phone book: name → IP. It's heavily cached because it sits on the critical path of the first request.

3. Why is the first request to a fresh host slower than later ones?

Steps 1–3 (DNS + TCP + TLS) are one-time per connection. A kept-alive connection jumps straight to the HTTP request, so subsequent calls are much faster.

✍️ Drill: the classic "type a URL" question

Give a 60-second answer to "what happens when you type https://example.com and hit Enter?" that an interviewer would rate senior. Try before opening.

Model answer: "First the browser resolves example.com to an IP via DNS, usually from cache since DNS is on the critical path. It opens a TCP connection to that IP on port 443, then runs the TLS handshake to encrypt the channel. Now it sends an HTTP GET for /. The server authenticates if needed, gathers data, and returns a status, headers, and body. The browser renders it — and crucially, steps 1–3 are one-time setup, so the network round trips dominate latency and I'd reuse the connection for follow-up requests." Bonus: mention caching/CDN to cut the propagation delay (Lesson 04).

Rubric: ✓ DNS → TCP → TLS → HTTP → server → render, in order ✓ notes setup is one-time / connection reuse ✓ ties in latency or caching. Hitting the connections, not just the list, is the senior signal.

Key takeaways

The Web is client-server: clients request, servers respond.
A URL = scheme · host · port · path · query — host:port is the socket.
DNS maps names → IPs and is heavily cached because it's on the critical path.
A request's lifecycle: DNS → TCP → TLS → HTTP → server work → response/render; the first three are one-time setup.
An API call uses the same machinery — just JSON instead of HTML.