API Design

Design Case Studies · Lesson 07

Design: Video Streaming API

Uploading a 4K film and streaming it to a billion simultaneous viewers are two completely different problems — yet a video platform must solve both in the same system. This case study traces the full pipeline from the creator's browser through transcoding workers to the viewer's adaptive player, exposing the API design decisions that make each stage work at scale.

⏱ ~18 min · Difficulty: advanced · Prereq: Caching (rel-07), Pub/Sub (rel-10), File Upload (cs-02)

By the end you'll be able to

Requirements

Before drawing any boxes, pin down what the system must do. Video platforms are deceptively complex because they sit at the intersection of three very different workloads: large-file ingest, CPU-intensive batch processing, and globally distributed read-heavy delivery. Getting requirements wrong means you optimize for the wrong bottleneck.

Functional requirements

Non-functional requirements

Design decisions

Every major decision in this system comes with a "why" — and for each one, there is a common alternative that fails at scale. Interview panels reward candidates who explain what breaks before explaining what they chose.

Decision 1: Presigned URLs for upload (not API-proxied upload)

Naively, a creator POSTs their video to your API server, which writes it to S3. This works for 10 MB profile photos. For a 20 GB film, it means every byte crosses your API fleet twice on ingest (creator → API server, then API server → object storage), and again at playback if delivery is proxied the same way. Your API servers become the bottleneck, you pay for the same bytes on two network hops, and a single slow upload ties up a connection slot for its full duration.

The fix: the API server issues a presigned URL — a time-limited, signed S3/GCS endpoint — and returns it to the client. The client uploads directly to object storage, bypassing the API server entirely. The API server sees only the tiny metadata request and the final "upload complete" callback. See the File Upload case study (cs-02) for the full presigned URL pattern including resumable chunked uploads.
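To make "time-limited, signed" concrete, here is a minimal sketch that signs an upload URL with an HMAC over the method, object path, and expiry. It is a simplified illustration and not the real S3 or GCS signing scheme (those services define their own algorithms); `SECRET`, `sign_upload_url`, `verify_upload_url`, and the `expires`/`sig` query parameters are hypothetical names.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-signing-key"  # hypothetical; a real key never lives in source code


def _signature(method: str, path: str, expires: int) -> str:
    """HMAC-SHA256 over the method, object path, and expiry timestamp."""
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_upload_url(host: str, key: str, now: int, ttl_s: int = 3600) -> str:
    """Issued by the API server: a PUT URL valid for ttl_s seconds."""
    path = f"/{key}"
    expires = now + ttl_s
    query = urlencode({"expires": expires, "sig": _signature("PUT", path, expires)})
    return f"https://{host}{path}?{query}"


def verify_upload_url(method: str, url: str, now: int) -> bool:
    """Checked by the storage side: signature matches and URL has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(method, parsed.path, expires)
    return now <= expires and hmac.compare_digest(sig, expected)


url = sign_upload_url("uploads.example.com", "vid_2wXk9mPqRn7v", now=1_000_000)
```

The design point is that the storage side can validate the URL without calling the API server: the signature binds the method, the object key, and the expiry, so changing any of them invalidates the URL.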

Decision 2: Async transcode pipeline (not synchronous in-request)

Transcoding a 4K video takes minutes of wall-clock time and gigabytes of scratch disk. Doing it synchronously — holding the HTTP connection open until the job finishes — would mean upload requests time out, clients implement complex retry logic, and a transcode backlog crashes your API. Instead, completing the upload triggers an event on a queue. Transcode workers pull jobs, process in parallel, and emit completion events when done. The API responds immediately with a 202 Accepted and a job handle. Clients poll or receive a webhook. See the Event-driven & Pub/Sub lesson (rel-10) for the queue mechanics.

Decision 3: CDN delivery for segments (not direct origin serving)

Once transcoded, video segments are immutable bytes that never change. CDN edge nodes are designed precisely for serving immutable, cacheable content to geographically distributed audiences. Serving every segment request from your origin would require a network and cost investment that rivals the CDN itself. The CDN handles the read surge; the origin only exists to fill cache misses. See the Caching lesson (rel-07) for CDN cache semantics including Cache-Control: public, max-age=31536000, immutable for segment files.
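One way to apply these cache semantics is to choose the header from the object's file extension when the transcoder writes it to storage. The sketch below assumes the 30-second manifest TTL used elsewhere in this lesson; the function name `cache_control_for` and the `no-store` fallback are illustrative assumptions.

```python
def cache_control_for(path: str) -> str:
    """Pick a Cache-Control header from the object's file extension."""
    if path.endswith((".m3u8", ".mpd")):
        # Manifests: short TTL so edges pick up changes quickly (30 s in this lesson).
        return "public, max-age=30"
    if path.endswith((".ts", ".mp4", ".m4s")):
        # Segments are immutable once written, so cache them for a year.
        return "public, max-age=31536000, immutable"
    # Anything else: do not cache by default (an assumption, not from the lesson).
    return "no-store"


print(cache_control_for("vid_2wXk9mPqRn7v/1080p/seg_0001.ts"))
```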

Decision 4: HLS/DASH manifests for adaptive streaming

A plain MP4 served as a single file is locked to one bitrate and, unless it is prepared for progressive download with byte-range requests, must be largely buffered before seeking works reliably. Adaptive streaming solves three problems at once: bandwidth adaptation (drop from 1080p to 360p mid-stream when the network degrades), fast start (buffer only the first 2–4 segments before playing), and seeking (jump directly to the segment containing the target timestamp). HLS uses a .m3u8 playlist; MPEG-DASH uses an XML .mpd manifest. Both index the same underlying segments.

Decision 5: Webhook + polling for transcode status

Clients need to know when a video is ready. Two mechanisms work together: polling for creator dashboards that can tolerate a GET request every 5 seconds, and webhooks for server-to-server integrations that want push notification. Never make the client wait on a long-poll — transcode jobs take minutes and connection timeouts make long-polling unreliable for this duration.
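A polling client can be sketched as a loop with a fixed interval and an overall timeout. This is a minimal sketch: `fetch_status` stands in for a GET to the transcode-status endpoint, the terminal status names follow the examples in this lesson, and the injectable `sleep` parameter exists only so the loop can be exercised without waiting.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in ("complete", "failed"):
            return status
        if waited + interval_s > timeout_s:
            raise TimeoutError(f"job still {status!r} after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s


# Usage with a fake status source standing in for the HTTP call.
responses = iter(["transcoding", "transcoding", "complete"])
result = wait_until_ready(lambda: next(responses), sleep=lambda _s: None)
```

Each poll is a short, independent request, so no connection has to stay open for the minutes a transcode takes.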

The API model

Six endpoints cover the full lifecycle. Note that the HLS manifest and segments live on the CDN domain — they are not routes on your API server.

POST /v1/videos — initiate upload

The creator sends metadata; the API returns a video ID and a presigned upload URL. No video bytes travel through the API server.

// Request
POST /v1/videos
Authorization: Bearer {token}
Content-Type: application/json

{
  "title":       "Climbing the Eiger: North Face",
  "description": "Solo ascent, summer 2025.",
  "content_type": "video/mp4",
  "file_size":    8472983040  // bytes; used to configure multipart upload
}

// Response 201 Created
{
  "video_id":     "vid_2wXk9mPqRn7v",
  "status":       "awaiting_upload",
  "upload_url":   "https://uploads.example-cdn.com/vid_2wXk9mPqRn7v?X-Amz-Signature=...",
  "upload_method": "PUT",
  "upload_expires_at": "2026-06-20T18:00:00Z"  // presigned URL TTL: 1 hour
}

PUT {upload_url} — chunked upload direct to object storage

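The client PUTs the file bytes to the presigned URL. For large files, S3 multipart upload allows up to 10,000 parts of 5 MB to 5 GB each (only the last part may be smaller), and each part is a separate PUT identified by partNumber and uploadId. None of these bytes pass through the API server.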

// Multipart part (repeated for each 50 MB chunk)
PUT https://uploads.example-cdn.com/vid_2wXk9mPqRn7v?partNumber=3&uploadId=xKFdH...
Content-Length: 52428800
Content-Type: video/mp4

[binary chunk body]

// Response 200 from object storage
ETag: "d8e8fca2dc0f896fd7cb4cb0031ba249"  // save for CompleteMultipartUpload
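The part limits above determine how a client slices the file. The following sketch plans the parts for a given file size, assuming a 50 MB default part size as in the example request; `plan_parts` is a hypothetical helper, and it does not check the 5 GB per-part maximum.

```python
import math

MIN_PART = 5 * 1024**2   # S3 minimum part size (every part except the last)
MAX_PARTS = 10_000       # S3 maximum number of parts per upload


def plan_parts(file_size: int, part_size: int = 50 * 1024**2) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) for each part; part numbers start at 1."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    # Grow the part size if the file would otherwise need more than MAX_PARTS parts.
    part_size = max(part_size, MIN_PART, math.ceil(file_size / MAX_PARTS))
    parts = []
    offset = 0
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((len(parts) + 1, offset, length))
        offset += length
    return parts


parts = plan_parts(8_472_983_040)  # file_size from the POST /v1/videos example
print(len(parts), parts[-1])       # 162 (162, 8441036800, 31946240)
```

For the 8,472,983,040-byte file in the example this yields 161 full 50 MB parts plus one shorter final part.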

POST /v1/videos/:id/publish — trigger transcode

After all parts are uploaded and CompleteMultipartUpload succeeds, the client calls publish. This moves the video out of draft and enqueues the transcode job. Returns immediately with 202.

// Request
POST /v1/videos/vid_2wXk9mPqRn7v/publish
Authorization: Bearer {token}

// Response 202 Accepted
{
  "video_id":       "vid_2wXk9mPqRn7v",
  "status":         "transcoding",
  "transcode_job_id": "tjob_8RpQn3LvMz1w",
  "estimated_completion_at": "2026-06-20T17:25:00Z"
}

GET /v1/videos/:id — fetch video metadata

Returns the canonical video record: manifest URL, metadata, current transcode status, and available renditions once processing completes.

// Request
GET /v1/videos/vid_2wXk9mPqRn7v
Authorization: Bearer {token}

// Response 200 OK (video ready)
{
  "video_id":      "vid_2wXk9mPqRn7v",
  "title":         "Climbing the Eiger: North Face",
  "status":        "ready",
  "duration_s":    4320,
  "view_count":    142873,
  "manifest_url":  "https://cdn.example.com/vid_2wXk9mPqRn7v/manifest.m3u8",
  "thumbnail_url": "https://cdn.example.com/vid_2wXk9mPqRn7v/thumb.jpg",
  "renditions": [
    { "quality": "360p",  "bitrate_kbps": 400  },
    { "quality": "720p",  "bitrate_kbps": 2500 },
    { "quality": "1080p", "bitrate_kbps": 5000 },
    { "quality": "4K",    "bitrate_kbps": 18000 }
  ]
}

GET /v1/videos/:id/transcode-status — poll async job

Fine-grained progress for creator dashboards. It is kept separate from GET /v1/videos/:id so that frequent progress updates do not cache-bust the main resource, which can be cached aggressively once the video is ready.

// Response 200 OK (in-progress)
{
  "job_id":       "tjob_8RpQn3LvMz1w",
  "status":       "transcoding",
  "progress_pct": 62,
  "current_pass": "1080p",
  "passes_done":  2,
  "passes_total": 4,
  "eta_seconds":  183
}

// Response 200 OK (complete)
{
  "job_id":       "tjob_8RpQn3LvMz1w",
  "status":       "complete",
  "progress_pct": 100,
  "completed_at": "2026-06-20T17:22:47Z"
}

GET /{id}/manifest.m3u8 — HLS manifest (CDN, not API server)

This is not an API route. It is a file in the origin bucket, served through the CDN. The URL is returned in the video metadata; the player fetches it independently.

#EXTM3U
#EXT-X-VERSION:3
# HLS master playlist, served from cdn.example.com, not api.example.com
# (#EXTM3U must be the first line of the file, so this comment follows it)

# Each variant stream points to a sub-playlist of 4-second segments
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=640x360,CODECS="avc1.42c01e,mp4a.40.2"
360p/playlist.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2"
720p/playlist.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080,CODECS="avc1.640028,mp4a.40.2"
1080p/playlist.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=18000000,RESOLUTION=3840x2160,CODECS="avc1.640033,mp4a.40.2"
4k/playlist.m3u8

Interview angle

When asked "design YouTube" in a system design interview, examiners want to hear three pipeline stages with specific justifications: (1) upload via presigned URL so your API fleet never touches video bytes; (2) async transcode via a job queue so the upload response is immediate and transcode workers scale independently; (3) CDN delivery with a target hit ratio above 95% so origin load is a rounding error. Bonus points: mention that manifests and segments use separate Cache-Control TTLs, with manifests short-lived (30 s) and segment files immutable (max-age=31536000). Tying CDN hit ratio back to a latency budget (first byte <50 ms from edge vs. ~150 ms from origin) demonstrates systems thinking beyond "add a CDN."

Diagrams

[Diagram: upload pipeline. Client (browser / mobile app) → API server (POST /v1/videos returns a presigned URL); client → object storage, S3 / GCS (PUT, presigned, direct); object storage → transcode queue, SQS / Pub/Sub (object.created event) → transcoder workers (FFmpeg, GPU fleet), which write segments back to object storage; CDN (Cloudflare / Fastly edge nodes) pulls from object storage on a cache miss.]
Upload pipeline: the API server issues a presigned URL and immediately steps aside. Object storage fires an event on upload completion; transcode workers consume jobs independently of the HTTP path. CDN cache-fills from object storage only on misses.

[Diagram: playback pipeline. Player (ABR algorithm measures throughput; e.g. throughput > 5 Mbps → request 1080p) → CDN edge, nearest PoP (manifests: TTL 30 s; segments: immutable; hit → <20 ms) → origin object storage, the source of truth for all segments and manifests (miss → ~150 ms). ~98% hit rate: origin sees only 2% of requests.]
Playback pipeline: the player's ABR algorithm continuously measures download throughput and picks the highest quality tier that fits. CDN edge nodes serve 98%+ of segment requests without touching the origin. A cache miss to origin adds ~130 ms but is invisible to viewers due to buffer pre-fetching.

Pitfall: serving video from your API servers

Two mistakes appear so often in system design interviews that they are worth calling out explicitly.

Proxying video bytes through your API server is the first. Your API fleet has finite network bandwidth — typically a few Gbps per instance. A single 4K stream at 18 Mbps times 10,000 concurrent viewers = 180 Gbps. No API fleet handles that without becoming the single most expensive line item on your AWS bill. Object storage + CDN costs a fraction of the equivalent API fleet bandwidth.

Synchronous transcoding inside the upload handler is the second. Transcoding a 1-hour 4K video can take 20–40 minutes of CPU time. Holding the HTTP connection open for that duration means: every client needs to implement timeout/retry logic; a transcode backlog ties up API worker threads; a single slow job degrades every concurrent upload. Async queues exist precisely for workloads with unbounded wall-clock duration.

Tip: separate domains for API, upload, and CDN

Use three distinct hostnames for the three traffic types: api.example.com for your REST API, uploads.example.com (or a direct S3/GCS URL) for object storage ingest, and cdn.example.com for media delivery. This separation buys you several things: you can apply different rate limits and autoscaling policies per domain; CDN caching rules apply cleanly to cdn.example.com without accidentally caching API responses; and your TLS certificates, WAF rules, and CORS headers stay untangled. Never route media bytes through api.example.com — the second you do, your CDN configuration becomes impossible to reason about.

Evaluation & latency budget

CDN offload math

Video segments are the definition of cacheable: they are immutable (the bytes for the 00:02:00–00:02:04 window of a given rendition never change), they are large (a 4-second 1080p segment at 5 Mbps is ~2.5 MB), and they are popular at the head of the view distribution. Once a segment is in a CDN edge node's cache, every subsequent viewer in that region gets it without touching the origin.

Suppose a video has 1,000,000 views. Each view fetches an average of 30 segments (2 minutes of a 72-minute film). That is 30,000,000 segment requests. With a 98% CDN hit ratio, 29,400,000 of those requests are served from edge caches and only 600,000 reach the origin.

The first viewer per region per segment pays the origin-fetch cost (~150 ms). All subsequent viewers pay edge cost (~15 ms). Popular videos reach effective hit ratios above 99.5%: with views spread across the globe, a video with 100M views fills every edge cache quickly, after which nearly every request is a hit.
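The offload arithmetic can be checked with a few lines of code. This sketch assumes, for simplicity, that every request is for a 2.5 MB 1080p segment; `cdn_offload` is a hypothetical helper.

```python
def cdn_offload(views: int, segments_per_view: int, hit_ratio: float, segment_mb: float):
    """Split segment requests into edge hits and origin fetches."""
    total = views * segments_per_view
    origin = round(total * (1 - hit_ratio))
    edge = total - origin
    origin_egress_gb = origin * segment_mb / 1000  # decimal units: 1 GB = 1000 MB
    return total, edge, origin, origin_egress_gb


total, edge, origin, origin_gb = cdn_offload(
    views=1_000_000, segments_per_view=30, hit_ratio=0.98, segment_mb=2.5
)
print(total, edge, origin, origin_gb)  # 30000000 29400000 600000 1500.0
```

Under these assumptions the origin serves about 1.5 TB instead of the 75 TB that all 30 million requests would represent.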

Async transcode: why synchronous would fail

Consider a 2-hour 4K film. A single-pass transcode running at roughly 6× real time takes ~20 minutes per quality tier; with 4 tiers that is ~80 minutes of sequential work, or ~20 minutes in parallel. The idle (keep-alive) timeout on most load balancers is 60–300 seconds, so the upload handler would time out before the first rendition finishes. Even with long-polling or WebSockets, holding a server connection for 20 minutes consumes one worker thread that cannot serve other requests. At 10,000 concurrent uploads, that is 10,000 idle-but-blocked threads, the exact scenario asynchronous job queues were designed to avoid.

The async pattern trades a single complex HTTP response for a simple 202 + a job-status poll. The complexity moves out of the HTTP layer and into the queue worker, where it belongs.

Latency breakdown table

Request type | Served from | Typical latency (P50) | Typical latency (P99)
HLS master manifest (first load) | CDN edge (short TTL, 30 s) | 18 ms | 55 ms
Video segment (CDN hit) | CDN edge (immutable, long TTL) | 12 ms | 35 ms
Video segment (CDN miss, origin fill) | Object storage via CDN | 155 ms | 420 ms
GET /v1/videos/:id (metadata) | API server + Redis cache | 28 ms | 90 ms
POST /v1/videos/:id/publish | API server (writes DB + enqueues) | 45 ms | 140 ms

Back-of-envelope: view scale

YouTube serves roughly 1 billion hours of video per day. That is approximately 3.6 × 10^12 seconds of playback per day, or ~41.7 million concurrent streams on average (10^9 hours ÷ 24 hours). At an average bitrate of 2 Mbps (mix of mobile 360p and desktop 1080p), that is ~83 Tbps of sustained CDN egress. No single origin data center can deliver that: CDN edge distribution is not an optimization, it is the architecture.

For a mid-scale platform targeting 10 million monthly active viewers with 30 minutes average watch time per day:
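The same back-of-envelope method can be sketched in code. The calculation below assumes, as an upper bound, that all 10 million viewers watch 30 minutes every single day (the lesson specifies monthly active viewers, so real daily load is likely lower) and reuses the 2 Mbps average bitrate; `average_egress` is a hypothetical helper, and peak traffic would sit above this daily average.

```python
def average_egress(watch_hours_per_day: float, avg_bitrate_mbps: float) -> tuple[float, float]:
    """Return (average concurrent streams, average egress in Gbps)."""
    concurrent = watch_hours_per_day / 24          # hours watched per hour of the day
    egress_gbps = concurrent * avg_bitrate_mbps / 1000
    return concurrent, egress_gbps


# Assumption: every one of the 10 million viewers watches 0.5 hours per day.
viewers = 10_000_000
concurrent, gbps = average_egress(viewers * 0.5, avg_bitrate_mbps=2.0)
print(f"{concurrent:,.0f} concurrent streams, {gbps:,.0f} Gbps average egress")
# prints: 208,333 concurrent streams, 417 Gbps average egress
```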

Under the hood: the transcode pipeline

The words "upload → transcode → CDN" hide a multi-stage pipeline with specific data structures at every handoff. Here is how each stage actually works.

Stage 1: chunked upload to object storage

The client uses S3 multipart upload. It calls CreateMultipartUpload to receive an uploadId, then PUTs each 50–500 MB chunk as a numbered part. Object storage returns an ETag (typically the MD5 of the part bytes) for each part. When all parts are uploaded, the client calls CompleteMultipartUpload with the ordered list of (partNumber, ETag) pairs. Object storage atomically assembles the parts into a single object. If the connection drops mid-upload, the client resumes after the last successfully acknowledged part; only the interrupted part's bytes need to be retransmitted.

# Multipart state machine
CreateMultipartUpload  →  uploadId = "xKFdH..."
UploadPart part 1      →  ETag: "abc123"
UploadPart part 2      →  ETag: "def456"
UploadPart part N      →  ETag: "xyz789"
CompleteMultipartUpload(uploadId, [(1,"abc123"), (2,"def456"), ...(N,"xyz789")])
  → raw video object assembled atomically in object storage
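On the client side, resuming comes down to remembering which parts have been acknowledged. A minimal sketch of that bookkeeping follows; `MultipartTracker` is a hypothetical class and makes no network calls.

```python
class MultipartTracker:
    """Tracks which parts have been acknowledged so an upload can resume."""

    def __init__(self, upload_id: str, total_parts: int) -> None:
        self.upload_id = upload_id
        self.total_parts = total_parts
        self.etags: dict[int, str] = {}  # part_number -> ETag returned by storage

    def record(self, part_number: int, etag: str) -> None:
        if not 1 <= part_number <= self.total_parts:
            raise ValueError(f"part {part_number} out of range")
        self.etags[part_number] = etag

    def missing(self) -> list[int]:
        """Parts still to upload (or re-upload after a dropped connection)."""
        return [n for n in range(1, self.total_parts + 1) if n not in self.etags]

    def completion_list(self) -> list[tuple[int, str]]:
        """Ordered (partNumber, ETag) pairs for CompleteMultipartUpload."""
        if self.missing():
            raise RuntimeError(f"cannot complete; missing parts {self.missing()}")
        return sorted(self.etags.items())


tracker = MultipartTracker("xKFdH...", total_parts=3)
tracker.record(1, "abc123")
tracker.record(3, "xyz789")   # part 2 failed mid-transfer
print(tracker.missing())      # [2]
```

Persisting this state (for example in local storage) is what lets a client pick up an interrupted upload after a restart.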

Stage 2: job queue and transcode workers

Object storage fires an s3:ObjectCreated event (or equivalent) when CompleteMultipartUpload succeeds. This event is published to a queue (SQS, Cloud Pub/Sub). A transcode job record is created in a database table with this schema:

-- Transcode job record (simplified, PostgreSQL syntax); the table name is illustrative
CREATE TABLE transcode_jobs (
  job_id        TEXT PRIMARY KEY,    -- "tjob_8RpQn3LvMz1w"
  video_id      TEXT NOT NULL,
  status        TEXT NOT NULL,       -- queued | processing | complete | failed
  renditions    JSONB,               -- per-rendition progress: [{quality:"1080p", status:"done", pct:100}, ...]
  attempt       INT DEFAULT 0,
  max_attempts  INT DEFAULT 3,
  worker_id     TEXT,                -- which machine owns this job
  locked_until  TIMESTAMPTZ,         -- distributed lock; worker must renew or job is re-queued
  created_at    TIMESTAMPTZ,
  started_at    TIMESTAMPTZ,
  completed_at  TIMESTAMPTZ
);

A transcode worker pulls the job, sets worker_id and locked_until = now() + 10min, and begins FFmpeg transcoding. If the worker dies without renewing the lock, a supervisor re-queues the job for another worker. This prevents lost jobs without a single coordinator.
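The claim, renew, and re-queue steps can be sketched in memory as follows. This is an illustration of the lease logic only: in a real deployment each function would be a single conditional UPDATE against the job table so that two workers cannot both claim the same row. The names `Job`, `claim`, `renew`, and `reap` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

LEASE_S = 600  # 10-minute lock, matching locked_until = now() + 10min


@dataclass
class Job:
    job_id: str
    status: str = "queued"
    attempt: int = 0
    max_attempts: int = 3
    worker_id: Optional[str] = None
    locked_until: float = 0.0


def claim(job: Job, worker_id: str, now: float) -> bool:
    """Take the job if it is queued; record the owner and lease expiry."""
    if job.status != "queued" or job.attempt >= job.max_attempts:
        return False
    job.status, job.worker_id = "processing", worker_id
    job.attempt += 1
    job.locked_until = now + LEASE_S
    return True


def renew(job: Job, worker_id: str, now: float) -> bool:
    """Extend the lease; fails if another worker owns it or it already expired."""
    if job.worker_id != worker_id or now > job.locked_until:
        return False
    job.locked_until = now + LEASE_S
    return True


def reap(job: Job, now: float) -> None:
    """Supervisor pass: re-queue expired leases, or fail after max_attempts."""
    if job.status == "processing" and now > job.locked_until:
        job.worker_id = None
        job.status = "queued" if job.attempt < job.max_attempts else "failed"
```

A worker that loses its lease gets `False` back from `renew` and should stop working on the job, since another worker may already own it.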

Stage 3: producing the rendition ladder

Each worker runs one or more FFmpeg passes. A typical rendition ladder for a 4K source:

Rendition | Resolution | Video codec | Target bitrate | Segment duration
360p | 640×360 | H.264 baseline | 400 kbps | 4 s
720p | 1280×720 | H.264 main | 2,500 kbps | 4 s
1080p | 1920×1080 | H.264 high | 5,000 kbps | 4 s
4K | 3840×2160 | H.265 / VP9 | 18,000 kbps | 4 s

After FFmpeg produces each rendition as a continuous stream, a segmenter (FFmpeg with -f hls or a dedicated tool like Bento4) cuts it into fixed-duration .ts or .mp4 fragments and writes a per-rendition playlist.m3u8. A segment naming convention like 1080p/seg_0000.ts, 1080p/seg_0001.ts, ... makes each segment addressable independently, enabling the CDN to cache individual 2.5 MB chunks rather than multi-gigabyte files.
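The naming convention makes the segment count and keys easy to derive from the video duration. A small sketch, assuming 4-second segments and the four-digit zero padding shown above (`segment_keys` is a hypothetical helper):

```python
import math


def segment_keys(rendition: str, duration_s: float, segment_s: float = 4.0) -> list[str]:
    """Object keys for every segment of one rendition, zero-padded to four digits."""
    count = math.ceil(duration_s / segment_s)
    return [f"{rendition}/seg_{i:04d}.ts" for i in range(count)]


keys = segment_keys("1080p", duration_s=600)  # a 10-minute video
print(len(keys), keys[0], keys[-1])           # 150 1080p/seg_0000.ts 1080p/seg_0149.ts
```

Four digits cover up to 10,000 segments, about 11 hours at 4 seconds each; longer content would need wider padding.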

Stage 4: writing the HLS master manifest

Once all rendition playlists are written to object storage, the worker assembles the master manifest (manifest.m3u8) that links them together. Each #EXT-X-STREAM-INF line carries the BANDWIDTH (bits/second) and RESOLUTION that the ABR algorithm uses to choose a rendition. The manifest is written last — writing it atomically "publishes" the video: before it exists, any player request 404s; after it exists, the full ladder is reachable.
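Assembling the master manifest is plain string building. The sketch below emits only BANDWIDTH and RESOLUTION and omits CODECS for brevity; the `master_manifest` function and the dictionary keys are illustrative assumptions.

```python
def master_manifest(renditions: list[dict]) -> str:
    """Build an HLS master playlist; #EXTM3U must be the first line."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for r in sorted(renditions, key=lambda r: r["bandwidth"]):
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={r["bandwidth"]},'
            f'RESOLUTION={r["width"]}x{r["height"]}'
        )
        lines.append(f'{r["dir"]}/playlist.m3u8')
    return "\n".join(lines) + "\n"


ladder = [
    {"dir": "720p", "bandwidth": 2_500_000, "width": 1280, "height": 720},
    {"dir": "360p", "bandwidth": 400_000, "width": 640, "height": 360},
]
text = master_manifest(ladder)
print(text)
```

Building the whole text in memory and writing it with a single PUT is what makes the publish step atomic from the player's point of view.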

Stage 5: how adaptive bitrate picks a rendition at playback

The player implements an ABR algorithm. A simple throughput-based algorithm works as follows:

  1. Player downloads segment N of the current rendition and measures actual download throughput (bytes received / time taken).
  2. It applies a safety factor (e.g. 0.8×) to avoid oscillation: safe_throughput = measured × 0.8.
  3. It scans the master manifest's BANDWIDTH values from highest to lowest and picks the first one at or below safe_throughput.
  4. For segment N+1 it requests the chosen rendition's next segment.
  5. If the buffer falls below a threshold (e.g. 4 s), it immediately steps down one quality tier regardless of throughput.

The player never interrupts playback during a quality switch — it finishes playing the buffered segments of the old rendition while fetching the new rendition's next segment. Because all renditions share the same GOP (group of pictures) alignment and segment duration, the switch point is seamless.
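The throughput-based steps above can be sketched in a few lines. The bandwidth values come from the example master manifest; the constant and function names are hypothetical, and production players use considerably more elaborate heuristics.

```python
BANDWIDTHS = [400_000, 2_500_000, 5_000_000, 18_000_000]  # bits/s, from the master manifest
SAFETY = 0.8          # safety factor from step 2
LOW_BUFFER_S = 4.0    # buffer threshold from step 5


def measure_bps(bytes_received: int, seconds: float) -> float:
    """Step 1: throughput of the segment just downloaded, in bits per second."""
    return bytes_received * 8 / seconds


def next_rendition(current: int, throughput_bps: float, buffer_s: float) -> int:
    """Return the index into BANDWIDTHS to use for the next segment."""
    if buffer_s < LOW_BUFFER_S:
        # Step 5: buffer is nearly empty, step down one tier regardless of throughput.
        return max(current - 1, 0)
    safe = throughput_bps * SAFETY
    # Step 3: scan from highest to lowest, take the first that fits.
    for i in range(len(BANDWIDTHS) - 1, -1, -1):
        if BANDWIDTHS[i] <= safe:
            return i
    return 0  # nothing fits; fall back to the lowest rendition


bps = measure_bps(2_500_000, 2.0)                 # a 2.5 MB segment in 2 s = 10 Mbps
print(next_rendition(1, bps, buffer_s=12.0))      # 2 (the 5 Mbps / 1080p tier)
```

With 10 Mbps measured, the safe throughput is 8 Mbps, so the player moves up to the 5 Mbps tier but not to the 18 Mbps one.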

Worked transcode job trace

Timeline for a 10-minute 4K upload (7 GB raw file, 4-worker parallel transcode):

T+0s        Creator calls POST /v1/videos → gets video_id + presigned upload URL
T+0–180s    Client uploads 7 GB in 140 × 50 MB parts (direct to S3)
T+181s      CompleteMultipartUpload succeeds → S3 fires s3:ObjectCreated event
T+182s      API server receives event, creates job record (status=queued), enqueues job
T+183s      POST /v1/videos/vid_.../publish → 202 Accepted {transcode_job_id: "tjob_..."}
T+184s      Worker W1 claims job (status=processing, locked_until=T+10min)
T+184–280s  W1 runs FFmpeg: 360p rendition → 150 segments → writes 360p/playlist.m3u8
T+184–310s  W2 runs FFmpeg: 720p rendition → 150 segments → writes 720p/playlist.m3u8
T+184–370s  W3 runs FFmpeg: 1080p rendition → 150 segments → writes 1080p/playlist.m3u8
T+184–420s  W4 runs FFmpeg: 4K rendition → 150 segments → writes 4k/playlist.m3u8
T+421s      W4 (last to finish) writes master manifest.m3u8 → video is now "ready"
T+421s      job.status = complete; webhook fires to creator; video_id status = "ready"

Total transcode wall time: ~4 min (parallel) vs ~11 min (sequential)

Operating & debugging it

The transcode pipeline has four independently observable stages. Most production issues fall into one of them.

Key metrics to monitor

Metric | Where to observe | Alert threshold (example)
Queue depth (jobs waiting) | SQS / Pub/Sub console; CloudWatch | >500 jobs queued for >5 min → scale workers
Job age (oldest queued job) | Custom metric: now() - created_at for status=queued | >10 min → worker may be stuck or under-provisioned
Transcode failure rate | Status=failed jobs / total jobs; log-based metric | >2% failure rate → investigate FFmpeg errors
Worker lock renewal failures | Application logs for "lock expired, job re-queued" | Any → worker OOM or crash
CDN hit ratio | CDN analytics dashboard; X-Cache HIT/MISS header sampling | <95% → check Cache-Control headers on segments
Origin segment request rate | Object storage access logs; CloudFront origin requests | Spike → CDN miss storm, likely new viral video or TTL misconfiguration

Inspecting a stuck or failed transcode

$ curl -s https://api.example.com/v1/videos/vid_2wXk9mPqRn7v/transcode-status \
    -H "Authorization: Bearer $TOKEN" | jq .
{
  "job_id": "tjob_8RpQn3LvMz1w",
  "status": "failed",
  "attempt": 2,
  "max_attempts": 3,
  "error": "FFmpeg non-zero exit: 1 — Invalid data found when processing input",
  "renditions": [
    {"quality": "360p", "status": "complete"},
    {"quality": "720p", "status": "complete"},
    {"quality": "1080p", "status": "failed", "error": "encoder overload"},
    {"quality": "4K", "status": "skipped"}
  ]
}
# "Invalid data found when processing input" = corrupted or truncated source file
# "encoder overload" = worker ran out of CPU/RAM mid-encode (check worker instance size)

# Verify a specific segment exists and is cacheable
$ curl -I https://cdn.example.com/vid_2wXk9mPqRn7v/1080p/seg_0001.ts
HTTP/2 200
cache-control: public, max-age=31536000, immutable
x-cache: HIT
content-type: video/MP2T
# If x-cache: MISS on a popular segment → segments not being written with correct Cache-Control

# Inspect the master manifest
$ curl -s https://cdn.example.com/vid_2wXk9mPqRn7v/manifest.m3u8
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=640x360
360p/playlist.m3u8
# If manifest.m3u8 returns 404, transcode never completed (all renditions must finish first)

Symptom | Likely cause | Fix
Job stuck in queued for >5 min | No available workers; queue backlog; workers crashed | Scale out worker fleet; check worker logs for OOM or crash; verify queue subscription is active
Job repeatedly fails with "lock expired" | Worker machine is too slow or OOM; lock TTL too short | Increase worker instance size; extend lock TTL; ensure lock renewal is running on a background thread
Manifest 404 after job shows "complete" | Worker wrote manifest to wrong path; object storage replication lag | Compare expected vs actual manifest key in object storage; check worker path-generation code
CDN hit ratio drops suddenly | Cache-Control header removed from segments; CDN config change; new viral video warming edge caches | curl -I a segment URL and check cache-control; new viral video miss storm resolves itself quickly
Player stalls at quality switches | GOP misalignment between renditions; segment duration mismatch | Ensure all renditions use the same keyframe interval and segment duration in FFmpeg flags
FFmpeg "Invalid data" error | Corrupted or truncated upload; incomplete CompleteMultipartUpload | Re-upload the source file; verify all part ETags before calling CompleteMultipartUpload

⚠️ Gotcha: writing the manifest before all renditions finish

If your worker writes manifest.m3u8 as soon as the first rendition completes, CDN edges worldwide will immediately cache a manifest that references rendition playlists that do not yet exist. Players will fetch those playlists and get 404s for every quality tier above the one that finished first. The master manifest must be written atomically after all rendition playlist.m3u8 files are confirmed present in object storage. In a multi-worker setup, use a distributed counter or barrier: each worker atomically decrements a "renditions remaining" counter; the worker that brings it to zero writes the manifest.
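The "renditions remaining" counter can be sketched with threads standing in for workers. Here a `threading.Lock` plays the role of an atomic decrement in a shared store (for example a Redis DECR); the `RenditionBarrier` class and the worker function are hypothetical.

```python
import threading


class RenditionBarrier:
    """Counts down finished renditions; exactly one caller sees zero."""

    def __init__(self, total: int) -> None:
        self._remaining = total
        self._lock = threading.Lock()  # stands in for an atomic DECR in a shared store

    def rendition_done(self) -> bool:
        """Return True only for the worker that finishes the last rendition."""
        with self._lock:
            self._remaining -= 1
            return self._remaining == 0


barrier = RenditionBarrier(total=4)
manifest_writers = []


def worker(name: str) -> None:
    # ... transcode and upload this rendition's playlist.m3u8 here ...
    if barrier.rendition_done():
        manifest_writers.append(name)  # this worker writes manifest.m3u8


threads = [threading.Thread(target=worker, args=(q,)) for q in ("360p", "720p", "1080p", "4k")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever rendition finishes last, exactly one worker ends up responsible for the manifest. One caveat: a retried rendition must not decrement the counter a second time, so in practice the decrement should be made idempotent per rendition.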

Quiz

🧠 Quick check

Q1: Why use presigned URLs for video upload instead of routing the bytes through your API server?

The API server's network capacity is sized for JSON requests, not multi-gigabyte binary transfers. A presigned URL lets the client write directly to object storage — the API server issues only a small signed token, then steps completely out of the data path. Your API fleet stays available for the workload it was sized for.

Q2: Why is adaptive streaming (HLS/DASH) preferable to serving a single MP4 file for video playback?

Adaptive bitrate streaming continuously monitors download throughput and switches between renditions mid-playback — seamlessly stepping from 1080p down to 360p when a mobile viewer moves into weak signal, then back up again. A single MP4 at a fixed bitrate either buffers constantly on a slow connection or under-serves a fast viewer who could get better quality.

Q3: Your video platform achieves a CDN segment cache hit ratio of 98%. What does that mean for your origin servers?

A 98% hit ratio means 98 out of every 100 segment requests are served directly from CDN edge caches without reaching the origin at all. Your origin fleet only processes the remaining 2% — typically the first viewer per region per segment. This is what allows a platform with 10 million concurrent viewers to run on a modest origin cluster rather than a data-center-scale serving fleet.

Practice: design the transcode job schema and completion webhook

You are designing the internal job record that tracks a video transcode, and the webhook payload that fires when the job completes. A well-designed schema here makes retry logic, observability, and client integrations far simpler.

Part 1 — Transcode job record

Design the JSON schema for a transcode job stored in your database. Your schema must support:

Part 2 — Completion webhook payload

Design the JSON body sent to a registered webhook URL when the transcode job finishes (success or failure). Your payload must:

Rubric

Key takeaways

Sources & further reading