API Design

Security · Lesson 10

Threat modeling with STRIDE

Security reviews that happen after shipping find the same things every time — because attackers follow patterns. STRIDE is a forcing function that names those patterns before you write line one, so you spend engineering effort on the right locks.

⏱ 16 min · Difficulty: advanced · Prereq: all Security lessons

By the end you'll be able to

What threat modeling actually is

A threat model is a structured conversation about what could go wrong — before a system is fully built. It does not require formal tools or security specialists. It requires one discipline: look at your own system the way an attacker would, and do it while you still have room to change things.

The output is not a polished report. It is a list of threats, each with an owner and a decision: fix it, accept the residual risk, transfer it (e.g. to your cloud provider), or defer it with a date. Anything without a decision is just anxiety.

Think of it like a fire drill run before the building is occupied. No one has been hurt yet, but you walk every exit to confirm it opens. If a door is jammed, you fix the door — not the evacuation plan.

Trust boundaries and data-flow thinking

Before naming threats, you need a map of your system from the attacker's perspective. The key concept is the trust boundary: a line in your architecture where data crosses from a context you control to one you don't — or from one trust level to another.

Common trust boundaries in an API:

Every time data crosses a boundary, ask: can an attacker influence what crosses, intercept it, or replay it? That question, run for each boundary, surfaces the threats STRIDE will name.

[Figure: trust-boundary data-flow diagram for a payments API.
  Untrusted internet: client app (browser / mobile); attacker on the same network.
  App tier (your perimeter): API Gateway (rate limit, TLS termination); Auth service (JWT validation); Transfer service (POST /transfers, balance checks, audit log).
  Data tier: Accounts DB (balances, transaction history); append-only audit log.
  TB-1: every STRIDE threat is possible here (untrusted input). TB-2: lateral movement, privilege escalation. TB-3: SQL injection, direct DB access.]
Trust-boundary data-flow diagram for a payments API. Each numbered boundary (TB) is a place where threats concentrate. STRIDE gives you a checklist to run at each one.

STRIDE: six threat categories, six questions

STRIDE was developed at Microsoft in 1999 and remains the most practical mnemonic for API threats because each letter maps directly to a property your system needs to preserve — and an attack class that violates it.

Each entry lists: letter · threat · property violated, then an API example and the mitigation.

S · Spoofing · violates Authentication
  API example:  Attacker sends requests with a stolen session cookie or a forged JWT claim ("sub":"admin").
  Mitigation:   Strong authN: short-lived JWTs with signature validation, MFA, HttpOnly cookies. (→ sec-05, sec-08, sec-09)

T · Tampering · violates Integrity
  API example:  MITM strips HTTPS and modifies a POST /transfers body in transit to change the recipient account.
  Mitigation:   TLS everywhere, HSTS, request signing for critical operations. (→ sec-02)

R · Repudiation · violates Non-repudiation
  API example:  A user claims they never issued a transfer; no server-side evidence contradicts them.
  Mitigation:   Append-only audit log with actor ID, timestamp, request fingerprint, and action detail. Never delete audit rows.

I · Information disclosure · violates Confidentiality
  API example:  Error response leaks an internal stack trace with DB table names and credentials in the message.
  Mitigation:   Generic error responses to clients; structured internal logs. Input validation to block injection. (→ sec-03)

D · Denial of service · violates Availability
  API example:  Attacker floods POST /transfers with 50 000 req/s, exhausting the DB connection pool for legitimate users.
  Mitigation:   Rate limiting per user and per IP at the gateway, connection pool limits, circuit breakers, timeout budgets.

E · Elevation of privilege · violates Authorization
  API example:  A regular user adds "role":"admin" to their own JWT claims, and the server trusts it because it never validates the signature.
  Mitigation:   Enforce authZ on every endpoint; validate JWT signatures server-side; least privilege: each service account has only what it needs. (→ sec-05, sec-08)
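
The S and E rows both rest on server-side signature validation. The following is a minimal sketch of what that check does for an HS256 token, using only the Python standard library. The helper names (`sign_hs256`, `verify_hs256`) and the demo secret are hypothetical, and expiry checks are deliberately left out; a production service would normally use a maintained JWT library instead of hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a token (demo only, no expiry claim handling)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes) -> Optional[dict]:
    """Return the claims only if the signature matches, otherwise None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        if header.get("alg") != "HS256":  # reject "none" and unexpected algorithms
            return None
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return None
        return json.loads(b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return None


SECRET = b"demo-secret-not-for-production"
token = sign_hs256({"sub": "user_42", "role": "user"}, SECRET)

# The attack from the E row: swap the payload, keep the old signature.
header_b64, _, sig_b64 = token.split(".")
forged_payload = b64url_encode(json.dumps({"sub": "user_42", "role": "admin"}).encode("utf-8"))
forged = f"{header_b64}.{forged_payload}.{sig_b64}"

print(verify_hs256(token, SECRET))   # {'sub': 'user_42', 'role': 'user'}
print(verify_hs256(forged, SECRET))  # None
```

The forged token fails because the signature covers the exact header and payload bytes; changing one claim changes the expected signature.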

Worked example: STRIDE on POST /transfers

Take a single endpoint — POST /transfers in a payments API — and run STRIDE systematically. This is what a real threat-modeling session produces, in condensed form:

# Endpoint: POST /transfers
# Request body: { "from": "acc_1", "to": "acc_2", "amount_cents": 5000 }
# Trust boundary crossed: public internet → app tier (TB-1)

──────────────────────────────────────────────────────────────────
 STRIDE    Threat                              Mitigation
──────────────────────────────────────────────────────────────────
 Spoofing  Stolen Bearer token reused after   Short expiry (15 min) +
           the legitimate user logs out.       denylist on logout. (sec-08)

 Tamper    Body intercepted; "to" changed     TLS 1.3 + HSTS preload
           to attacker account in transit.     on all environments. (sec-02)

 Repudia   User denies authorising transfer;  Append-only ledger row: actor,
           no log to contradict them.          timestamp, IP, signed request.

 Info disc  DB constraint error bubbles up:   Catch all DB exceptions; return
           "duplicate key: transactions(id)"  generic 500 with trace-id only.
           — reveals schema detail.            Log full error internally. (sec-03)

 DoS       Attacker hammers endpoint until     Per-user rate limit: 10 req/min.
           DB pool exhausted for others.       Global limit: 1000 req/min.
                                               Circuit breaker on DB pool.

 Elevate   User forges "from":"acc_admin"      Server validates that JWT sub
           to transfer from another user's     matches the "from" account owner
           account.                            before processing. (sec-05)
──────────────────────────────────────────────────────────────────
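
The DoS row names a per-user limit of 10 requests per minute. One simple way to get that behaviour is a fixed-window counter, sketched below in Python. The class name, the in-memory dictionary, and the injectable clock are assumptions for illustration; a real gateway would keep counters in a shared store and might prefer a sliding window or token bucket, since fixed windows allow a burst at each window edge.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` calls per key in each window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._counts: Dict[str, Tuple[int, int]] = {}  # key -> (window index, count)

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counts.get(key, (window, 0))
        if seen_window != window:  # a new window started, reset the counter
            count = 0
        if count >= self.limit:
            self._counts[key] = (window, count)
            return False
        self._counts[key] = (window, count + 1)
        return True


# Demo with a fake clock so the result does not depend on wall time.
now = [0.0]
limiter = FixedWindowLimiter(limit=10, window_seconds=60, clock=lambda: now[0])

codes = [201 if limiter.allow("user_42") else 429 for _ in range(15)]
print(codes.count(201), codes.count(429))  # 10 5

now[0] = 61.0                              # next minute
print(limiter.allow("user_42"))            # True
```

Keying the limiter by authenticated user and, separately, by IP gives the two limits the table describes.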

When and how to run it

The highest-leverage time is during design — before any code is written. A 90-minute whiteboard session with the feature author, a backend engineer, and one person playing devil's advocate covers most surfaces for a new endpoint. Run it again whenever the trust-boundary map changes: new external integration, new user role, new data sensitivity level.

  1. Draw the data-flow diagram. Boxes for processes and data stores, arrows for data flows, dashed lines for trust boundaries. Five minutes on a whiteboard. If you can't draw it, you don't understand the system yet.
  2. List the elements at each trust boundary. For each arrow that crosses a dashed line: what data, from whom, to whom.
  3. Run STRIDE per element. For each data flow, ask all six questions. Not all six will apply — that's fine. Write down the ones that do and their mitigations.
  4. Decide on each threat. Fix, accept (with explicit rationale), transfer (cloud provider handles TLS), or flag for later. Any threat without a decision is unfinished work.
  5. Track it. A threat model that lives only in a meeting is dead by Thursday. A two-column table (threat → decision) in the design doc survives code reviews and onboarding.

🎯 Interview angle

"How would you threat-model this API?" is a senior-level system design question. The interviewer is not checking whether you know STRIDE by name — they're checking whether you think systematically about adversaries rather than just happy paths. Start by drawing the trust boundaries: "Before I list threats, let me sketch who sends data to what, and where it crosses from untrusted to trusted." Then walk S–T–R–I–D–E at one boundary. That structure — diagram first, then enumerate — shows engineering discipline and will distinguish you from candidates who improvise a list of vague security concerns.

⚠️ Common traps

Only modeling the happy path. A threat model built around "the authenticated user does the intended operation" is not a threat model — it's a spec. The whole point is to ask what happens when the caller is hostile, the data is malformed, the upstream is slow, or the token was stolen. Force yourself to phrase threats as attacker goals: "an attacker who wants to X could do so by Y."

Treating it as a one-time audit. Systems change. Every time you add a new external integration, expand a role's permissions, or store a new category of personal data, the trust-boundary map changes and so does the threat surface. A threat model is a living document, not a checkbox. Schedule a review when the diagram changes.

✅ Do this, not that

Do treat mitigations as references to specific controls already in your architecture ("rate limiting at the gateway", "JWT expiry 15 min"), not aspirational notes ("add security later"). Don't file every threat as "fix it" — explicit acceptance with a rationale ("we accept the DoS risk on this internal debugging endpoint because it's network-isolated and the attacker would already need VPN access") is a legitimate engineering decision. A threat model that marks everything as critical is useless.

Under the hood: how it actually works

Reading about STRIDE is not the same as running a session. This section shows you the mechanics of an actual 90-minute threat modeling meeting, then traces a complete worked example at a level of detail that lets you run one yourself.

Running a STRIDE session: the four artifacts you produce

  1. Data-flow diagram (DFD). Draw boxes (processes and data stores) connected by labeled arrows (data flows). Mark every dashed boundary line where trust changes. Rule: if you cannot draw it, you do not understand the system. This takes 10–15 minutes and surfaces architecture you hadn't articulated yet.
  2. Element inventory. List every arrow that crosses a trust boundary. For each: what data type, from whom, to whom, is it authenticated, is it encrypted in transit.
  3. STRIDE threat table. For each element in the inventory, run the six questions. Write every threat as a concrete attacker action ("attacker replays captured token within 15-minute window") not a vague category ("token reuse"). Assign a severity: High / Medium / Low.
  4. Decision register. For every threat: Fix, Accept, Transfer, Defer — with an owner and due date. This is the only artifact that must survive the meeting.
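
Because the decision register is the artifact that has to outlive the meeting, it can help to keep it as data and check it mechanically. The sketch below is one possible shape in Python, not a prescribed format. The `Threat` fields and the rules in `unfinished_work` are assumptions drawn from the list above (every threat needs a decision and an owner; here a due date is required for Fix and Defer, and a rationale for Accept). The last two register entries are deliberately incomplete for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

DECISIONS = {"fix", "accept", "transfer", "defer"}


@dataclass
class Threat:
    stride: str
    summary: str
    severity: str                   # "High" / "Medium" / "Low"
    decision: Optional[str] = None
    owner: Optional[str] = None
    due: Optional[str] = None
    rationale: Optional[str] = None


def unfinished_work(register: List[Threat]) -> List[str]:
    """Return one message per gap; an empty list means the register is complete."""
    problems = []
    for t in register:
        if t.decision not in DECISIONS:
            problems.append(f"{t.stride}: no decision recorded")
            continue
        if not t.owner:
            problems.append(f"{t.stride}: no owner")
        if t.decision in {"fix", "defer"} and not t.due:
            problems.append(f"{t.stride}: no due date")
        if t.decision == "accept" and not t.rationale:
            problems.append(f"{t.stride}: accepted without rationale")
    return problems


register = [
    Threat("Spoofing", "Stolen JWT replayed after logout", "High",
           decision="fix", owner="Auth team", due="Sprint 4"),
    Threat("Elevation of privilege", "Caller debits an account they do not own", "High",
           decision="fix", owner="Backend", due="Sprint 3"),
    Threat("Denial of service", "Flood on an internal debugging endpoint", "Low",
           decision="accept", owner="Infra"),                          # rationale missing
    Threat("Tampering", "Request body modified in transit", "High"),  # no decision yet
]

for problem in unfinished_work(register):
    print(problem)
# Denial of service: accepted without rationale
# Tampering: no decision recorded
```

A check like this can run in CI against the register file, so a threat without a decision fails the build instead of fading after the meeting.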

Fully worked threat model: POST /transfers

The endpoint accepts a money transfer request from an authenticated user. Trust boundary crossed: TB-1 (untrusted internet → app tier). Here is every STRIDE threat enumerated with the concrete attack and chosen mitigation:

Each entry lists: STRIDE category · severity, then the concrete threat (attacker goal → method), the mitigation, and the decision.

Spoofing · High
  Threat:      Attacker steals Alice's JWT (e.g. from XSS on another page) and replays it within its 15-minute validity window to initiate a transfer from Alice's account.
  Mitigation:  Short JWT expiry (15 min); token denylist entry on logout; device fingerprint or IP binding for high-value operations; MFA step-up for transfers above threshold.
  Decision:    Fix. JWT expiry already set; add denylist on logout (owner: Auth team, Sprint 4).

Tampering · High
  Threat:      MITM on a network that intercepts the TLS handshake (e.g. corp proxy with CA injection) modifies "to_account" from "acc_7890" to the attacker's account before it reaches the server.
  Mitigation:  TLS 1.3 with HSTS preload (prevents downgrade); certificate pinning for mobile clients; request-body HMAC for ultra-high-value transactions.
  Decision:    Fix. HSTS already deployed; certificate pinning deferred (owner: Mobile team, Q3).

Repudiation · Medium
  Threat:      User initiates a transfer, then calls support claiming they never authorised it. Without a log that ties the request to their authenticated identity, there is no evidence to contradict them.
  Mitigation:  Append-only transaction ledger record containing: JWT sub, source IP, device ID, timestamp (microsecond), full request body hash, response status. Ledger rows are immutable (no UPDATE/DELETE). Ship to an append-only WORM store.
  Decision:    Fix. Basic log exists; add request body hash and immutability guarantee (owner: Backend, Sprint 5).

Information disclosure · Low
  Threat:      DB constraint violation on a duplicate transaction ID bubbles up as a 500 with body {"detail": "duplicate key value violates unique constraint \"transactions_pkey\""}, revealing the primary-key constraint name and that the transactions table exists.
  Mitigation:  Catch all DB/runtime exceptions at the service boundary; return {"error": "internal_error", "trace_id": "abc-123"} to the caller; log the full error internally tagged with the same trace ID.
  Decision:    Fix. Add exception handler middleware (owner: Backend, Sprint 3).

Denial of service · High
  Threat:      Attacker script submits 50,000 transfer requests per minute from rotating IPs, exhausting the DB connection pool (max 100 connections) and making the service unavailable for legitimate users.
  Mitigation:  Rate limit: 10 req/min per authenticated user, 60 req/min per IP at the gateway. DB connection pool with max 80 connections and a 2-second wait timeout (fail fast). Circuit breaker trips at 50% error rate over a 30s window.
  Decision:    Fix. Per-user rate limit deployed; per-IP limit in gateway backlog (owner: Infra, Sprint 4).

Elevation of privilege · High
  Threat:      Attacker modifies the from_account field in the request body to an account they do not own ("from": "acc_0001", the CEO's account). The server uses the JWT sub for authentication but does not verify the caller owns from_account before debiting it.
  Mitigation:  Object-level authorization: before processing, assert jwt.sub == accounts.owner_user_id WHERE id = request.from_account. Return 403 if not matched. This is OWASP BOLA / IDOR prevention. (→ sec-05)
  Decision:    Fix. Add ownership check at service layer (owner: Backend, Sprint 3, P0).
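
The ownership assertion in the Elevation entry can be sketched in a few lines of Python. Everything named here is hypothetical: `ACCOUNT_OWNERS` stands in for the accounts table, and `handle_transfer` for the service-layer handler. Returning the same 403 for unknown and unowned accounts is a design choice of this sketch, made so the response does not reveal which account IDs exist.

```python
from typing import Dict, Tuple


class Forbidden(Exception):
    """Raised when the caller does not own the account being debited."""


# Stand-in for: SELECT owner_user_id FROM accounts WHERE id = :from_account
ACCOUNT_OWNERS: Dict[str, str] = {
    "acc_1": "user_42",
    "acc_2": "user_77",
    "acc_9999": "user_1",
}


def authorize_debit(jwt_sub: str, from_account: str) -> None:
    owner = ACCOUNT_OWNERS.get(from_account)
    if owner is None or owner != jwt_sub:
        raise Forbidden(from_account)


def handle_transfer(jwt_sub: str, body: dict) -> Tuple[int, dict]:
    """jwt_sub must come from an already verified token, never from the body."""
    try:
        authorize_debit(jwt_sub, body["from"])
    except Forbidden:
        return 403, {"error": "forbidden"}
    # ... balance checks, ledger write, and audit row would follow here ...
    return 201, {"status": "completed"}


print(handle_transfer("user_42", {"from": "acc_1", "to": "acc_2", "amount_cents": 500}))
# (201, {'status': 'completed'})
print(handle_transfer("user_42", {"from": "acc_9999", "to": "acc_2", "amount_cents": 1}))
# (403, {'error': 'forbidden'})
```

The check runs on every request, before any money moves, and it compares against server-side data only.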

How the DFD maps to the threat table

Every row in the table above corresponds to a specific arrow crossing TB-1 in the data-flow diagram. The discipline is: for each crossing arrow, you ask all six STRIDE questions. If you skip an arrow, you get a blind spot. The worked example above focuses on the single arrow "Client app → API Gateway / Transfer service." In a full session you would repeat the exercise for every arrow in the diagram — including internal ones like "Transfer service → Accounts DB" (where SQL injection, direct DB access without auth, and connection pool exhaustion live).
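
The "every arrow, all six questions" discipline is easy to state and easy to skip. A small Python sketch can generate the checklist and flag arrows that never made it into the threat table. The flow names come from the text above; the property mapping follows the STRIDE table earlier in the lesson; the function names and the row format are assumptions for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

# STRIDE category -> property it violates.
STRIDE: Dict[str, str] = {
    "Spoofing": "Authentication",
    "Tampering": "Integrity",
    "Repudiation": "Non-repudiation",
    "Information disclosure": "Confidentiality",
    "Denial of service": "Availability",
    "Elevation of privilege": "Authorization",
}

FLOWS: List[str] = [
    "Client app -> API Gateway / Transfer service",
    "Transfer service -> Accounts DB",
]


def checklist(flows: List[str]) -> List[Tuple[str, str, str]]:
    """One (flow, category, prompt) triple per arrow and STRIDE letter."""
    return [
        (flow, category, f"How could an attacker break {prop.lower()} on this flow?")
        for flow, (category, prop) in product(flows, STRIDE.items())
    ]


def blind_spots(flows: List[str], threat_table: List[dict]) -> List[str]:
    """Arrows with no row at all in the threat table."""
    covered = {row["flow"] for row in threat_table}
    return [flow for flow in flows if flow not in covered]


threat_table = [
    {"flow": FLOWS[0], "stride": "Spoofing", "threat": "Stolen JWT replayed"},
    {"flow": FLOWS[0], "stride": "Elevation of privilege", "threat": "Debit from unowned account"},
]

print(len(checklist(FLOWS)))             # 12
print(blind_spots(FLOWS, threat_table))  # ['Transfer service -> Accounts DB']
```

A category that does not apply to an arrow is still worth recording as "not applicable", so its absence is a decision and not an oversight.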

How to debug & inspect it

Threat modeling findings need to become tests and monitoring, otherwise they evaporate after the meeting. This section shows how to operationalize each STRIDE category so you can verify mitigations are actually working.

# S: Spoofing. Verify the JWT denylist is enforced on logout.
$ TOKEN=$(curl -s -X POST https://api.acme.com/v1/auth/login \
    -H "Content-Type: application/json" \
    -d '{"email":"test@acme.com","password":"pw"}' | jq -r .access_token)
$ curl -s -H "Authorization: Bearer $TOKEN" https://api.acme.com/v1/account
{"id":42,"email":"test@acme.com"}
$ curl -s -X POST -H "Authorization: Bearer $TOKEN" https://api.acme.com/v1/auth/logout
{"status":"logged_out"}
$ curl -s -H "Authorization: Bearer $TOKEN" https://api.acme.com/v1/account
{"error":"token_revoked"}
# must be 401, not 200; confirms the denylist works

# That token is now revoked. Log in again so the checks below use a valid token.
$ TOKEN=$(curl -s -X POST https://api.acme.com/v1/auth/login \
    -H "Content-Type: application/json" \
    -d '{"email":"test@acme.com","password":"pw"}' | jq -r .access_token)

# T: Tampering. Verify HSTS is deployed (no HTTP fallback).
$ curl -sI http://api.acme.com/v1/transfers
HTTP/1.1 301 Moved Permanently
Location: https://api.acme.com/v1/transfers
# must redirect to HTTPS; never return 200 on HTTP
$ curl -sI https://api.acme.com/v1/transfers | grep -i strict
strict-transport-security: max-age=31536000; includeSubDomains; preload

# R: Repudiation. Verify the audit log captures a transfer.
$ curl -s -X POST https://api.acme.com/v1/transfers \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"from":"acc_1","to":"acc_2","amount_cents":500}'
{"id":"txn_abc","status":"completed"}
$ psql -c "SELECT actor_sub, source_ip, body_hash, created_at FROM audit_log WHERE ref='txn_abc';"
 actor_sub | source_ip   | body_hash          | created_at
 42        | 203.0.113.5 | sha256:2cf24dba5f… | 2026-06-20 10:30:00.123

# I: Info disclosure. Confirm generic errors are returned.
$ curl -s -X POST https://api.acme.com/v1/transfers \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"from":"acc_1","to":"acc_1","amount_cents":500}'
{"error":"internal_error","trace_id":"7f3a9b"}
# must NOT contain "duplicate key", table name, or stack trace

# D: DoS. Verify rate limiting trips.
# Run this in a fresh rate-limit window: the transfers above count toward the per-user limit.
$ for i in $(seq 1 15); do curl -s -o /dev/null -w "%{http_code}\n" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -X POST https://api.acme.com/v1/transfers \
    -d '{"from":"acc_1","to":"acc_2","amount_cents":1}'; done
201 201 201 201 201 201 201 201 201 201
429 429 429 429 429
# one status code per line in real output; the 11th and later requests must return 429 (limit: 10/min per user)

# E: Elevation. Verify the BOLA check (transfer from an account you don't own).
$ curl -s -X POST https://api.acme.com/v1/transfers \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"from":"acc_9999","to":"acc_2","amount_cents":1}'
{"error":"forbidden"}
# must be 403; acc_9999 is not owned by this JWT sub

Turn each of the above into a test in your integration/contract test suite so regressions are caught before they ship.

Each entry lists: STRIDE finding, then the test type and the monitoring signal (production).

Spoofing: revoked token accepted
  Test type:          Integration test: logout then replay token, assert 401
  Monitoring signal:  Alert on any 200 from a token whose jti appears in the denylist table

Tampering: HTTP not redirecting
  Test type:          Smoke test: HTTP GET → assert 301 + Location is HTTPS
  Monitoring signal:  Monitor HSTS header presence; alert if missing from any response

Repudiation: audit row missing
  Test type:          Integration test: submit transfer, assert audit row exists with correct fields
  Monitoring signal:  Alert if transaction count diverges from audit_log row count over a 5-minute window

Info disclosure: stack trace in response
  Test type:          Unit test: inject DB error, assert response body contains only error + trace_id
  Monitoring signal:  Log-scan alert on any production response body matching /Exception|stack trace|column|relation/

DoS: rate limit not enforced
  Test type:          Load test: 15 requests in 30s per user token, assert 429 after the 10th
  Monitoring signal:  Alert on p99 latency > 2s or DB connection pool utilization > 80%

Elevation: BOLA on from_account
  Test type:          Integration test: transfer from another user's account, assert 403
  Monitoring signal:  Alert on any 2xx for a transfer where from_account.owner != jwt.sub (anomaly detection rule)
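
As an illustration of the "inject DB error" unit test for information disclosure, here is a self-contained Python sketch. The `guarded` decorator, `FakeDbError`, and `create_transfer` are hypothetical stand-ins for real middleware, a driver exception, and a handler. The leak pattern extends the log-scan expression with "duplicate key" and "constraint", since the original pattern alone would not match the example error message.

```python
import json
import logging
import re
import uuid

logger = logging.getLogger("transfers")
logger.addHandler(logging.NullHandler())  # quiet for the sketch; use real log shipping in a service

LEAK_PATTERN = re.compile(
    r"Exception|stack trace|column|relation|duplicate key|constraint", re.IGNORECASE
)


class FakeDbError(Exception):
    """Stand-in for a database driver error (hypothetical)."""


def guarded(handler):
    """Wrap a handler so callers only ever see a generic error and a trace ID."""
    def wrapper(request_body):
        try:
            return 201, handler(request_body)
        except Exception:
            trace_id = uuid.uuid4().hex[:12]
            # The full error, with traceback, goes to internal logs under the same ID.
            logger.exception("unhandled error trace_id=%s", trace_id)
            return 500, {"error": "internal_error", "trace_id": trace_id}
    return wrapper


@guarded
def create_transfer(request_body):
    # Injected failure: simulate the duplicate-key violation from the threat table.
    raise FakeDbError('duplicate key value violates unique constraint "transactions_pkey"')


def test_db_error_is_not_leaked():
    status, body = create_transfer({"from": "acc_1", "to": "acc_2", "amount_cents": 500})
    assert status == 500
    assert set(body) == {"error", "trace_id"}
    assert body["error"] == "internal_error"
    assert LEAK_PATTERN.search(json.dumps(body)) is None


test_db_error_is_not_leaked()
```

The same pattern can back the production log-scan alert, so the test and the monitor agree on what counts as a leak.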

Threat model review checklist:

  1. Is there a DFD with labeled trust boundaries? If you can't point to TB-1, TB-2, etc. in a diagram, the session is incomplete.
  2. Has every arrow crossing a trust boundary been enumerated in the threat table?
  3. Does every threat have a decision (Fix / Accept / Transfer / Defer) with an owner?
  4. Are "Accept" decisions accompanied by explicit rationale — not just "low priority"?
  5. Is there a test or monitoring signal for every "Fix" decision?
  6. Is the threat model in source control (or linked from the design doc) so it survives onboarding?
  7. Has a review been scheduled for the next time the trust boundary map changes?

🧠 Quick check

1. A user edits their own JWT payload to change "role":"user" to "role":"admin". Which STRIDE category does this attack fall under?

The attacker is not impersonating another user (Spoofing) or modifying someone else's data in transit (Tampering); they are expanding their own permissions beyond what the system granted. That's Elevation of Privilege. The mitigation is server-side signature validation: a JWT whose payload was modified will fail the HMAC or RSA signature check.

2. An error response includes "detail": "relation \"payments\" does not exist". Which STRIDE category is violated?

The DB error message leaks the table name "payments", which tells an attacker the schema structure and may hint at injection opportunities. That's a confidentiality failure — Information disclosure. Generic error messages with opaque trace IDs fix this.

3. At what point in the development lifecycle does threat modeling deliver the most value?

A threat found in design costs a conversation and a diagram change. The same threat found in a pen-test costs a code freeze, a retroactive design change, and rescheduled releases. Value is highest when discovery triggers the cheapest possible fix.

4. A transfer service has no audit log. Which STRIDE property is directly missing?

Repudiation threats are countered by non-repudiation controls — primarily audit logs. Without a log, a user can deny authorising a transfer and you have nothing to check against. Authentication tells you who called; the audit log proves what they did.
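
To make "proves what they did" concrete, the sketch below builds an audit row of the kind the worked example describes: actor, source IP, a hash of the request body, response status, and a microsecond timestamp. The function and class names are hypothetical, and the in-memory `AppendOnlyLog` only illustrates the interface (append and read, nothing else); real immutability has to come from the storage layer.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def body_hash(body: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal bodies hash equally.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def audit_record(actor_sub: str, source_ip: str, body: dict, status: int,
                 now: Optional[datetime] = None) -> dict:
    ts = now or datetime.now(timezone.utc)
    return {
        "actor_sub": actor_sub,
        "source_ip": source_ip,
        "body_hash": body_hash(body),
        "response_status": status,
        "created_at": ts.isoformat(timespec="microseconds"),
    }


class AppendOnlyLog:
    """Exposes append and read only; there is no update or delete."""

    def __init__(self):
        self._rows = []

    def append(self, row: dict) -> None:
        self._rows.append(dict(row))

    def rows(self) -> tuple:
        return tuple(dict(r) for r in self._rows)  # copies, so callers cannot edit history


log = AppendOnlyLog()
request = {"from": "acc_1", "to": "acc_2", "amount_cents": 500}
log.append(audit_record("user_42", "203.0.113.5", request, 201,
                        now=datetime(2026, 6, 20, 10, 30, 0, 123000, tzinfo=timezone.utc)))
print(log.rows()[0]["created_at"])  # 2026-06-20T10:30:00.123000+00:00
```

If the user later disputes the transfer, the stored hash can be compared with the hash of the request they claim to have sent.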

✍️ Exercise: threat-model a password-reset endpoint

You are designing POST /v1/auth/password-reset/request which accepts an email address and, if an account exists, sends a reset link. Run STRIDE against this endpoint: for each letter, name one concrete threat and one mitigation. You may accept a threat if you can justify it.

Model answer:

Spoofing:       Attacker triggers reset for a victim's account, then
                intercepts the link (e.g. open-redirect, email forwarding).
  Mitigation:   Reset tokens are single-use, short-lived (15 min), and
                bound to the requesting IP. Invalidate on use.

Tampering:      Link in email contains user ID in plain text;
                attacker edits it to target another account.
  Mitigation:   Token is opaque and random (not the user ID). Map token →
                user server-side. Never embed identity in the token.

Repudiation:    User claims they never requested a reset; attacker
                used it to lock them out.
  Mitigation:   Log: requester IP, timestamp, account, whether link was
                used. Notify account owner by email on request AND on use.

Info disclosure: Endpoint returns different response for existing vs.
                non-existing email → attacker enumerates user accounts.
  Mitigation:   Always return the same 200 response ("if an account
                exists, you'll receive an email"). Never distinguish.

DoS:            Attacker floods with 10 000 different emails, spamming
                users and exhausting the transactional email quota.
  Mitigation:   Rate-limit per IP: 3 req/min. CAPTCHA for anonymous
                callers. Per-account cooldown: 1 reset request per 5 min.

Elevation:      Accepted. This endpoint is unauthenticated by design;
                it cannot elevate privilege because the reset token grants
                only the ability to set a new password for one account,
                verified via the opaque single-use token. No admin paths involved.
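
Three of the mitigations in the model answer (identical responses, opaque random tokens, single use with a 15-minute lifetime) fit in a short Python sketch. The names `request_reset`, `redeem`, `USERS`, `PENDING`, and `OUTBOX` are hypothetical in-memory stand-ins for the endpoint, the user table, token storage, and the email sender. Storing only a hash of the token is an extra precaution added in this sketch, not something the model answer requires.

```python
import hashlib
import secrets
import time
from typing import Dict, List, Optional, Tuple

RESET_TTL_SECONDS = 15 * 60
GENERIC_RESPONSE = {"message": "If an account exists, you'll receive an email."}

USERS: Dict[str, str] = {"alice@example.com": "user_42"}
PENDING: Dict[str, Tuple[str, float]] = {}  # sha256(token) -> (user_id, expires_at)
OUTBOX: List[Tuple[str, str]] = []          # (email, token) pairs "sent" by the mailer


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def request_reset(email: str, now: Optional[float] = None) -> Tuple[int, dict]:
    now = time.time() if now is None else now
    address = email.strip().lower()
    user_id = USERS.get(address)
    if user_id is not None:
        token = secrets.token_urlsafe(32)   # opaque and random, carries no identity
        PENDING[_digest(token)] = (user_id, now + RESET_TTL_SECONDS)
        OUTBOX.append((address, token))
    # Same status and body whether or not the account exists.
    return 200, GENERIC_RESPONSE


def redeem(token: str, now: Optional[float] = None) -> Optional[str]:
    now = time.time() if now is None else now
    entry = PENDING.pop(_digest(token), None)  # pop makes the token single-use
    if entry is None:
        return None
    user_id, expires_at = entry
    return user_id if now <= expires_at else None


print(request_reset("alice@example.com", now=0.0) == request_reset("nobody@example.com", now=0.0))  # True
first_token = OUTBOX[0][1]
print(redeem(first_token, now=60.0))   # user_42
print(redeem(first_token, now=61.0))   # None
```

Response timing can still differ between the two branches (one sends an email, one does not), so a real implementation would typically hand the send to a background queue.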

Rubric: ✓ All six letters addressed ✓ Concrete threat (not vague "it could be attacked") ✓ Mitigation references a specific control ✓ At least one legitimate "accept" with justification ✓ Account-enumeration defence mentioned under I. Hitting four of five = solid; all five = ready for a production design review.

Key takeaways

Sources & further reading