Platform & API Product Engineering · Lesson 07
Platform & third-party auth (installed apps, Connect)
Basic OAuth lets one user authorize one app. But when a marketplace app is installed into 10,000 merchant stores, a new problem emerges: how does a single application credential authenticate against thousands of independent accounts — each with its own permission scope, its own revocation lifecycle, and its own blast radius?
By the end you'll be able to
- Explain the GitHub App credential chain: app JWT → installation access token, including why each is short-lived.
- Describe the Stripe Connect model and articulate how "act on behalf of" works without a per-account token.
- Design a token management strategy for 10,000 installs that avoids cold-start thundering herds and handles revocation cleanly.
Why this is different from basic OAuth
Standard OAuth (Lesson sec-06) solves the user-delegation problem: a single user grants your app access to their account and receives a token scoped to their data. The session is 1:1 — one user, one token, one account.
The installed-app model is fundamentally different. Think of a Shopify app that a merchant installs into their store. The same app binary, running on the same servers, needs to make authenticated API calls into 10,000 separate merchant accounts — each with different permissions granted during install, each capable of revoking independently at any time. The authorization is 1:N, not 1:1.
Three distinct patterns have emerged across the industry to solve this. They differ in where the credential lives, how tokens are scoped, and what "acting on behalf of" looks like on the wire.
Pattern A — the GitHub App model (per-install tokens)
GitHub Apps introduced a credential model that cleanly separates two identities: the app identity (who is this third-party application?) and the installation identity (which account did this specific install happen into?). These correspond to two separate tokens with different lifetimes and scopes.
The app JWT
The app holds an RSA private key (generated in the app's settings on GitHub and downloaded as a PEM file). To prove it is the app, it mints a short-lived JWT signed with RS256:
# GitHub App JWT: payload structure and signing
import time
import jwt  # PyJWT

issued_at = int(time.time()) - 60  # 60 s in the PAST (clock-skew buffer)
payload = {
    "iss": "123456",           # your GitHub App's numeric ID
    "iat": issued_at,          # issued-at
    "exp": issued_at + 600,    # issued-at + 600 s = 10-minute expiry (GitHub's hard max)
}
# Sign with RS256; private_key_pem holds the contents of the app's PEM private key
app_jwt = jwt.encode(payload, private_key_pem, algorithm="RS256")
The 10-minute lifetime is not arbitrary: it bounds how long a stolen app JWT is useful. GitHub verifies the signature with the public half of the key pair, which it keeps when the private key is generated; because only you hold the private key, no one else can mint a valid app JWT.
The installation access token
The app JWT proves the app's identity, but it cannot call GitHub resources directly. To access a specific installation (a specific org or user that installed the app), the app exchanges its JWT for a per-install token:
# Exchange app JWT for an installation access token
POST https://api.github.com/app/installations/{installation_id}/access_tokens
Authorization: Bearer {app_jwt}
Accept: application/vnd.github+json
# Response — scoped to THIS installation only
{
"token": "ghs_16C7e42F292c6912E7710c838347Ae178B4a",
"expires_at": "2026-06-20T12:10:00Z", // 1 hour from now
"permissions": {
"contents": "read", // only what this installation granted
"issues": "write"
},
"repository_selection": "all"
}
The installation token carries only the permissions that installation has approved, which can be narrower than the set the app currently requests in its manifest. If the account owner has approved contents:read but not yet a later request for pull_requests:write, the returned token reflects only the approved set. A leaked token's blast radius is therefore bounded by that installation's granted scope.
The 1-hour lifetime means your system must refresh tokens continuously. For 10,000 installs, that is a background process, not a one-time setup. See "By the numbers" below.
Pattern B — user OAuth tokens within an app
Installation tokens authenticate the app acting autonomously. But sometimes your app needs to act as a specific user — opening pull requests on their behalf, posting comments attributed to them, or accessing resources only that user can see. For this, GitHub Apps support a standard OAuth flow (sec-06) layered on top of the installation.
The user grants a user-level token via a redirect flow. That token carries the user's identity and only the permissions they consented to (which may be a subset of the installation's permissions). The key distinction: actions taken with a user token appear as that user in audit logs and comply with the user's own permission model, not the app's installation permissions. Use user tokens when the action should be attributed to a person; use installation tokens when the app is operating autonomously (scheduled syncs, webhooks, background jobs).
Pattern C — Stripe Connect (platform key + account header)
Stripe Connect takes a fundamentally different approach. Rather than issuing per-account tokens, Stripe lets your platform use its own secret key with a special header indicating which connected account to operate on:
# Stripe Connect — acting on behalf of a connected account
POST https://api.stripe.com/v1/charges
Authorization: Bearer sk_live_YourPlatformSecretKey
Stripe-Account: acct_1ExampleConnectedAcct
Content-Type: application/x-www-form-urlencoded
amount=2000&currency=usd&source=tok_visa
There is no token exchange, no per-account credential to manage, and no refresh cycle. Stripe's backend uses the Stripe-Account header to scope the operation to that connected account's data and settings. The charge appears in that account's dashboard, not yours.
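The request shape above can be sketched as a small builder that assembles, but does not send, a call on behalf of a connected account. This is a minimal illustration only: `build_connect_request` and the placeholder key are hypothetical names, not part of Stripe's SDK.

```python
from urllib.parse import urlencode

STRIPE_API_BASE = "https://api.stripe.com"


def build_connect_request(platform_key: str, connected_account: str,
                          path: str, params: dict) -> dict:
    """Assemble (but do not send) a request made on behalf of a connected account."""
    return {
        "method": "POST",
        "url": f"{STRIPE_API_BASE}{path}",
        "headers": {
            # Authentication is always the platform's own secret key.
            "Authorization": f"Bearer {platform_key}",
            # This header alone selects which connected account is acted on.
            "Stripe-Account": connected_account,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        "body": urlencode(params),
    }


req = build_connect_request(
    "sk_test_placeholder",            # hypothetical placeholder, not a real key
    "acct_1ExampleConnectedAcct",
    "/v1/charges",
    {"amount": 2000, "currency": "usd", "source": "tok_visa"},
)
print(req["headers"]["Stripe-Account"])
print(req["body"])
```

Note that switching accounts changes only one header value; the credential stays the same, which is exactly why a leaked platform key reaches every connected account.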
The trade-off is blast radius: your platform secret key is a single credential that, if leaked, gives an attacker the ability to operate against every connected account on your platform. The power is proportional to the risk. Stripe mitigates this with restricted keys (plat-06), granular permissions on those restricted keys, and the requirement that connected accounts explicitly authorize the platform during onboarding.
Stripe Connect also supports a separate OAuth flow that issues access tokens per connected account, giving you a per-account token model similar to Pattern A — useful when you need the platform to be able to act only on explicitly authorized accounts with independently revocable access.
Consent screens and scoped permissions per install
Every installed-app pattern involves a consent moment. The platform (GitHub, Stripe, Slack) shows the user a screen listing exactly what the app is requesting: which repositories, which Stripe capabilities, which Slack channels. The user's choices at that moment define the permission envelope for the installation.
Well-designed consent screens follow the principle of minimal disclosure: request only the permissions the app genuinely needs at install time. Apps that request everything "just in case" see lower conversion rates and more revocations. GitHub even lets apps request permissions on demand — the app starts with minimal permissions and requests additional ones later, triggering a new consent prompt only when needed.
Each installation's permission set is stored by the platform and included in every token it issues for that installation. Your app cannot simply request broader permissions at token refresh time — the permissions in the returned token are bounded by what was consented to at install time, not what the app manifest lists.
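Because the token's permissions are fixed by consent, an app can check them before attempting an operation instead of discovering the gap through a failed API call. The sketch below assumes the permission levels order as read < write < admin; `missing_permissions` is a hypothetical helper, not a platform API.

```python
# Permission levels ordered from weakest to strongest (assumed ordering).
LEVELS = {"read": 1, "write": 2, "admin": 3}


def missing_permissions(granted: dict, required: dict) -> dict:
    """Return the required permissions that the token does not satisfy."""
    missing = {}
    for name, needed in required.items():
        have = granted.get(name)
        if have is None or LEVELS[have] < LEVELS[needed]:
            missing[name] = needed
    return missing


# Permissions as they appear in an installation token response.
token_permissions = {"contents": "read", "issues": "write"}

ok = missing_permissions(token_permissions, {"contents": "read", "issues": "read"})
gap = missing_permissions(token_permissions, {"contents": "write", "pull_requests": "write"})
print(ok)
print(gap)
```

A non-empty result is a signal to ask the account owner for additional consent, not to retry the token exchange.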
Diagrams: the three authentication flows
Managing revocation and uninstall
When a user revokes an installation (uninstalls the app, disconnects their Stripe account, removes the app from their Slack workspace), the platform immediately invalidates all tokens issued for that installation. Your app must handle this gracefully:
- Listen for the webhook. GitHub sends an installation.deleted event; Stripe sends an account.application.deauthorized event; Slack sends an app_uninstalled event. Subscribe to these and treat them as authoritative.
- Purge the token from your cache immediately. Do not wait for the next refresh cycle. A cached token for a revoked installation will return 401 errors, and every such call is also an attempt to reach an account that has withdrawn its consent.
- Stop background jobs for that installation. Any scheduled sync, poll, or background task keyed to that installation_id must be halted. Failing to do this is a privacy violation if you continue reading data after authorization is withdrawn.
- Persist the revocation event. Mark the installation as revoked in your database so you do not accidentally re-add it if a stale message arrives out of order.
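The four steps above can be sketched with in-memory stand-ins for the token cache, job scheduler, and database. Everything here (`InstallState`, `handle_uninstall`, `handle_token_refreshed`) is a hypothetical illustration; a real service would back these with its own stores.

```python
from dataclasses import dataclass, field


@dataclass
class InstallState:
    """In-memory stand-ins for the token cache, job scheduler, and database."""
    token_cache: dict = field(default_factory=dict)  # installation_id -> token record
    active_jobs: set = field(default_factory=set)    # installation_ids with scheduled work
    revoked: set = field(default_factory=set)        # persisted revocation markers


def handle_uninstall(state: InstallState, installation_id: int) -> None:
    """Apply the revocation steps; safe to call more than once for the same install."""
    state.revoked.add(installation_id)            # persist the revocation first
    state.token_cache.pop(installation_id, None)  # purge the cached token immediately
    state.active_jobs.discard(installation_id)    # stop background work for the install


def handle_token_refreshed(state: InstallState, installation_id: int, record: dict) -> bool:
    """Ignore stale messages that would resurrect a revoked installation."""
    if installation_id in state.revoked:
        return False
    state.token_cache[installation_id] = record
    return True


state = InstallState(token_cache={42: {"token": "ghs_example"}}, active_jobs={42})
handle_uninstall(state, 42)
# A late, out-of-order refresh message must not re-add the install.
accepted = handle_token_refreshed(state, 42, {"token": "ghs_stale"})
print(accepted, 42 in state.token_cache, 42 in state.active_jobs)
```

Recording the revocation before purging means a crash between steps still leaves a marker that blocks stale writes.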
Under the hood: the app-JWT → installation-token exchange, step by step
Here is the exact sequence a well-implemented GitHub App server executes. Every step has a failure mode worth knowing.
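One plausible version of that sequence is sketched below, with the failure mode of each step noted in comments. It assumes injected `mint_app_jwt` and `http_post` callables (both hypothetical) so that no network or signing library is needed, and its split of status codes into transient and permanent failures is an assumption of the sketch, not GitHub's documented contract.

```python
from datetime import datetime, timezone
from typing import Callable

API_BASE = "https://api.github.com"


class PermanentFailure(Exception):
    """The exchange was rejected; retrying with the same inputs will not help."""


class TransientFailure(Exception):
    """The exchange may succeed if retried after a delay."""


def exchange_for_installation_token(
    installation_id: int,
    mint_app_jwt: Callable[[], str],
    http_post: Callable[[str, dict], tuple],
) -> dict:
    # Step 1: mint a fresh app JWT. Failure mode: clock skew or a wrong key
    # yields a JWT that is rejected in step 3.
    app_jwt = mint_app_jwt()

    # Step 2: call the exchange endpoint for this one installation.
    url = f"{API_BASE}/app/installations/{installation_id}/access_tokens"
    headers = {
        "Authorization": f"Bearer {app_jwt}",
        "Accept": "application/vnd.github+json",
    }
    status, body = http_post(url, headers)

    # Step 3: classify the outcome (this mapping is an assumption of the sketch).
    if status == 429 or status >= 500:
        raise TransientFailure(f"status {status}")   # back off and retry
    if status >= 400:
        raise PermanentFailure(f"status {status}")   # bad JWT or install gone

    # Step 4: parse the expiry. Failure mode: storing the raw string and
    # never comparing it against the clock.
    expires_at = datetime.strptime(body["expires_at"], "%Y-%m-%dT%H:%M:%SZ")
    expires_at = expires_at.replace(tzinfo=timezone.utc)

    # Step 5: return the full record for caching, not just the token string.
    return {
        "installation_id": installation_id,
        "token": body["token"],
        "expires_at": expires_at,
        "permissions": body.get("permissions", {}),
    }


def fake_post(url, headers):
    """Hypothetical transport that returns a canned success response."""
    return 201, {"token": "ghs_example", "expires_at": "2026-06-20T12:10:00Z",
                 "permissions": {"contents": "read"}}


record = exchange_for_installation_token(987, lambda: "app.jwt.placeholder", fake_post)
print(record["installation_id"], record["expires_at"].isoformat())
```

Injecting the transport keeps the control flow testable; in production the same function would wrap a real HTTP client and a real JWT signer.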
Pros and cons: the three patterns
| Pattern | Token volume | Blast radius on leak | On-behalf-of clarity | Revocation | Complexity |
|---|---|---|---|---|---|
| (A) GitHub App — per-install token | 1 token per active install per hour; 10,000 installs = ~10,000 tokens in rotation | Low — one install's data only; 1-hr TTL auto-expires | Clear — the token is explicitly scoped to one installation's permissions | Platform invalidates immediately on uninstall; webhook notifies app | High — background refresh loop, cache, JWT signing, revocation listener |
| (B) User OAuth token | 1 token per active user session; may overlap with install tokens | Low — one user's accessible resources only | Very clear — actions attributed to the user in audit logs | User can revoke via platform settings; refresh token approach controls session length | Medium — standard OAuth flow; refresh token management |
| (C) Stripe Connect — platform key + header | One credential (platform secret key) for all accounts; zero per-account tokens | Very high — a leaked platform key compromises every connected account | Less clear — requests appear as the platform, not a distinct per-account token | Connected account revokes platform access; platform key itself revoked centrally | Low — no token exchange, no refresh cycle, just a header |
How real platforms do it
GitHub Apps pioneered the two-tier JWT model described in Pattern A. App JWTs are strictly 10 minutes; installation tokens are 1 hour. GitHub surfaces the per-install permission set in the token response and fires installation.deleted webhooks on revocation. See GitHub's JWT generation docs and authenticating as an installation.
Stripe Connect offers two modes: the platform-key-plus-header model for tight platform control, and an OAuth-based model where each connected account issues the platform an access token with explicit scopes — useful when you need independently revocable per-account credentials rather than a single platform key. See Stripe Connect authentication docs.
Slack apps use workspace-scoped OAuth tokens (Pattern B style) — one token per workspace installation. Slack's token rotation feature lets apps cycle tokens without downtime. The app_uninstalled event fires when a workspace removes the app. See Slack OAuth V2 docs.
Google Workspace Marketplace apps use service account credentials with domain-wide delegation — the service account can impersonate any user in the domain, making it conceptually similar to Pattern C (high blast radius, zero per-user token overhead). See Google Workspace Marketplace auth guide.
By the numbers: managing tokens for 10,000 installs
Installation tokens expire after 1 hour. To keep all 10,000 installs operational, your service must continuously refresh tokens as they approach expiry.
Steady-state refresh rate (modeled):
Storage: Redis cache for all tokens (modeled):
The cold-start thundering-herd problem (modeled):
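A rough back-of-envelope model of these three quantities can be written as a few lines of arithmetic. The install count and token lifetime come from this lesson; the per-record size and the warm-up window are assumptions chosen purely for illustration.

```python
import random

INSTALLS = 10_000
TOKEN_TTL_S = 3_600
BYTES_PER_RECORD = 500    # ASSUMED size of one cached JSON record
WARMUP_WINDOW_S = 300     # ASSUMED window to spread cold-start exchanges over

# Steady state: every install refreshes about once per token lifetime.
steady_rate = INSTALLS / TOKEN_TTL_S
print(f"steady-state refreshes/s: {steady_rate:.2f}")

# Storage: one small record per install.
cache_mb = INSTALLS * BYTES_PER_RECORD / 1_000_000
print(f"cache size: {cache_mb:.1f} MB")

# Cold start without jitter: all exchanges land in the same instant.
# With jitter: each install gets a random delay inside the warm-up window.
rng = random.Random(0)  # seeded only to make the sketch repeatable
delays = [rng.uniform(0, WARMUP_WINDOW_S) for _ in range(INSTALLS)]
jittered_rate = INSTALLS / WARMUP_WINDOW_S
print(f"average cold-start exchanges/s with jitter: {jittered_rate:.1f}")
```

Under these assumptions the steady-state load is under 3 requests per second and the cache fits in a few megabytes; the only dangerous moment is the unjittered cold start, where the whole hour's worth of exchanges would arrive at once.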
Decision math — refresh buffer tradeoff: A small buffer (30 s before expiry) minimizes wasted refreshes on long-idle installs, but leaves a tight window for retry if the refresh fails. A large buffer (300 s) wastes ~8% of token lifetime on preemptive refreshes, but gives 5 retry attempts at 60-second intervals before the token actually expires. For most systems, 120–300 s is the right range. Treat these numbers as heuristics and tune them to your observed p99 refresh latency.
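The tradeoff reduces to two ratios, which the following sketch computes for the buffer sizes mentioned above. `buffer_tradeoff` is a hypothetical helper; the 60-second retry interval is the one used in the paragraph above.

```python
def buffer_tradeoff(buffer_s: int, ttl_s: int = 3600, retry_interval_s: int = 60):
    """Return (fraction of lifetime given up, retry attempts that fit in the buffer)."""
    wasted_fraction = buffer_s / ttl_s
    retry_attempts = buffer_s // retry_interval_s
    return wasted_fraction, retry_attempts


for buffer_s in (30, 120, 300):
    wasted, retries = buffer_tradeoff(buffer_s)
    print(f"buffer={buffer_s:>3}s  wasted={wasted:.1%}  retries={retries}")
```

With a 30 s buffer there is no room for even one retry at a 60 s interval, which is why the small-buffer option is fragile.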
"Describe how a GitHub App authenticates on behalf of an installation." The complete answer has two tiers: the app JWT (RS256, 10-min, proves app identity) and the installation access token (1-hr, per-install scope, retrieved by exchanging the app JWT). Then cover operations: cache the installation token, refresh before expiry, listen for revocation webhooks, handle cold-start thundering herd with jitter. Candidates who only describe OAuth miss the two-tier structure that makes GitHub Apps different from user-OAuth delegation.
Checking token existence in cache, not token expiry. An installation token may be in your Redis cache but 59 minutes old. On the next request you serve it; it expires 60 seconds later mid-flight. Always store expires_at alongside the token and check expires_at - now() > buffer before serving the cached value. A missing expiry check turns a 3,600-second lifetime into an "occasionally fails mysteriously" bug that's hard to reproduce in staging.
Store the installation_id alongside the token, always. The token is what you use to call the API. The installation_id is what you use to refresh the token when it expires. If your cache stores only the token string (no installation_id, no expires_at), you have no way to refresh — you must re-run the entire install flow to get a new token. Store the full object: {token, expires_at, installation_id, permissions}. The extra 50 bytes per install is trivial; the operational capability is not.
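Both points, checking expiry rather than mere existence and storing the full record, can be sketched together. The names `CachedInstallToken` and `token_or_none` are hypothetical, and the 120 s default buffer is only one value from the heuristic range discussed earlier.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class CachedInstallToken:
    installation_id: int   # needed to request a replacement token
    token: str
    expires_at: datetime   # timezone-aware UTC expiry from the token response
    permissions: dict = field(default_factory=dict)

    def usable(self, now: datetime, buffer: timedelta) -> bool:
        """True only if the token outlives `now` by more than the buffer."""
        return self.expires_at - now > buffer


def token_or_none(entry: Optional[CachedInstallToken], now: datetime,
                  buffer: timedelta = timedelta(seconds=120)) -> Optional[str]:
    """Serve the cached token only when it exists AND has enough life left."""
    if entry is None or not entry.usable(now, buffer):
        return None  # caller should refresh before calling the API
    return entry.token


expiry = datetime(2026, 6, 20, 12, 10, 0, tzinfo=timezone.utc)
entry = CachedInstallToken(987, "ghs_example", expiry)

near_expiry = token_or_none(entry, expiry - timedelta(seconds=60))
plenty_left = token_or_none(entry, expiry - timedelta(minutes=30))
print(near_expiry, plenty_left)
```

The token with 60 seconds left is present in the cache yet is not served, which is the behavior an existence-only check gets wrong.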
🧠 Quick check
1. In the GitHub App model, the app JWT is signed with:
App JWTs are RS256-signed with the app's private RSA key, which is generated in the app's GitHub settings and downloaded by you. GitHub verifies the signature using the corresponding public key stored in the app configuration. The private key never leaves your server; this asymmetry is what allows GitHub to verify without holding a shared secret.
2. In the Stripe Connect model, how does your platform act on behalf of a connected account?
Stripe Connect uses your platform's own secret key for authentication; the Stripe-Account header tells Stripe which connected account's data and settings to operate on. No per-account token exchange is required — the trade-off is that the platform key carries very high blast radius if leaked.
3. A GitHub App installation access token is valid for:
Installation tokens have a 1-hour lifetime. The app JWT used to obtain them expires after 10 minutes. The two-tier design means a leaked installation token auto-expires after at most 1 hour, and a leaked app JWT is useless after 10 minutes — both intentionally short to bound the blast radius of a credential leak.
4. Your service restarts cold with 10,000 installations in a database but an empty token cache. What is the safest approach to warming the cache?
Fetching all tokens immediately creates a thundering herd: 10,000 simultaneous requests against GitHub's rate-limited token endpoint. Lazy refresh with startup jitter (random delay per install) spreads the load across the first several minutes of operation. Pre-generating tokens 24 hours in advance would produce expired tokens before the deploy window. A single shared token would be scoped to only one installation.
✍️ Exercise: design the token management system for a GitHub App with 5,000 installs
You are building a data-sync service that uses a GitHub App to read repository metadata across 5,000 customer organizations. The service runs on 4 application servers and syncs data every 15 minutes per installation.
Design the complete token caching and refresh strategy. Address: cache key schema, TTL, how refresh is triggered, what happens on cold start, how multiple servers avoid refreshing the same token simultaneously, and what happens when a refresh fails.
Model answer:
- Cache: Redis HASH or STRING per install: key = token:install:{installation_id}, value = JSON {"token": "ghs_xxx", "expires_at": "...", "permissions": {...}}, TTL = 3,500 s (100 s buffer).
- Refresh trigger: Before using a cached token, check expires_at - now() < 120 s. If true, refresh. Background sweeper thread wakes every 60 s and refreshes any token expiring within 5 minutes.
- Cold start: On startup, do NOT fetch all 5,000 tokens immediately. Use lazy load: fetch the token only when the first sync job for that installation fires. Sync jobs are staggered by design (different customers at different offsets), so the cold-start load naturally spreads over the first 15-minute sync cycle.
- Per-install locking: Use a Redis distributed lock (SET token:lock:{id} 1 NX PX 30000) before refreshing. If the lock is held, another server is already refreshing; wait and then re-read the cache. This prevents 4 servers from all refreshing the same token simultaneously, consuming 4× the rate-limit budget.
- Failure handling: If a refresh returns 401 (installation revoked), mark the install as revoked in the database, emit an alert, and skip that installation in future sync runs. If it returns a transient 5xx, retry with exponential backoff (1 s, 2 s, 4 s) before failing the sync job for that cycle.
Rubric: Full marks for covering (a) TTL with expiry buffer, (b) check-before-use expiry check, (c) background refresh sweep, (d) cold-start lazy loading (not thundering herd), (e) per-install distributed locking, (f) 401 revocation handling vs transient 5xx retry.
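The per-install locking step of the model answer can be sketched as follows. `FakeRedis` is a hypothetical in-memory stand-in that mimics only the SET ... NX PX behavior; `refresh_once` would accept any client whose `set` method takes `nx` and `px` keyword arguments in the same way. Releasing the lock with a plain delete is a simplification: a production lock should confirm ownership before deleting.

```python
import time


class FakeRedis:
    """Hypothetical in-memory stand-in exposing only set and delete."""

    def __init__(self):
        self._data = {}  # key -> (value, monotonic deadline or None)

    def set(self, key, value, nx=False, px=None):
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None  # the previous value has expired
        if nx and current is not None:
            return None     # NX: refuse to overwrite a live key
        deadline = now + px / 1000 if px is not None else None
        self._data[key] = (value, deadline)
        return True

    def delete(self, key):
        self._data.pop(key, None)


def refresh_once(client, installation_id, do_refresh):
    """Run do_refresh only if this caller wins the per-install lock."""
    lock_key = f"token:lock:{installation_id}"
    if not client.set(lock_key, "1", nx=True, px=30_000):
        return False  # another server holds the lock; re-read the cache instead
    try:
        do_refresh(installation_id)
        return True
    finally:
        client.delete(lock_key)  # simplification: no ownership check


client = FakeRedis()
events = []


def refresh_and_probe(installation_id):
    events.append(installation_id)
    # While the first caller is refreshing, a second caller is turned away.
    events.append(refresh_once(client, installation_id, events.append))


first = refresh_once(client, 987, refresh_and_probe)
print(first, events)
```

The 30-second lock expiry acts as a safety net: if the winning server crashes mid-refresh, the lock frees itself and another server can take over.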
Key takeaways
- The installed-app model differs from user OAuth because one app credential must operate across thousands of independent accounts, each with its own permission scope and revocation lifecycle — a 1:N problem, not 1:1.
- GitHub Apps use a short-lived app JWT (RS256, 10 min) to obtain per-install tokens (1 hr, scoped to that install's granted permissions). The two-tier design keeps blast radius small at both credential levels.
- Stripe Connect uses your platform secret key plus a Stripe-Account header — no per-account token exchange needed. The simplicity is real, but so is the risk: the platform key controls all connected accounts simultaneously.
- At 10,000 installs, steady-state token refresh generates roughly 3 refreshes/second — manageable. The danger is cold start: a cache-empty restart must not fire all 10,000 exchanges at once. Stagger with jitter.
- Revocation handling is a first-class concern: when a user uninstalls, purge the token immediately, stop all background jobs for that install, and persist the revocation event to prevent stale replays.