Shipping an OAuth-protected remote MCP server: the spec, 3 security bugs, and a Cloud Run gotcha

SkillDB already ran as a local MCP server — npx -p skilldb skilldb-mcp, the way most servers ship. It works, but it asks something of the user: install Node, run a command, manage an API key. The frictionless path everyone actually wants is the one Claude Desktop offers: Settings → Connectors → Add custom connector → paste a URL → sign in. No install, no key to copy around.
Getting there meant turning SkillDB into a remote, OAuth-protected MCP server. This post is the honest version of how that went — the parts of the spec you actually need, the three security bugs a review caught before launch, and the one infrastructure gotcha that broke the whole thing in production.
# What "OAuth for MCP" actually requires
The MCP authorization spec leans on a stack of OAuth RFCs, and a client like Claude Desktop won't connect unless you implement all of them. The minimum:
- A 401 challenge. When the MCP endpoint gets a request with no/invalid token, it must return 401 with a WWW-Authenticate: Bearer resource_metadata="…" header. That header is the entire trigger: it's how the client discovers that this server wants OAuth.
- Protected Resource Metadata (RFC 9728) at /.well-known/oauth-protected-resource, which declares the resource identifier and which authorization server to use.
- Authorization Server Metadata (RFC 8414) at /.well-known/oauth-authorization-server, which advertises the authorize/token/registration endpoints, that you require PKCE S256, and that you accept public clients.
- Dynamic Client Registration (RFC 7591). Claude has no pre-shared client_id, so it registers itself on the fly at a /register endpoint. Skip this and the flow dead-ends immediately.
- Authorization code + PKCE (S256, no plain), a consent screen, and a token endpoint with refresh-token rotation.
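As a rough illustration of the discovery pieces in the list above, here is a minimal sketch of the two metadata documents and the 401 challenge as plain objects. The field names come from RFC 9728 and RFC 8414; the origin `https://mcp.example.com` and the `/oauth/token` path are placeholders assumed for the example, not SkillDB's actual values.

```typescript
// Placeholder origin, assumed for illustration only.
const CANONICAL_ORIGIN = "https://mcp.example.com";

// RFC 9728: served at /.well-known/oauth-protected-resource
function protectedResourceMetadata(resource: string = CANONICAL_ORIGIN) {
  return {
    resource,
    authorization_servers: [CANONICAL_ORIGIN],
    bearer_methods_supported: ["header"],
  };
}

// RFC 8414: served at /.well-known/oauth-authorization-server
function authorizationServerMetadata(issuer: string = CANONICAL_ORIGIN) {
  return {
    issuer,
    authorization_endpoint: `${issuer}/oauth/authorize`,
    token_endpoint: `${issuer}/oauth/token`, // assumed path
    registration_endpoint: `${issuer}/register`,
    response_types_supported: ["code"],
    grant_types_supported: ["authorization_code", "refresh_token"],
    code_challenge_methods_supported: ["S256"], // no "plain"
    token_endpoint_auth_methods_supported: ["none"], // public clients
  };
}

// The 401 that tells the client where to start discovery.
function unauthorizedChallenge(resource: string = CANONICAL_ORIGIN) {
  const metadataUrl = `${resource}/.well-known/oauth-protected-resource`;
  return {
    status: 401,
    headers: { "WWW-Authenticate": `Bearer resource_metadata="${metadataUrl}"` },
  };
}
```

Keeping the origin in one constant is one way to make the resource identifier, the issuer, and the challenge URL agree with each other.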
The nice part: SkillDB already owns its users (Firebase auth + plans). So instead of bolting on a third-party identity provider, SkillDB became its own authorization server. The OAuth login is just the existing SkillDB sign-in; the token maps back to the user's plan so paid users get full skill content.
# The architecture decision that mattered most
The remote server's job is: validate the OAuth token → figure out who the user is → return skill content gated by their plan. The tempting shortcut is to have the MCP server forward an internal "trust me, this is user X on the Pro plan" token to your existing content API.
Don't. We'll come back to why in the security section — but the decision we landed on is that the MCP server loads content in-process, calling the same gating functions the public API uses, with the user resolved from the verified token. No internal grant crosses a network boundary, and the user's real API key is never minted, returned, or logged.
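To make the in-process shape concrete, here is a small sketch under stated assumptions: the `Plan` values, the `Skill` fields, and the function names `gateSkill` and `handleGetSkill` are all hypothetical, invented for this example rather than taken from SkillDB's code. The point is only that the tool handler receives a user resolved from the verified token and calls the gating function directly.

```typescript
// All names below are hypothetical, for illustration only.
type Plan = "free" | "pro";

interface User { id: string; plan: Plan }
interface Skill { id: string; title: string; summary: string; content: string }
interface SkillView { id: string; title: string; summary: string; content?: string }

// One gating function, callable from both the public API and the MCP handler.
function gateSkill(skill: Skill, user: User): SkillView {
  const preview: SkillView = { id: skill.id, title: skill.title, summary: skill.summary };
  return user.plan === "pro" ? { ...preview, content: skill.content } : preview;
}

// The MCP tool handler gets the user from the verified token, never from
// tool arguments, and loads content without any network hop.
function handleGetSkill(
  verifiedUser: User,
  skillId: string,
  skills: ReadonlyMap<string, Skill>,
): SkillView | null {
  const skill = skills.get(skillId);
  return skill ? gateSkill(skill, verifiedUser) : null;
}
```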
# The security review caught three account-takeover bugs
Before any of this shipped, we ran an adversarial security pass over the design. The verdict on the first draft was blunt: not safe to implement as-is. Three critical findings, all of which would have let one user reach another user's data:
1. The ID-token check was forgeable. The consent step verified the user's Firebase ID token with a bare verifyIdToken(token). That doesn't check revocation, so a signed-out or disabled user's cached token would still mint a valid authorization code. The fix is one argument: verifyIdToken(token, /* checkRevoked */ true), plus asserting the token's audience is your project so a token minted for a different app can't be substituted. The entire consent flow rests on this one call; it has to be airtight.
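A minimal sketch of that check might look like the following. To keep it runnable without the SDK, it declares a small `IdTokenVerifier` interface that mirrors the shape of the Firebase Admin `verifyIdToken(idToken, checkRevoked)` method; the project id and the function name `verifyConsentUser` are assumptions made for the example.

```typescript
// Structural stand-in for the one Firebase Admin Auth method used here.
// With the real SDK, the Auth instance would be passed in instead.
interface IdTokenVerifier {
  verifyIdToken(
    idToken: string,
    checkRevoked?: boolean,
  ): Promise<{ uid: string; aud: string }>;
}

const EXPECTED_PROJECT_ID = "my-firebase-project"; // hypothetical project id

// Returns the user id, or null if the token must not be trusted.
async function verifyConsentUser(
  auth: IdTokenVerifier,
  idToken: string,
  projectId: string = EXPECTED_PROJECT_ID,
): Promise<string | null> {
  try {
    const decoded = await auth.verifyIdToken(idToken, /* checkRevoked */ true);
    // Reject tokens minted for a different app.
    if (decoded.aud !== projectId) return null;
    return decoded.uid;
  } catch {
    // Invalid, expired, or revoked token, or a disabled user.
    return null;
  }
}
```

Returning `null` for every failure keeps the consent handler's decision binary: either there is a verified user id or no authorization code is minted.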
2. The "internal grant" was a confused deputy. This is the shortcut from the architecture section. The first design had the MCP server mint a symmetric (HS256) "internal" JWT and send it to the public content API to unlock full content for a given user id. The problem: that API is internet-facing, and a symmetric secret means the same key signs and verifies. One leaked secret, and anyone could forge a grant naming any user id and read that user's private content. The fix was to delete the primitive entirely: load content in-process so there's no forgeable token and no network hop to attack.
3. Open registration + a hand-waved consent screen = code theft. Dynamic Client Registration is necessarily open (that's how Claude registers). Combined with a weak consent step, an attacker could register their own redirect URI, lure a logged-in user to the consent page, and harvest the resulting authorization code. The fixes: a same-origin check on the consent POST, showing the user the stored client name and redirect host (never values echoed from the query string), and exact-match redirect validation with exact-hostname loopback matching — because startsWith('localhost') happily matches localhost.attacker.com.
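The redirect check can be sketched roughly as follows. The function name `isAllowedRedirect` is hypothetical, and letting the port vary for loopback redirects is an assumption borrowed from RFC 8252 for native clients; the source only says that loopback hostnames are matched exactly.

```typescript
const LOOPBACK_HOSTS = new Set(["localhost", "127.0.0.1", "[::1]"]);

function parseUrl(value: string): URL | null {
  try {
    return new URL(value);
  } catch {
    return null;
  }
}

// True only if `requested` is one of the client's stored redirect URIs.
function isAllowedRedirect(requested: string, registered: readonly string[]): boolean {
  // Default rule: exact string match against what was registered.
  if (registered.includes(requested)) return true;

  // Loopback exception: the hostname must equal a loopback name exactly.
  const req = parseUrl(requested);
  if (!req || req.protocol !== "http:" || !LOOPBACK_HOSTS.has(req.hostname)) {
    return false;
  }
  // Only the port may differ from the registered loopback URI (assumption).
  return registered.some((entry) => {
    const reg = parseUrl(entry);
    return (
      reg !== null &&
      reg.protocol === "http:" &&
      reg.hostname === req.hostname &&
      reg.pathname === req.pathname &&
      reg.search === req.search
    );
  });
}

// Why a prefix test is not enough: this hostname passes startsWith("localhost").
const lookalike = new URL("http://localhost.attacker.com/callback").hostname;
const prefixCheckFooled = lookalike.startsWith("localhost"); // true
```

Comparing the parsed `hostname` against a fixed set, rather than testing a prefix of the raw string, is what rules out the lookalike host.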
Round those out with the table stakes — PKCE S256 required, authorization codes single-use and hashed at rest with a 10-minute TTL, refresh tokens rotated on every use with the whole family revoked on replay, one pinned canonical resource identifier across the PRM, AS metadata, and token audience — and the design went from "exploitable" to shippable.
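Two of those table stakes, PKCE S256 verification and single-use hashed authorization codes with a 10-minute TTL, can be sketched like this. The in-memory `Map` stands in for a real datastore and the function names are hypothetical; refresh-token rotation is not covered here.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

const CODE_TTL_MS = 10 * 60 * 1000; // 10-minute TTL

const sha256Base64Url = (value: string): string =>
  createHash("sha256").update(value).digest("base64url");

// PKCE S256 (RFC 7636): challenge = BASE64URL(SHA256(verifier)).
function verifyPkceS256(verifier: string, challenge: string): boolean {
  const expected = Buffer.from(sha256Base64Url(verifier));
  const actual = Buffer.from(challenge);
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

interface StoredCode { userId: string; challenge: string; expiresAt: number }

// Keyed by the hash of the code, so the raw code is never stored.
const codeStore = new Map<string, StoredCode>();

function issueCode(userId: string, challenge: string, now: number = Date.now()): string {
  const code = randomBytes(32).toString("base64url");
  codeStore.set(sha256Base64Url(code), { userId, challenge, expiresAt: now + CODE_TTL_MS });
  return code;
}

// Returns the user id on success, or null. The code is consumed either way.
function redeemCode(code: string, verifier: string, now: number = Date.now()): string | null {
  const key = sha256Base64Url(code);
  const stored = codeStore.get(key);
  if (!stored) return null;
  codeStore.delete(key); // single use, even if the checks below fail
  if (now > stored.expiresAt) return null;
  if (!verifyPkceS256(verifier, stored.challenge)) return null;
  return stored.userId;
}
```

Deleting the record before running the remaining checks means a failed redemption also burns the code, which is the stricter reading of "single-use".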
# The Cloud Run gotcha that broke it in production
With all of that done and deployed, the first real Claude Desktop connection failed with:
This site can't be reached
https://0.0.0.0:8080/oauth/authorize?... ERR_ADDRESS_INVALID
The authorize endpoint validated the request and then redirected the browser to the consent page using the framework's request.nextUrl.origin. On Cloud Run, that resolves to the internal container bind address — http://0.0.0.0:8080 — not the public host. So the browser got sent to a dead address.
The fix is to derive the public origin from the proxy's x-forwarded-host (with a guard against 0.0.0.0/localhost and a sane fallback) for any user-facing redirect. The exact same bug then bit a second time: the consent endpoint's CSRF check compared the request Origin against nextUrl.host (again 0.0.0.0:8080), so every "Approve" returned 403 cross-origin until it compared against the public host instead. If you run behind a reverse proxy, audit every place you build a URL or compare a host from request internals.
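One possible shape for that helper, as a sketch rather than the actual fix: the names `publicOrigin` and `isSameOrigin`, the fallback origin, and the use of `x-forwarded-proto` are assumptions for the example. It also assumes the proxy in front of the app overwrites the forwarded headers, since a client-supplied `x-forwarded-host` must not be trusted.

```typescript
// Minimal header accessor; the standard Headers class satisfies it.
type HeaderReader = { get(name: string): string | null };

// Hypothetical configured public origin, used when headers are unusable.
const FALLBACK_ORIGIN = "https://mcp.example.com";

// Hosts that indicate a container bind address rather than the public host.
const INTERNAL_HOSTS = new Set(["0.0.0.0", "localhost", "127.0.0.1", "[::1]"]);

function firstValue(header: string | null): string | undefined {
  return header?.split(",")[0]?.trim() || undefined;
}

// Derive the public origin for user-facing redirects and host comparisons.
function publicOrigin(headers: HeaderReader, fallback: string = FALLBACK_ORIGIN): string {
  const host = firstValue(headers.get("x-forwarded-host"));
  if (!host) return fallback;
  const proto = firstValue(headers.get("x-forwarded-proto")) ?? "https";
  if (proto !== "https" && proto !== "http") return fallback;
  let url: URL;
  try {
    url = new URL(`${proto}://${host}`);
  } catch {
    return fallback;
  }
  return INTERNAL_HOSTS.has(url.hostname) ? fallback : url.origin;
}

// CSRF check for the consent POST: compare Origin to the public origin,
// not to the framework's internal view of the request host.
function isSameOrigin(headers: HeaderReader, fallback: string = FALLBACK_ORIGIN): boolean {
  const origin = headers.get("origin");
  return origin !== null && origin === publicOrigin(headers, fallback);
}
```

Routing both the redirect and the CSRF comparison through the same function means the two code paths cannot drift apart the way they did here.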
# The payoff
Fixed, deployed, and verified end-to-end in Claude Desktop: paste the connector URL, sign in on the SkillDB consent screen, approve, and the agent calls skilldb_search → skilldb_get and gets the full skill content — not a metadata preview. Same model, sharper output, zero local install.
If you want to try it: add https://mcp.skilldb.dev as a custom connector in Claude Desktop, or use the local server with npx -p skilldb skilldb-mcp. And if you're building your own OAuth-protected MCP server, the lesson worth stealing is the boring one: write the discovery metadata exactly to spec, get the ID-token verification and the consent screen reviewed by someone trying to break them, and never trust a host or origin you read out of request internals behind a proxy.