· 9 min read · API Stronghold Team

MCP Servers Don't Need Long-Lived API Keys


Every MCP server you deploy with a hardcoded API key is a breach waiting for a timestamp.

Long-lived credentials in MCP servers aren’t a configuration choice. They’re a liability. The pattern that eliminates them has five steps, uses real endpoints, and leaves zero long-lived secrets in your runtime environment. That’s what this post covers.

Why long-lived keys in MCP servers are a different problem

Web apps with hardcoded credentials are bad. MCP servers with hardcoded credentials are worse, for a few specific reasons.

MCP servers are multi-tenant by design. One leaked key doesn’t just compromise one client. It compromises every client the server handles. The blast radius isn’t “someone read our database.” It’s “someone can take every action every connected agent is authorized to take.”

That distinction matters because MCP servers broker credentials to AI agents that act autonomously. A web app with a leaked Stripe key is bad. An AI agent with a leaked Stripe key can refund customers, modify subscription plans, and create payouts without any human in the loop. The credential exposure and the action surface are directly coupled.

There’s also an auth gap that’s specific to the AI stack. If you haven’t read about it yet, the MCP auth gap post covers it in detail. The short version: most MCP deployments have no credential boundary between the model, the tools, and the upstream APIs those tools call. A long-lived key collapses that boundary entirely.

The fix isn’t “rotate your keys more often.” Rotation helps, but it doesn’t change the fundamental architecture. What changes the architecture is making sure the MCP server never holds a real credential at runtime in the first place.

The pattern: register, scope, issue, validate, expire

Each step in the pattern has one owner and produces one artifact. The MCP server never sees your real API key. It receives a session-scoped token that maps to the real credential inside a vault.

Here’s the flow:

 Platform Engineer           Vault / Broker              MCP Server         Upstream API
       |                          |                           |                    |
       |-- POST /agents --------->|                           |                    |
       |   (register agent)       |                           |                    |
       |                          |                           |                    |
       |-- POST /agents/:id/scopes|                           |                    |
       |   (declare scopes)       |                           |                    |
       |                          |                           |                    |
       |           Agent starts a session                     |                    |
       |                          |<-- POST /sessions --------|                    |
       |                          |    (request token)        |                    |
       |                          |-- token (15min TTL) ----->|                    |
       |                          |                           |                    |
       |                          |          per-request      |                    |
       |                          |<-- POST /validate --------|                    |
       |                          |-- valid + scope match --->|                    |
       |                          |                           |- call w/ real key->|
       |                          |                           |                    |
       |                          |       token expires       |                    |
       |                          |<-- POST /sessions/refresh-|                    |
       |                          |-- new token (15min TTL) ->|                    |

The vault holds the real key. The MCP server holds a token. When the token hits the validator, the vault makes the upstream call with the real credential. Your agent never touches the actual key.

Expiration is non-negotiable. A non-expiring token is functionally equivalent to a long-lived key. The whole point of the pattern collapses if tokens can sit valid indefinitely.

Step 1 and 2: register your agent and declare its scope

Before any token gets issued, the agent needs to exist in the system and have declared what it’s allowed to do.

Register the agent

curl -X POST https://vault.example.com/agents \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "claude-billing-agent",
    "owner": "platform-team",
    "environment": "production",
    "description": "Handles Stripe billing queries via MCP"
  }'

Three fields matter here and shouldn’t be skipped. agent_id is the stable identifier you’ll reference in every subsequent call. owner is the team accountable for this agent’s actions. That’s important for audit trails. environment lets you issue different scopes for prod vs staging without maintaining separate agent registrations.

Declare scopes

This is where most teams get it wrong. Scopes should map to tool calls, not to API endpoints. The difference is significant.

A bad scope:

{
  "scopes": ["read:*"]
}

This grants read access to everything the underlying credential can read. An agent querying invoices gets the same access as one querying customer PII. That’s not least privilege. That’s just a slightly reduced blast radius.

A good scope:

{
  "scopes": ["read:stripe:invoices", "read:stripe:customers:id-only"]
}

Now the token issued for this agent can read invoices and customer IDs. Nothing else. If the agent gets compromised or starts behaving unexpectedly, the damage is bounded before you’ve even looked at the logs.

curl -X POST https://vault.example.com/agents/claude-billing-agent/scopes \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "scopes": ["read:stripe:invoices", "read:stripe:customers:id-only"],
    "max_ttl_seconds": 900
  }'

The max_ttl_seconds field caps how long any token issued for this agent can be valid. Set it here, not in the application code where it’s easy to forget.

Step 3: issue a session-scoped token (not a key)

Token issuance happens at session start. When an agent begins a new task, it calls the vault to get a fresh token. The token isn’t a stripped-down version of your API key. It’s a completely separate credential that the vault understands but the upstream API never sees directly.

curl -X POST https://vault.example.com/sessions \
  -H "Authorization: Bearer $AGENT_SERVICE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "claude-billing-agent",
    "purpose": "quarterly-invoice-reconciliation",
    "requested_scopes": ["read:stripe:invoices"]
  }'

Response:

{
  "session_id": "sess_01JNKQ8X4F2MVBP7RD3WCZH9GT",
  "token": "svt_prod_7f3k9m2p...truncated",
  "scopes": ["read:stripe:invoices"],
  "issued_at": "2026-03-15T16:00:00Z",
  "expires_at": "2026-03-15T16:15:00Z",
  "ttl_seconds": 900
}

The token field is what the MCP server gets. The real Stripe API key (sk_live_...) never appears in this response. The vault holds it internally and uses it when it proxies the call.

This is the phantom token pattern. The token looks like a credential. It behaves like a credential from the MCP server’s perspective. But it has no value outside the vault’s validation context. If it leaks, it expires in 15 minutes and only grants access to one specific scope.

Why 15 minutes and not 24 hours? A 24-hour token gives an attacker a full work day to exploit a compromise. A 15-minute token gives them the window between now and when they get around to using it. Most automated exploits don’t move that fast. More importantly, your legitimate agent doesn’t need more than 15 minutes per request. If it does, the rotation mechanism (covered in Step 5) handles that cleanly.

Step 4: validate on every request (not just the first)

Session-open validation is not enough. Validating the token once when the session starts and then trusting it for the session’s lifetime has the same problem as a long-lived key: you’ve moved the vulnerability window, not eliminated it.

Every request through the MCP server should trigger a validation call.

import os

import httpx
from functools import wraps

VAULT_BASE = "https://vault.example.com"
VAULT_SERVICE_KEY = os.environ["VAULT_SERVICE_KEY"]  # the server's own credential to the vault

def error_response(status: int, message: str):
    # Placeholder: return whatever your framework maps to an HTTP error response
    return {"status": status, "error": message}

def require_valid_token(required_scope: str):
    def decorator(handler):
        @wraps(handler)
        async def wrapper(request, *args, **kwargs):
            token = request.headers.get("X-Session-Token")
            if not token:
                return error_response(401, "missing token")

            async with httpx.AsyncClient() as client:
                resp = await client.post(
                    f"{VAULT_BASE}/validate",
                    json={
                        "token": token,
                        "required_scope": required_scope,
                        "request_id": request.headers.get("X-Request-Id"),
                    },
                    headers={"Authorization": f"Bearer {VAULT_SERVICE_KEY}"},
                )

            if resp.status_code != 200:
                return error_response(401, "invalid or expired token")

            result = resp.json()
            if not result.get("scope_match"):
                return error_response(403, "insufficient scope")

            # Attach validation result for audit logging
            request.state.token_validation = result
            return await handler(request, *args, **kwargs)

        return wrapper
    return decorator


@require_valid_token("read:stripe:invoices")
async def handle_list_invoices(request):
    # Token validated. Vault will proxy the actual Stripe call.
    ...

The validation call checks three things: token signature (is this token from our vault?), expiry (has it passed expires_at?), and scope match (does this token cover the operation being requested?).

On every validation event, log it. Not just failures. Every validation call should produce an audit record with the token ID, the agent ID, the requested scope, and the result. That trail is what lets you reconstruct exactly what an agent did during an incident.
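What such a record might look like as a single JSON log line, with illustrative field names; the exact schema is up to your vault or logging pipeline:

```python
import json
import time

def audit_record(token_id: str, agent_id: str, requested_scope: str, result: str) -> str:
    """One structured log line per validation call, emitted on success and failure alike."""
    return json.dumps({
        "ts": time.time(),
        "token_id": token_id,
        "agent_id": agent_id,
        "requested_scope": requested_scope,
        "validation_result": result,
    })
```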

For revocation: the vault keeps a revocation list. If you need to invalidate a token mid-session, call DELETE /sessions/{session_id}. Every subsequent validation call for tokens from that session will return 401, even if the tokens haven’t technically expired yet.

Step 5: expire aggressively and rotate without downtime

Hard expiry means the token stops working at expires_at. No grace period. No “close enough.” If the token is expired, the request fails. The MCP server should treat this as a normal operational condition, not an error.

Soft expiry is a signal to rotate. Set your rotation threshold at roughly 80% of the TTL. For a 15-minute token, start the refresh process at the 12-minute mark. That gives you a 3-minute window to issue the new token before the old one stops working.

import time

class TokenManager:
    def __init__(self, session_id: str, token: str, expires_at: float, ttl_seconds: int = 900):
        self.session_id = session_id
        self.token = token
        self.expires_at = expires_at
        self.ttl_seconds = ttl_seconds

    def needs_refresh(self) -> bool:
        # Refresh once 80% of the TTL has elapsed (3 minutes left on a 15-minute token)
        remaining = self.expires_at - time.time()
        return remaining < self.ttl_seconds * 0.2

    async def refresh(self, vault_client) -> "TokenManager":
        resp = await vault_client.post(
            f"/sessions/{self.session_id}/refresh"
        )
        data = resp.json()
        return TokenManager(
            session_id=self.session_id,
            token=data["token"],
            expires_at=data["expires_at_unix"],
            ttl_seconds=data.get("ttl_seconds", self.ttl_seconds),
        )

The old token stays valid until it actually expires. You issue the new token before that happens. The agent’s in-flight requests complete with the old token, and new requests start using the new one. No downtime, no dropped requests.

When an expired token hits your validator, fail closed. Return 401. Do not attempt to refresh server-side. The agent is responsible for managing its token lifecycle. If it sends an expired token, it means the client-side rotation didn’t work, and the right response is a clean error that triggers a controlled re-auth.

Monitoring: alert on token reuse after expiry. That pattern means either a clock skew issue on one of your servers, or someone replaying captured tokens. Either way, you want to know about it. The audit logs from Step 4 give you the data. A simple query for validation_result == "expired" AND token_id IN (recently_valid_tokens) is enough to catch it.
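That query can be sketched as a single pass over the audit records from Step 4. The record shape mirrors the fields suggested there and is an assumption, not a fixed schema:

```python
def expired_reuse(records: list[dict]) -> set[str]:
    """Token IDs that validated successfully at least once and later came back expired.
    Matches on exact token_id, so clock skew or replay both surface here."""
    seen_valid = {r["token_id"] for r in records if r["validation_result"] == "ok"}
    return {
        r["token_id"]
        for r in records
        if r["validation_result"] == "expired" and r["token_id"] in seen_valid
    }
```

Anything this returns is worth an alert: either a server's clock has drifted past the expiry window, or someone is replaying a captured token.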

What this looks like in a real MCP deployment

A Claude agent using Stripe for billing lookups and GitHub for PR status checks. Two different upstream APIs, both accessed through an MCP server.

Before this pattern: the MCP server has two secrets in its environment, STRIPE_API_KEY=sk_live_... and GITHUB_TOKEN=ghp_.... Both long-lived. Both with broad permissions because the team “wanted to be safe.” One compromised container gives an attacker full Stripe access and full GitHub access for however long it takes someone to notice and rotate.

After this pattern:

  1. Register two agents: claude-stripe-agent and claude-github-agent (or one agent with two scope groups).
  2. Declare scopes: read:stripe:invoices, read:stripe:subscriptions, read:github:prs:status. No write scopes unless the agent actually needs them.
  3. At session start, issue tokens. The MCP server receives svt_... tokens with 15-minute TTLs.
  4. Per-request validation against the vault. Vault holds the real keys and proxies calls.
  5. Token rotation at the 80% mark. Hard expiry at 15 minutes.

Blast radius before: full Stripe account access, full GitHub org access, indefinitely.

Blast radius after: read access to invoices and PR status, for the next 15 minutes at most.

That’s the architecture difference. The scopes are tighter, the exposure window is shorter, and no real credential ever touches the MCP server’s filesystem or environment.


The pattern works whether you build the vault layer yourself or wire it into an existing secrets manager. But if you’re deploying MCP servers in production today and don’t want to build and maintain a credential broker from scratch, API Stronghold handles the register, scope, issue, validate, expire lifecycle out of the box.

API Stronghold is a zero-knowledge credential vault built for AI agents and MCP servers. Your agents get session-scoped phantom tokens. Your real API keys never leave the vault.

Start a 14-day free trial at https://www.apistronghold.com. No credit card required.
