AI Agent Security Audit: Before You Deploy

Before your AI agent ships to production, pull up its config and count the API keys you’ve handed it. Now ask: if it gets prompt-injected, which of those keys can an attacker use? If your answer isn’t “none of them,” your agent is holding credentials it shouldn’t.

Static secret scanning won’t catch this. Neither will a manual code review. The attack surface for agents is runtime: what keys live in the context window, which operations they grant, and how fast you can revoke them when something goes wrong.

This checklist covers the six things you need to verify before an agent touches a live environment. Most teams skip at least three of them. The ones they skip are usually the ones that matter most.

Why agents break the old security model

Traditional secret management assumes secrets live in code or config files. Scan the repo, rotate on a schedule, done. Agents break that assumption in a few ways.

Agents operate across multiple API surfaces at the same time. A single agent handling a customer support workflow might hold keys for your CRM, a payments API, a vector database, and an email provider, all in one runtime context. Each of those is a separate blast radius.

Runtime credential injection is the real problem. Your static secret scanner catches keys committed to git. It won’t catch keys injected into an agent’s system prompt at runtime, or surfaced through a tool call response. By the time those credentials exist, they’re in the LLM’s context window, not in your codebase.

The other thing that catches teams off guard: agents can be prompted to expose credentials they’ve been given. Not through a CVE. Through a carefully crafted user message, a malicious document the agent was asked to summarize, or poisoned content in a tool call response. If the credential is readable in context, a sufficiently motivated adversarial input can get it back out.

Audit item 1: Map every credential the agent can touch

Start with a full inventory. Before any other check, you need to know exactly what the agent has access to.

List every API key, token, secret, and credential passed to the agent at runtime. Include what’s in the system prompt, what’s injected via tool call context, and what’s available through environment variables. Treat all three separately because the exposure paths are different.

For each credential, note whether it grants read-only or write access. Flag anything that grants delete, admin, or billing access. Those warrant extra scrutiny.

The question that trips most teams up: which of these credentials can you revoke in under 60 seconds? If the answer isn’t “all of them,” that’s a problem to fix before deployment.

Audit item 2: Check credential scope against actual agent tasks

Over-permissioning is the most common issue, and it’s almost always unintentional.

It happens because developers grab credentials that already exist in the environment. A key that was originally scoped for a human admin workflow gets reused for an agent that only needs read access to one table. The agent ends up with far more access than it needs, and nobody notices until something goes wrong.

For each credential in your inventory, document the minimum permission set required for what the agent actually does. Then compare that against what the credential actually grants. The delta is your over-permissioning exposure.

Database credentials are the most frequent offender. Agents almost never need DDL access, but they frequently end up with credentials that have it. If your agent is reading from a Postgres table to answer user queries, it does not need CREATE, DROP, or ALTER permissions. Scope it down.

Same goes for cloud provider credentials. An agent that reads from S3 buckets does not need IAM admin access, even if that’s what the engineer’s own credentials happen to have.

Audit item 3: Calculate your blast radius

Blast radius is a more useful framing than “risk level” because it forces you to think concretely about worst-case outcomes instead of assigning vague severity scores.

For each credential in your inventory, answer one question: what’s the maximum damage if this credential is leaked or misused by a compromised agent? Write it down. Be specific. “Attacker can read all customer records” is a blast radius. “Attacker can read, write, and delete all customer records, billing history, and email logs” is a different blast radius.

Once you have individual blast radii, calculate the composite. If a single agent holds credentials across five services and each one’s worst case is bad, the combination is worse. An attacker who can pivot from a leaked key into additional credentials multiplies the damage.

Pay specific attention to credential chaining. An AWS IAM role that has SecretsManager read access isn’t just one credential; it’s the master key to however many secrets are stored there. An OAuth token with admin scope on a GitHub org can access every private repo. Map those dependency graphs.

Audit item 4: Test for prompt injection credential exfiltration

This is the test most teams skip, and it’s the one that gets people. Run it before you deploy.

The goal is to verify that an adversarial input cannot cause the agent to repeat credentials in its output. That covers user messages, but also every other input vector the agent processes: uploaded documents, web pages the agent fetches, tool call responses that include third-party content.

The basic test: inject a message into each input vector that asks the agent to repeat its system prompt, list environment variables, or output the credentials it has access to. A well-configured agent should refuse or respond that it doesn’t have access to that information. An agent with credentials readable in its context window will often comply.

Also check whether tool call outputs are logged anywhere downstream. If your observability stack is capturing full agent traces, those traces may include credential values that appeared in tool call responses. That’s a separate exposure path that doesn’t require any adversarial input at all.

Finally, check whether error messages expose credentials. Some agent frameworks include context in error outputs by default. If an API call fails because of an invalid key, the error might include the key itself. That’s worth a few minutes of review.

Audit item 5: Verify credential revocation is actually possible

Most teams don’t test this until they need it. By then, it’s too late to find out the process takes four hours and requires someone who’s on vacation.

For each credential in your inventory, document the steps to revoke it and your realistic estimate of how long full revocation takes. “Full revocation” means the credential is invalidated everywhere, including any caches, CDN edges, or downstream services that might have stored it.

Then test it. Actually revoke a test credential and confirm the agent fails gracefully. This tells you two things: the revocation works, and the agent’s error handling doesn’t do something worse when a credential suddenly becomes invalid (like retrying in a loop or exposing the failed credential in logs).

Identify any credentials in your inventory that don’t have a programmatic revocation path. Some legacy APIs issue API keys that can only be revoked through a support ticket or a web UI with no API. That’s a risk you need to explicitly accept or mitigate.

The last question: does everyone on your team know who to call at 2AM on a Sunday to revoke each credential? If the answer is “we’d have to figure it out,” that’s a gap in your incident response, not just your security posture.

Audit item 6: Audit runtime credential delivery

How credentials get into the agent’s context is where the fundamental architectural choice sits.

Environment variables in containerized agents are readable if the LLM is prompted to inspect the environment. That’s not theoretical; it’s been demonstrated repeatedly. Passing credentials as environment variables and then assuming the agent can’t access them is not a security control.

System prompt injection is the most common pattern and the most dangerous. If a credential is in the system prompt as a readable string, it’s in every message the agent processes. Every request carries it. Every log entry might capture it. The credential has gone from “stored securely” to “embedded in every conversation.”

The safer pattern is runtime credential injection via a proxy. The agent requests access to a downstream service; the proxy validates the request, performs the actual API call, and returns results without the agent ever seeing the raw credential. The agent never holds a key it could leak.

Session-scoped tokens are significantly safer than long-lived keys. A token that expires after a single task or a short session limits the window an attacker has to use a leaked credential. If your agent gets a fresh token per task, a leaked token from task 47 is useless by the time anyone acts on it.

Audit your agent's blast radius before it ships.

API Stronghold scopes credentials per agent, calculates blast radius automatically, and issues session-scoped tokens so your agents never hold raw API keys in context.

Start Free — No Credit Card Required See how it works

No credit card required

The audit checklist

Run through this before any agent ships to production:

Credential inventory complete. Every key the agent can touch is documented, with injection method noted.
Permissions scoped to minimum required. Compared actual grants against minimum needed, and reduced where possible.
Blast radius calculated. Individual and composite worst-case documented for each credential.
Prompt injection exfiltration test passed. Ran adversarial inputs against all input vectors.
Error message audit complete. Confirmed error outputs don’t expose credential values.
Tool call trace audit complete. Confirmed observability stack isn’t logging raw credentials.
Revocation process tested and documented. Ran a test revocation and confirmed graceful failure.
Runtime delivery reviewed. Agent does not hold raw long-lived credentials in readable context.
On-call list exists. Someone knows who to wake up for each credential if needed.

Most teams will get most of this checklist wrong on their first pass. That’s not a judgment; the tooling for agent credential security is still catching up to the threat model. The frameworks make it easy to inject credentials into system prompts and hard to do the safer thing. That gap is exactly what API Stronghold was built to close.

Run this audit. Then fix what it finds.

Scoped credentials, blast radius reporting, and phantom token delivery. Your agents get access; you keep control. Zero-knowledge vault, session-scoped tokens, one-call revocation.

Start Free — No Credit Card Required See the Phantom Token Pattern

No credit card required

AI Agent Security Audit: Before You Deploy

Why agents break the old security model

Audit item 1: Map every credential the agent can touch

Audit item 2: Check credential scope against actual agent tasks

Audit item 3: Calculate your blast radius

Audit item 4: Test for prompt injection credential exfiltration

Audit item 5: Verify credential revocation is actually possible

Audit item 6: Audit runtime credential delivery

Audit your agent's blast radius before it ships.

The audit checklist

Run this audit. Then fix what it finds.

Keep your API keys out of agent context

Get posts like this in your inbox

One vault for all your API keys

AI Agent Security Audit: Before You Deploy

Why agents break the old security model

Audit item 1: Map every credential the agent can touch

Audit item 2: Check credential scope against actual agent tasks

Audit item 3: Calculate your blast radius

Audit item 4: Test for prompt injection credential exfiltration

Audit item 5: Verify credential revocation is actually possible

Audit item 6: Audit runtime credential delivery

Audit your agent's blast radius before it ships.

The audit checklist

Run this audit. Then fix what it finds.

Related Reading

Keep your API keys out of agent context

Get posts like this in your inbox

Your agent has keys it doesn't need. That's the attack surface.

One vault for all your API keys