6 min read · API Stronghold Team

90-Day Log Retention Is 90 Days of Exposed API Keys


You’ve instrumented everything. Traces on every LLM call, span data flowing to your APM, LangSmith capturing every tool invocation for debugging. Your team can replay any agent run from the last 90 days.

Those replays include your API keys.

This isn’t theoretical. It’s a structural problem with how modern AI agent observability works, and almost nobody is talking about it.

How Observability Captures Secrets

Most AI agent frameworks emit telemetry by default. LangChain, LlamaIndex, and the OpenAI Python SDK all ship hooks that capture tool call inputs and outputs. When you plug in OpenTelemetry or a hosted tracing service, that data flows to your observability backend with full fidelity.

What counts as “full fidelity”? The entire payload. For an HTTP tool call, that means request headers, query params, and body. For a function call, it means every argument.

Consider a simple agent that searches a database using an API key in the header:

tools = [
    {
        "name": "search_customer_data",
        "description": "Search customer database",
        "parameters": {
            "query": {"type": "string"},
            "api_key": {"type": "string"}  # passed as argument
        }
    }
]

Every trace for that tool call captures api_key. Every replay, every debugging session, every engineer with read access to your Datadog workspace sees it.
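The capture mechanics are easy to reproduce. Here's a minimal sketch of a tracing hook (the names are hypothetical; real frameworks do the equivalent inside their callback managers) that records every argument of a tool call, secret or not:

```python
import functools

SPANS = []  # stand-in for an observability backend


def traced(fn):
    """Minimal tracing hook: record every argument, exactly as
    auto-instrumentation does -- with no idea which ones are secrets."""
    @functools.wraps(fn)
    def wrapper(**kwargs):
        SPANS.append({"tool": fn.__name__, "inputs": dict(kwargs)})
        return fn(**kwargs)
    return wrapper


@traced
def search_customer_data(query, api_key):
    return {"results": []}


search_customer_data(query="churn risk", api_key="sk-prod-12345")
# SPANS[0]["inputs"] now contains the key verbatim:
# {"query": "churn risk", "api_key": "sk-prod-12345"}
```

The hook has no way to distinguish `query` from `api_key`; both are just tool inputs.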

Even when keys aren’t explicit function arguments, they show up in HTTP headers captured by auto-instrumentation:

span.http.request.headers.authorization = "Bearer sk-prod-xxxxxxxx"

OpenTelemetry’s HTTP instrumentation can capture request headers, and some SDK versions and vendor distributions enable that capture by default. Authorization, X-API-Key, X-Auth-Token - all of it flows through unless you explicitly suppress it.

The Retention Problem

Your application logs might have a 7-day retention policy. Your traces probably have 30, 60, or 90 days. Your LangSmith dataset has no expiry at all. It’s meant for long-term fine-tuning.

Every extra day of retention extends your exposure window. Rotate your Stripe API key today, and the old value still sits in your trace history for another 89 days - alongside every key you haven’t rotated yet. An attacker who compromises your observability backend harvests every credential that passed through a trace during the retention window.

This is different from a code repo leak or an .env file exposure. With those, you rotate and you’re done. With trace data, you’re racing against retention windows you don’t always control.

Real Systems That Get Hit

Here’s where secrets actually land in popular stacks:

LangSmith: Tool call inputs and outputs are stored as structured JSON in your LangSmith project. The dataset you build for evaluation contains every argument passed to every tool. If any tool accepted an API key as a parameter, it’s in there.

Datadog APM with LLM Observability: Datadog’s LLM Observability product captures prompt text, tool calls, and responses. Auto-instrumentation of HTTP spans captures full headers. Their data scrubbing is opt-in and regex-based.

OpenTelemetry + Jaeger/Tempo: Standard OTEL HTTP instrumentation sets http.request.header.* attributes. Whether or not your collector strips them depends entirely on your pipeline configuration.

Helicone, Langfuse, Arize Phoenix: Each of these stores full request/response bodies for debugging. All require explicit configuration to redact sensitive fields.

Sentry AI Monitoring: Sentry’s LLM breadcrumbs capture function call data. Exception context often includes variable state at crash time.

None of these are doing the wrong thing. They’re logging what they’re designed to log. The problem is that secrets are mixed into that data.

Why This Surfaces in AI Agents Specifically

Traditional web apps have a narrower exposure surface. Credentials go in environment variables, they’re injected once at startup, and your logging doesn’t normally capture them.

AI agents change the equation. Credentials flow through tool call arguments, HTTP headers, and LLM context windows at runtime. Every invocation is a potential logging event. And the debugging loop for agents actively encourages maximum telemetry. You need those traces to understand what went wrong when the agent misbehaves.

The more you instrument for observability (which is the right thing to do), the more you expand the credential exposure surface.

What to Do About It

There are four layers to fix this:

1. Don’t pass credentials as tool call arguments

This is the root cause for most leaks. If your tool schema includes API keys, tokens, or credentials as parameters, those flow directly into traces. Instead, resolve credentials inside the tool implementation using a secrets manager. The agent never sees them, and neither does the trace.

# Don't do this: api_key is a tool argument, so it lands in every trace
def search_database(query: str, api_key: str) -> dict:
    ...

# Do this instead: resolve the key inside the tool, out of band
def search_database(query: str) -> dict:
    api_key = secrets_client.get_secret("database-api-key")
    ...
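A runnable version of the second pattern, using an environment-variable stand-in for a real secrets manager (`EnvSecretsClient` and the secret name are illustrative; swap in Vault, AWS Secrets Manager, or similar):

```python
import os


class EnvSecretsClient:
    """Stand-in for a real secrets manager; resolves names from env vars."""

    def get_secret(self, name: str) -> str:
        # "database-api-key" -> DATABASE_API_KEY
        return os.environ[name.upper().replace("-", "_")]


secrets_client = EnvSecretsClient()


def search_database(query: str) -> dict:
    # The key is resolved here, inside the tool implementation. It never
    # appears in the tool schema, so it never appears in a trace of the
    # tool's arguments.
    api_key = secrets_client.get_secret("database-api-key")
    return {"query": query, "authenticated": bool(api_key)}
```

The agent only ever sees `query`; the credential path never crosses the instrumented boundary.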

2. Configure scrubbing at the collector layer

OpenTelemetry Collector processors can strip sensitive span attributes before they reach your backend. Configure an attributes processor to delete header keys that look like credentials:

processors:
  attributes:
    actions:
      - key: http.request.header.authorization
        action: delete
      - key: http.request.header.x-api-key
        action: delete

This is table stakes - do it even if you believe your tools don’t pass keys as arguments. Defense in depth.

3. Use scoped, short-lived credentials

If a credential does land in a trace, the damage is proportional to its blast radius and lifespan. Scoped credentials that expire in minutes or hours dramatically reduce what an attacker can do with an old trace.
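The mechanics are simple to sketch. A minimal in-memory version of scoped, expiring tokens (in practice the token map lives in a proxy or vault, not in your process; names are illustrative):

```python
import secrets
import time

TOKENS = {}  # token -> (scope, expiry); held by the proxy, never the agent


def mint_token(scope: str, ttl_seconds: int = 300) -> str:
    """Issue an opaque token limited to one scope and a short lifetime."""
    token = secrets.token_urlsafe(16)
    TOKENS[token] = (scope, time.time() + ttl_seconds)
    return token


def validate(token: str, scope: str) -> bool:
    """Accept the token only for its scope, and only before expiry."""
    entry = TOKENS.get(token)
    if entry is None:
        return False
    token_scope, expiry = entry
    return token_scope == scope and time.time() < expiry
```

A token captured in a trace is useless outside its scope and after its TTL, which is exactly the property you want when you can't control how long the trace lives.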

API Stronghold’s phantom token pattern wraps long-lived keys in short-lived tokens that route through a proxy. Even if the token appears in a trace, it expires before the trace retention window closes. There’s nothing to exfiltrate after the fact.

4. Audit your observability backend access

Who can read traces in your Datadog org? Your LangSmith workspace? Your Grafana instance? Observability tools often get treated as read-only infrastructure with loose access controls. Every person with access can query for spans containing auth headers.

Apply least-privilege here the same way you would for your production database.

A Quick Audit You Can Run Now

Pull your last 10 agent traces and search for these patterns:

authorization
x-api-key
x-auth-token
bearer
sk-
api_key
token
secret

If any of those strings appear as span attributes or in tool call arguments, you have active leakage. The traces in your backend right now carry that data back to the start of your retention window.
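That search can be scripted. A small sketch that walks exported trace JSON (already parsed into dicts) and flags any key or string value matching the patterns above - treat the pattern list as a starting point, not exhaustive:

```python
import re

# Starting set of credential-shaped patterns; extend for your stack.
PATTERNS = re.compile(
    r"authorization|x-api-key|x-auth-token|bearer|sk-|api_key|token|secret",
    re.IGNORECASE,
)


def find_leaks(node, path=""):
    """Walk a parsed trace and return the paths whose key or string
    value matches a credential pattern."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if PATTERNS.search(key):
                hits.append(child)
            hits.extend(find_leaks(value, child))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            hits.extend(find_leaks(value, f"{path}[{i}]"))
    elif isinstance(node, str) and PATTERNS.search(node):
        hits.append(path)
    return hits
```

Run it over a trace export and every hit is a span attribute or tool argument worth investigating.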

The fix isn’t to instrument less. You need those traces. The fix is to make sure the data flowing through them is stripped of anything that would cause damage if it leaked.

The Broader Pattern

Observability infrastructure is becoming a primary attack target. As organizations get better at protecting application-layer secrets, attackers look for easier entry points. Logging pipelines, trace aggregators, and monitoring backends often have weaker security posture than production systems while containing rich operational data.

Your AI agent’s traces are no different from a full HTTP request log of your API traffic. Treat them that way.

Start with the tool schema audit. If your tools accept credentials as parameters, that’s the highest-priority fix. Everything else is defense in depth around a root cause you haven’t addressed yet.


API Stronghold provides scoped credential management and proxy-based key isolation for AI agent workloads. Try the phantom token pattern to eliminate credential exposure in agent traces.
