
Your LLM Proxy Is Your Biggest Attack Surface

April 2026 · Obsta Labs

Every LLM proxy sees everything.

Every prompt. Every response. Every API key. Every secret pasted into a code review at 2AM.

The proxy is the most privileged component in the AI stack. And almost nobody treats it that way.

Three incidents in early 2026 made this obvious.

Proxies stealing credentials (by design)

A 2026 study tested LLM proxy services and found 26 collecting user credentials.

No exploit. No breach. Just: we sit in the right place, so we take everything.

These weren't shady tools. They had paying users, SOC 2 badges, and privacy policies saying the opposite.

The LiteLLM supply chain breach

Attackers compromised LiteLLM via a CI/CD dependency, published poisoned versions to PyPI, and had access for about 40 minutes.

That was enough.

Mercor — a $10 billion company handling sensitive AI data — was among the victims. Alleged impact: API keys, passport scans, interview recordings, source code, internal communications. Five class-action lawsuits followed. One represents 40,000+ people.

The certifications (SOC 2, ISO 27001) were real. The security was not.

One operator, AI as the attack team

A single attacker used Claude Code and GPT-4.1 to compromise nine government agencies.

Not with zero-days. With unpatched systems, weak passwords, and missing segmentation — faster than detection.

AI didn't invent new attacks. It made old ones faster than defenders could respond.

The pattern

Every failure came from trusting the wrong layer:

Assumption                        Reality
─────────────────────────────     ─────────────────────────────
Proxy is just a router            Proxy sees everything
Certification = security          Auditor may be useless
Guardrails stop abuse             Prompts route around them

What actually works

Remove the trusted third party. If your proxy runs locally, there is no central database of secrets to steal.

Never store credentials. Forward at the transport layer. No logs. No disk. No state.
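As a rough sketch (not NeuroRouter's actual code), transport-layer forwarding means the client's credential is copied onto the outgoing request and exists only in memory for the lifetime of that request. The helper name and header handling here are illustrative assumptions:

```python
# Illustrative sketch of transport-layer credential forwarding, not a
# real proxy implementation. The client's Authorization header is copied
# onto the upstream request and exists only in memory while that request
# is in flight: no log line, no disk write, no state database.

def build_upstream_headers(client_headers: dict) -> dict:
    """Return the headers to send upstream. The credential passes
    through in-flight and is never persisted anywhere."""
    headers = {
        "Content-Type": client_headers.get("Content-Type", "application/json"),
    }
    auth = client_headers.get("Authorization")
    if auth:
        headers["Authorization"] = auth  # forwarded, never stored
    return headers
```

A real proxy applies the same rule at every hop: the key rides on the request object and nothing else, so there is no credential store for an attacker to dump.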

Make behavior verifiable. If you can't inspect connections and outputs, you don't control it.

Prefer deterministic controls. A regex that matches sk-ant- is auditable. A classifier that "usually catches secrets" is a guardrail that works until it doesn't.
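To make the distinction concrete, here is what a deterministic control can look like. The exact key shapes below are assumptions; swap in the patterns for the providers you actually use:

```python
import re

# Deterministic secret detection: fixed patterns anyone can read, test,
# and audit. The key shapes below are illustrative assumptions.
SECRET_PATTERNS = [
    re.compile(r"sk-ant-[A-Za-z0-9_\-]{20,}"),  # Anthropic-style API key
    re.compile(r"AKIA[0-9A-Z]{16}"),            # AWS access key ID shape
]

def contains_secret(text: str) -> bool:
    """True if any known secret pattern appears in the text.
    Same input, same answer, every time. No classifier drift."""
    return any(p.search(text) for p in SECRET_PATTERNS)
```

A classifier can sit on top of this, but the regex layer gives you a floor that never degrades silently.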

If the proxy can be compromised remotely, it eventually will be. The fix isn't better alerts. It's removing the attack surface.

What we built

NeuroRouter is a local trust boundary enforcement layer. It runs on your machine. Your API keys never leave your network.

Zero telemetry. Zero analytics. Zero phone-home. Zero online license validation: licenses are verified locally via Ed25519 signatures.

Zero credential storage. Keys are forwarded at the HTTP transport layer. Nothing is written to disk, logs, or state database.

One binary, no runtime dependencies. No Python packages to poison. No npm supply chain to compromise.