Active Exploitation · CISA KEV June 9, 2026

The AI middleware layer — the LiteLLM proxy, LangChain/LangGraph orchestration, and Langfuse observability that sit between your application and the model providers — has quietly become the soft underbelly of the enterprise AI stack. A wave of 2025–2026 CVEs proves it: one chain already reaches CVSS 10.0 and has been added to the CISA Known Exploited Vulnerabilities catalog, with confirmed exploitation in the wild. These are widely adopted, capable open-source projects — the issue is not the maintainers, it is the architectural posture of running unauthenticated, trust-everything middleware in front of your model providers and your secrets.

Most teams adopted this stack for convenience: one base URL to route across OpenAI, Anthropic, Bedrock, Azure, and Vertex; one framework to chain prompts and tools; one dashboard to watch tokens and latency. The convenience is real. But the security model underneath it was never designed for a world where a poisoned model response, a manipulated Host header, or a single error path can hand an attacker remote code execution or a wallet full of provider keys.

10.0
CVSS — LiteLLM chained
unauthenticated RCE
9.3
CVSS — LangChain
"LangGrinch"
KEV
CISA-listed
June 9, 2026

LiteLLM: From AI Gateway to Unauthenticated RCE

LiteLLM is the most widely deployed open-source AI gateway/proxy — a single OpenAI-compatible endpoint that fans out to dozens of providers. That central position is exactly why a flaw in it is catastrophic: it sits in the path of every model call, and it holds every provider key. In 2026 it accumulated a set of vulnerabilities that, taken together, turn the gateway itself into the breach.

CVE-2026-42271 Command Injection · KEV
LiteLLM MCP server test endpoints spawn arbitrary commands
LiteLLM's MCP server test endpoints accept a full server configuration — command, args, and environment — for the stdio transport, then spawn the supplied command as a subprocess on the proxy host. There is no sanitization. Affected versions: LiteLLM 1.74.2 through 1.83.6. CISA added it to the KEV catalog on June 9, 2026, confirming active exploitation in the wild.
CVE-2026-48710 Host-Header Auth Bypass
Starlette "BadHost" validation bypass
A Host-header validation bypass in Starlette (≤ 1.0.0), the ASGI framework underneath LiteLLM's proxy. By manipulating the Host header, an attacker can sidestep authentication entirely — the gate that was supposed to protect the dangerous endpoints simply does not fire.

Individually these are serious. Chained, they are terminal. Researchers at Horizon3.ai confirmed that combining CVE-2026-42271 with CVE-2026-48710 yields a CVSS 10.0, zero-credential, unauthenticated remote code execution path on vulnerable LiteLLM deployments. The exploit chain reads like a checklist of everything an AI gateway should never allow:

1
Bypass authentication with a forged Host header
Using CVE-2026-48710, the attacker manipulates the Host header so Starlette's validation passes the request through without credentials. No API key, no session, no token. The authentication layer is simply not in the path anymore.
2
Reach the unguarded MCP test endpoint
With auth bypassed, the attacker hits the MCP server test endpoint (CVE-2026-42271), which accepts a full stdio server config including the command to run.
3
Spawn an arbitrary command on the proxy host
The endpoint dutifully launches the supplied command as a subprocess on the LiteLLM host. No sanitization, no allow-list, no sandbox. This is full RCE on the most central, most credential-rich node in the AI stack.
4
Harvest every provider key the proxy holds
A LiteLLM proxy stores keys for every model provider it routes to. From RCE on that host, an attacker reads them all out of memory or environment — OpenAI, Anthropic, Bedrock, Azure, Vertex — plus whatever cloud IAM the host can reach.

The fix arrived in LiteLLM 1.83.7 (May 8, 2026), which added authorization controls and updated Starlette. But patching one chain does not change the posture, and 2026 was not LiteLLM's first incident.

The March 2026 supply-chain compromise

In March 2026, attackers slipped malicious code into official LiteLLM releases on PyPI. The payload was a credential stealer — designed to exfiltrate secrets across cloud environments, CI/CD pipelines, and developer machines that installed the poisoned package. Trend Micro, Snyk, and HeroDevs all reported on the compromise. For a package that, by design, is wired into the most secret-rich path in your infrastructure, a single tainted release is a direct line to your provider keys and cloud credentials.

CVE-2025-0330 Credential Leak
LiteLLM leaks Langfuse API keys through an error path
In proxy_server.py (v1.52.1), an error while parsing team settings exposes langfuse_secret and langfuse_public_key. A failure in the gateway's config handling silently burns the credentials of the observability layer bolted to it — the first sign of just how tightly coupled, and how mutually dependent for failure, these projects are.
The pattern, not the project

None of this means LiteLLM is badly built — it is a capable, popular gateway. It means an unauthenticated, secret-holding, command-spawning proxy at the center of your AI stack is a structurally dangerous place to keep convenience. When the gateway is breached, everything behind it is breached. The fix is not "patch and pray for the next CVE"; it is to put a governed, authenticated, isolated router in that position instead.

LangChain: Prompt Injection That Becomes Code Execution

LangChain (and LangGraph) is an orchestration library, not a gateway — it lives inside your application process, chaining prompts, tools, and state. That makes its vulnerabilities a different shape: instead of an external network attacker, the threat rides in on the data your application already trusts — specifically, model output.

CVE-2025-68664 "LangGrinch" CVSS 9.3 Critical
Serialization injection in LangChain dumps()/dumpd()
LangChain's serialization helpers (dumps()/dumpd()) do not escape free-form dictionaries containing lc keys. As a result, user-controlled data is later deserialized as a trusted LangChain object. Impact: extraction of secrets from the environment (when secrets_from_env=True), instantiation of arbitrary classes in trusted namespaces (langchain_core, langchain, langchain_community), and remote code execution via Jinja2 templates. Reported December 4, 2025 by Yarden Porat.

The vector is what makes LangGrinch so dangerous in agentic systems. The primary entry point is the LLM response itself — fields like additional_kwargs and response_metadata — which are controlled via prompt injection and then serialized and deserialized during streaming. The chain looks like this:

1
Prompt injection poisons the model's response fields
An attacker plants instructions in any untrusted content the model ingests — a web page, a document, an email, a tool result. The model emits attacker-controlled data into additional_kwargs / response_metadata.
2
LangChain serializes and re-deserializes the poisoned object
During streaming, those response fields pass through dumps()/dumpd() and back. Because lc-keyed dicts are not escaped, the poisoned payload is reconstituted as a trusted LangChain object.
3
Secret exfiltration or RCE
The deserializer can read secrets from the environment, instantiate classes in trusted namespaces, or render a Jinja2 template — the path to remote code execution. A prompt-injection string has become code running in your application.
CVE-2025-68665 CVSS 8.6
The same LangGrinch flaw in LangChain.js
The identical serialization-injection class affects the JavaScript/TypeScript implementation, rated CVSS 8.6 — so Node-based agent stacks inherit the same prompt-injection-to-RCE path.
CVE-2025-67644 SQL Injection
SQLi in LangGraph's SQLite checkpoint implementation
LangGraph's SQLite checkpointer is vulnerable to SQL injection. Agent state — which often carries untrusted, model-influenced content — flows into the checkpoint store, giving an attacker a path to tamper with or read persisted graph state.
CVE-2026-34070 Path Traversal
Path traversal in LangChain enables arbitrary file access
A path-traversal flaw allows arbitrary file access without validation — another route by which model-influenced input reaches the filesystem in ways the application never intended.

The patches are the right shape: a new allowed_objects allowlist in load()/loads(), Jinja2 blocked by default, and secrets_from_env now defaulting to False. But the lesson stands — in an agentic system, model output is untrusted input, and any layer that deserializes it as trusted is one prompt-injection away from code execution. That defense belongs in front of orchestration, not inside it.

Langfuse & the Coupling Problem

Langfuse is the observability layer many of these stacks bolt on — traces, token counts, latency, prompt history. The problem is not a single dramatic RCE; it is coupling. Observability is only as secure as the gateway it watches, and the gateway is only as secure as its worst error path.

CVE-2025-0330 above is the clearest illustration: a flaw in the LiteLLM gateway leaks the Langfuse public and secret keys through an error path. A bug in one product burns the credentials of the adjacent one. When you bolt observability onto an insecure gateway, the observability layer inherits the gateway's blast radius — and its keys.

The deployment surface adds to it. Self-hosted Langfuse Docker images shipped with vulnerable Node/npm packages; the issue was addressed in release 3.143.0. The broader pattern is unmistakable: the adjacent low-code AI tool Langflow — a different product, not to be confused with Langfuse — also had a vulnerability added to the CISA KEV catalog in May 2026. Across the AI middleware ecosystem, unsecured tooling is being actively exploited.

The coupling tax

Three products, three trust boundaries that don't actually exist. The gateway holds the observability keys. The orchestration library trusts model output. The observability layer trusts the gateway. A breach anywhere is a breach everywhere. Decoupling these layers — with authentication, isolation, and a vaulted secret store between each — is the only thing that turns one compromise into a contained one.

Thorough Comparison

Here is a fair, factual comparison across the four layers. LangChain is an orchestration library, not a gateway, so several gateway-specific rows are marked N/A for it — that is not a knock, it is a category difference. The point is the security posture each layer brings by default.

Capability LiteLLM proxy LangChain Langfuse RuntimeAI
Authentication model API key; bypassable via Host-header (CVE-2026-48710) N/A (in-process library) Public/secret key pair; leaked via gateway error path Mandatory, identity-bound JWT + tenant context; Bot-CA mTLS for agents
Supply-chain integrity PyPI compromise (Mar 2026) shipped credential stealer Multiple CVEs in core + community packages Vulnerable Node/npm in Docker images (< 3.143.0) Signed builds; PQC-enveloped secrets never in plaintext for a stealer to grab
MCP handling Test endpoint spawns arbitrary commands (CVE-2026-42271) Delegates to host MCP client N/A Governed MCP Gateway: tenant ACLs, validation, audit, mTLS; no arbitrary-command path
Secret storage Provider keys in plaintext env; leaked on error Reads secrets from env (secrets_from_env) Keys leaked by coupled gateway QuantumVault / PQ TokenVault: PQC-enveloped, short-lived, minted on demand
Multi-tenancy / isolation Shared proxy; broad blast radius In-process; app-defined Project-scoped; shared infra Multi-tenant RLS enforced at the data layer
DLP & prompt-injection defense None by default None — trusts model output (LangGrinch) None (observe-only) Bidirectional AI Firewall: inbound/outbound scan, PII redaction, injection detection
Audit trail Request logs Application-defined callbacks Trace store (plaintext key coupling) Tamper-proof Audit Black Box: Merkle-chained, PQ-Sign timestamped
Kill switch None native N/A N/A Instant L1/L2/L3 revocation of provider, agent, or credential
On-prem / sovereign Self-host proxy Self-host library Self-host (vulnerable image history) First-class routing to on-prem Sovereign LLM Platform (3 tiers), air-gap capable
Cost governance Basic cost tracking N/A Observe-only token/latency Real-time tracking + budget enforcement + policy-driven routing
Failover / resilience Provider fallbacks N/A N/A Retry with backoff on provider errors

And on the feature axis people actually adopt LiteLLM for — multi-provider routing and cost control — RuntimeAI's Secure LLM Router matches the convenience and adds the governance that was missing:

Routing / cost / failover feature LiteLLM proxy RuntimeAI Secure LLM Router
Multi-provider routingYes (OpenAI, Anthropic, Bedrock, Azure, Vertex, vLLM/Ollama)Yes — 17 wired providers + self-hosted vLLM/Ollama, one OpenAI-compatible base URL (full list below)
Routing strategyCost / latency / loadPolicy-driven (OPA/Rego), compliance- & latency-aware
Cost trackingYesReal-time + budget enforcement
On-prem / sovereign targetSelf-host modelsFirst-class routing to Sovereign LLM Platform (air-gapped/regulated)
Auth & isolationAPI key (bypassable)JWT + tenant context + Bot-CA mTLS; multi-tenant RLS

How the Router Handles a New Model Drop

A fair question when you put a gateway in your critical path: what happens the day OpenAI or Anthropic ships a new model? With many setups the answer is “wait for an SDK bump or a code change.” With the Secure LLM Router, the answer is nothing.

The router is a governed passthrough. The model name in your request is forwarded to the provider you configured, so a new model from a provider you already use works the moment that provider exposes it — no redeploy, no SDK upgrade. Three governance layers wrap that passthrough without getting in its way:

Wired providers today: OpenAI, Anthropic, Azure OpenAI, Google Gemini & Vertex, Mistral, Cohere, Groq, Together, Fireworks, Perplexity, DeepSeek, xAI, Cerebras, SambaNova, OpenRouter, and AI21 — plus any OpenAI-compatible self-hosted endpoint (vLLM, Ollama, TGI) and RuntimeAI’s on-prem Sovereign LLM Platform. Providers with non-standard auth get a dedicated adapter; OpenAI-compatible providers route through the unified path. The honest version of “works with everything” is this: any model from a provider you’ve connected works without a code change — and connecting a new OpenAI-compatible provider is a config entry, not a release.

It's Not Just These Three — Map for the Whole Middleware Layer

LiteLLM, LangChain/LangGraph, and Langfuse are the most visible names, but the same governance gap runs through the entire AI-middleware category. Whatever you run today, there is a governed RuntimeAI layer to swap it for — one at a time:

Middleware category Common tools RuntimeAI layer to switch to
AI gateway / model routerLiteLLM, Portkey, Kong AI Gateway, Cloudflare AI Gateway, OpenRouter, BifrostSecure LLM Router
Agent orchestrationLangChain, LangGraph, LlamaIndexKeep it — front it with AI Firewall + PII Shield + Flow Enforcer
Observability / tracingLangfuse, Helicone, LangSmith, Arize PhoenixAgent Observability + Audit Black Box (tamper-proof)
Guardrails / safetyGuardrails AI, NeMo Guardrails, LakeraAI Firewall (bidirectional DLP, prompt-injection, content policy)
MCP / tool routingLiteLLM MCP, raw MCP serversMCP Gateway (Bot-CA mTLS, tenant ACLs, audit)
LLM key storage.env files, plaintext config, k8s secretsQuantumVault / PQ TokenVault (PQC-enveloped, short-lived)
Agent memoryAd-hoc vector storesMemory Vault (governed writes, injection detection, TTL)

You Don't Have to Rip It Out — Switch One Layer at a Time

This is the part that matters most. You do not have to do a forklift migration to fix this. The AI middleware stack is modular, and so is the fix. Keep your application, your prompts, and your agents exactly as they are — and swap out the risky layer underneath them. Adopt a single RuntimeAI sub-product to close one specific CVE class, or adopt the whole platform for defence-in-depth. The migration is incremental and reversible.

The smallest possible change is a base-URL swap. The Secure LLM Router is OpenAI-compatible, so most apps move with one environment variable — and the provider key stops living in plaintext:

# Before — LiteLLM proxy: shared key in env, unauthenticated control plane
OPENAI_API_BASE=http://litellm-proxy:4000/v1
OPENAI_API_KEY=sk-litellm-master-key        # long-lived, sits in .env

# After — RuntimeAI Secure LLM Router (drop-in, OpenAI-compatible)
OPENAI_API_BASE=https://router.runtimeai.io/v1
OPENAI_API_KEY=${RUNTIMEAI_TOKEN}           # short-lived JWT, minted by QuantumVault
# every call now carries tenant context + Bot-CA mTLS; your routing
# and cost policy are unchanged — only the security model is.
01
Replace → LiteLLM proxy
RuntimeAI Secure LLM Router — drop-in OpenAI-compatible base URL
Point your existing OpenAI-compatible client at the RuntimeAI Secure LLM Router instead of the LiteLLM proxy. Same multi-provider routing and policy-driven strategies — but every call carries a JWT + tenant context, there is no Host-header auth-bypass class, and there is no "spawn a command from a test endpoint" path. Closes the CVE-2026-42271 + CVE-2026-48710 chain.
02
Front → LangChain orchestration
AI Firewall + PII Shield + Flow Enforcer — catch poisoned response fields
Keep LangChain/LangGraph. Put RuntimeAI's bidirectional AI Firewall, PII Shield, and Flow Enforcer in front of it: tokenize sensitive content before it reaches the LLM, and scan model output for injection patterns before it reaches your deserializer. The LangGrinch vector — a poisoned additional_kwargs / response_metadata field — is caught at the egress/ingress boundary, before it is reconstituted as a trusted object.
03
Replace → Langfuse
Agent Observability + Audit Black Box — tamper-proof, no key coupling
Swap Langfuse for RuntimeAI's Agent Observability (OTEL metrics and traces) plus the Audit Black Box — a Merkle-chained, PQ-Sign-timestamped audit trail. You get the dashboards and traces you relied on, without the plaintext-key coupling that let a gateway error path leak the observability layer's credentials.
04
Replace → LiteLLM MCP
RuntimeAI MCP Gateway — governed, Bot-CA mTLS
Move MCP server connectivity behind the RuntimeAI MCP Gateway. MCP servers run with tenant ACLs, rate limits, full audit, and Bot-CA mutual TLS; every tool call is validated before it reaches a server. There is no unauthenticated test endpoint that accepts a command and args to spawn — the entire CVE-2026-42271 attack surface simply does not exist.
05
Replace → plaintext env keys
QuantumVault / PQ TokenVault — PQC-enveloped, short-lived secrets
Stop keeping provider keys in plaintext environment variables where a credential stealer (the March 2026 supply-chain payload) or an error path (CVE-2025-0330) can read them. Provider keys live in QuantumVault / PQ TokenVault, PQC-enveloped and minted short-lived on demand. There is no static plaintext key for a stealer or a stack trace to leak.
Adopt incrementally, or all at once

Each card above is independently deployable. Worried only about the CVSS 10.0 RCE? Start with the Secure LLM Router and MCP Gateway. Worried about LangGrinch and prompt injection? Front your existing LangChain with the AI Firewall + Flow Enforcer. Want the whole defence-in-depth posture? Take the platform. The migration is modular by design — no rip-and-replace required.

RuntimeAI Take

Most Advanced AI Security Zero Trust · Defence in Depth

The middleware CVEs of 2025–2026 are not a string of unlucky bugs — they are the predictable result of running trust-everything convenience layers in the most sensitive position in the stack. RuntimeAI's answer is four independent control layers, each of which would have blunted these specific CVE classes:

The bottom line

Your AI gateway sits in the most credential-rich, most central path in your stack. When it is unauthenticated, secret-holding, and able to spawn commands, a single CVE becomes a CVSS 10.0 breach. The fix is not the next patch — it is a governed, authenticated, isolated router with DLP, a vaulted secret store, and a tamper-proof audit trail. Switch the risky layer, keep everything else, and the breach stops being yours to own.

LiteLLM LangChain Langfuse CVE-2026-42271 LangGrinch CISA KEV Prompt Injection AI Gateway Secure LLM Router RuntimeAI

Your AI middleware shouldn't be the breach.

RuntimeAI's Secure LLM Router, AI Firewall, MCP Gateway, and Audit Black Box let you switch the risky layer — one sub-product at a time — without ripping out your app. Get the AI Security Weekly briefing for the CVEs that matter.

Start Your Trial Secure LLM Router Docs