LiteLLM Is a CVSS 10.0 Liability. LangChain Leaks Your Secrets. Langfuse Inherits Both.

Active Exploitation · CISA KEV June 9, 2026

The AI middleware layer — the LiteLLM proxy, LangChain/LangGraph orchestration, and Langfuse observability that sit between your application and the model providers — has quietly become the soft underbelly of the enterprise AI stack. A wave of 2025–2026 CVEs proves it: one chain already reaches CVSS 10.0 and has been added to the CISA Known Exploited Vulnerabilities catalog, with confirmed exploitation in the wild. These are widely adopted, capable open-source projects — the issue is not the maintainers, it is the architectural posture of running unauthenticated, trust-everything middleware in front of your model providers and your secrets.

Most teams adopted this stack for convenience: one base URL to route across OpenAI, Anthropic, Bedrock, Azure, and Vertex; one framework to chain prompts and tools; one dashboard to watch tokens and latency. The convenience is real. But the security model underneath it was never designed for a world where a poisoned model response, a manipulated Host header, or a single error path can hand an attacker remote code execution or a wallet full of provider keys.

10.0

CVSS — LiteLLM chained
unauthenticated RCE

9.3

CVSS — LangChain
"LangGrinch"

KEV

CISA-listed
June 9, 2026

LiteLLM: From AI Gateway to Unauthenticated RCE

LiteLLM is the most widely deployed open-source AI gateway/proxy — a single OpenAI-compatible endpoint that fans out to dozens of providers. That central position is exactly why a flaw in it is catastrophic: it sits in the path of every model call, and it holds every provider key. In 2026 it accumulated a set of vulnerabilities that, taken together, turn the gateway itself into the breach.

CVE-2026-42271 Command Injection · KEV

LiteLLM MCP server test endpoints spawn arbitrary commands

LiteLLM's MCP server test endpoints accept a full server configuration — command, args, and environment — for the stdio transport, then spawn the supplied command as a subprocess on the proxy host. There is no sanitization. Affected versions: LiteLLM 1.74.2 through 1.83.6. CISA added it to the KEV catalog on June 9, 2026, confirming active exploitation in the wild.

CVE-2026-48710 Host-Header Auth Bypass

Starlette "BadHost" validation bypass

A Host-header validation bypass in Starlette (≤ 1.0.0), the ASGI framework underneath LiteLLM's proxy. By manipulating the Host header, an attacker can sidestep authentication entirely — the gate that was supposed to protect the dangerous endpoints simply does not fire.

Individually these are serious. Chained, they are terminal. Researchers at Horizon3.ai confirmed that combining CVE-2026-42271 with CVE-2026-48710 yields a CVSS 10.0, zero-credential, unauthenticated remote code execution path on vulnerable LiteLLM deployments. The exploit chain reads like a checklist of everything an AI gateway should never allow:

Bypass authentication with a forged Host header

Using CVE-2026-48710, the attacker manipulates the Host header so Starlette's validation passes the request through without credentials. No API key, no session, no token. The authentication layer is simply not in the path anymore.

Reach the unguarded MCP test endpoint

With auth bypassed, the attacker hits the MCP server test endpoint (CVE-2026-42271), which accepts a full stdio server config including the command to run.

Spawn an arbitrary command on the proxy host

The endpoint dutifully launches the supplied command as a subprocess on the LiteLLM host. No sanitization, no allow-list, no sandbox. This is full RCE on the most central, most credential-rich node in the AI stack.

Harvest every provider key the proxy holds

A LiteLLM proxy stores keys for every model provider it routes to. From RCE on that host, an attacker reads them all out of memory or environment — OpenAI, Anthropic, Bedrock, Azure, Vertex — plus whatever cloud IAM the host can reach.

The fix arrived in LiteLLM 1.83.7 (May 8, 2026), which added authorization controls and updated Starlette. But patching one chain does not change the posture, and 2026 was not LiteLLM's first incident.

The March 2026 supply-chain compromise

In March 2026, attackers slipped malicious code into official LiteLLM releases on PyPI. The payload was a credential stealer — designed to exfiltrate secrets across cloud environments, CI/CD pipelines, and developer machines that installed the poisoned package. Trend Micro, Snyk, and HeroDevs all reported on the compromise. For a package that, by design, is wired into the most secret-rich path in your infrastructure, a single tainted release is a direct line to your provider keys and cloud credentials.

CVE-2025-0330 Credential Leak

LiteLLM leaks Langfuse API keys through an error path

In proxy_server.py (v1.52.1), an error while parsing team settings exposes langfuse_secret and langfuse_public_key. A failure in the gateway's config handling silently burns the credentials of the observability layer bolted to it — the first sign of just how tightly coupled, and how mutually dependent for failure, these projects are.

The pattern, not the project

None of this means LiteLLM is badly built — it is a capable, popular gateway. It means an unauthenticated, secret-holding, command-spawning proxy at the center of your AI stack is a structurally dangerous place to keep convenience. When the gateway is breached, everything behind it is breached. The fix is not "patch and pray for the next CVE"; it is to put a governed, authenticated, isolated router in that position instead.

LangChain: Prompt Injection That Becomes Code Execution

LangChain (and LangGraph) is an orchestration library, not a gateway — it lives inside your application process, chaining prompts, tools, and state. That makes its vulnerabilities a different shape: instead of an external network attacker, the threat rides in on the data your application already trusts — specifically, model output.

CVE-2025-68664 "LangGrinch" CVSS 9.3 Critical

Serialization injection in LangChain dumps()/dumpd()

LangChain's serialization helpers (dumps()/dumpd()) do not escape free-form dictionaries containing lc keys. As a result, user-controlled data is later deserialized as a trusted LangChain object. Impact: extraction of secrets from the environment (when secrets_from_env=True), instantiation of arbitrary classes in trusted namespaces (langchain_core, langchain, langchain_community), and remote code execution via Jinja2 templates. Reported December 4, 2025 by Yarden Porat.

The vector is what makes LangGrinch so dangerous in agentic systems. The primary entry point is the LLM response itself — fields like additional_kwargs and response_metadata — which are controlled via prompt injection and then serialized and deserialized during streaming. The chain looks like this:

Prompt injection poisons the model's response fields

An attacker plants instructions in any untrusted content the model ingests — a web page, a document, an email, a tool result. The model emits attacker-controlled data into additional_kwargs / response_metadata.

LangChain serializes and re-deserializes the poisoned object

During streaming, those response fields pass through dumps()/dumpd() and back. Because lc-keyed dicts are not escaped, the poisoned payload is reconstituted as a trusted LangChain object.

Secret exfiltration or RCE

The deserializer can read secrets from the environment, instantiate classes in trusted namespaces, or render a Jinja2 template — the path to remote code execution. A prompt-injection string has become code running in your application.

CVE-2025-68665 CVSS 8.6

The same LangGrinch flaw in LangChain.js

The identical serialization-injection class affects the JavaScript/TypeScript implementation, rated CVSS 8.6 — so Node-based agent stacks inherit the same prompt-injection-to-RCE path.

CVE-2025-67644 SQL Injection

SQLi in LangGraph's SQLite checkpoint implementation

LangGraph's SQLite checkpointer is vulnerable to SQL injection. Agent state — which often carries untrusted, model-influenced content — flows into the checkpoint store, giving an attacker a path to tamper with or read persisted graph state.

CVE-2026-34070 Path Traversal

Path traversal in LangChain enables arbitrary file access

A path-traversal flaw allows arbitrary file access without validation — another route by which model-influenced input reaches the filesystem in ways the application never intended.

The patches are the right shape: a new allowed_objects allowlist in load()/loads(), Jinja2 blocked by default, and secrets_from_env now defaulting to False. But the lesson stands — in an agentic system, model output is untrusted input, and any layer that deserializes it as trusted is one prompt-injection away from code execution. That defense belongs in front of orchestration, not inside it.

Langfuse & the Coupling Problem

Langfuse is the observability layer many of these stacks bolt on — traces, token counts, latency, prompt history. The problem is not a single dramatic RCE; it is coupling. Observability is only as secure as the gateway it watches, and the gateway is only as secure as its worst error path.

CVE-2025-0330 above is the clearest illustration: a flaw in the LiteLLM gateway leaks the Langfuse public and secret keys through an error path. A bug in one product burns the credentials of the adjacent one. When you bolt observability onto an insecure gateway, the observability layer inherits the gateway's blast radius — and its keys.

The deployment surface adds to it. Self-hosted Langfuse Docker images shipped with vulnerable Node/npm packages; the issue was addressed in release 3.143.0. The broader pattern is unmistakable: the adjacent low-code AI tool Langflow — a different product, not to be confused with Langfuse — also had a vulnerability added to the CISA KEV catalog in May 2026. Across the AI middleware ecosystem, unsecured tooling is being actively exploited.

The coupling tax

Three products, three trust boundaries that don't actually exist. The gateway holds the observability keys. The orchestration library trusts model output. The observability layer trusts the gateway. A breach anywhere is a breach everywhere. Decoupling these layers — with authentication, isolation, and a vaulted secret store between each — is the only thing that turns one compromise into a contained one.

Thorough Comparison

Here is a fair, factual comparison across the four layers. LangChain is an orchestration library, not a gateway, so several gateway-specific rows are marked N/A for it — that is not a knock, it is a category difference. The point is the security posture each layer brings by default.

Capability	LiteLLM proxy	LangChain	Langfuse	RuntimeAI
Authentication model	API key; bypassable via Host-header (CVE-2026-48710)	N/A (in-process library)	Public/secret key pair; leaked via gateway error path	Mandatory, identity-bound JWT + tenant context; Bot-CA mTLS for agents
Supply-chain integrity	PyPI compromise (Mar 2026) shipped credential stealer	Multiple CVEs in core + community packages	Vulnerable Node/npm in Docker images (< 3.143.0)	Signed builds; PQC-enveloped secrets never in plaintext for a stealer to grab
MCP handling	Test endpoint spawns arbitrary commands (CVE-2026-42271)	Delegates to host MCP client	N/A	Governed MCP Gateway: tenant ACLs, validation, audit, mTLS; no arbitrary-command path
Secret storage	Provider keys in plaintext env; leaked on error	Reads secrets from env (`secrets_from_env`)	Keys leaked by coupled gateway	QuantumVault / PQ TokenVault: PQC-enveloped, short-lived, minted on demand
Multi-tenancy / isolation	Shared proxy; broad blast radius	In-process; app-defined	Project-scoped; shared infra	Multi-tenant RLS enforced at the data layer
DLP & prompt-injection defense	None by default	None — trusts model output (LangGrinch)	None (observe-only)	Bidirectional AI Firewall: inbound/outbound scan, PII redaction, injection detection
Audit trail	Request logs	Application-defined callbacks	Trace store (plaintext key coupling)	Tamper-proof Audit Black Box: Merkle-chained, PQ-Sign timestamped
Kill switch	None native	N/A	N/A	Instant L1/L2/L3 revocation of provider, agent, or credential
On-prem / sovereign	Self-host proxy	Self-host library	Self-host (vulnerable image history)	First-class routing to on-prem Sovereign LLM Platform (3 tiers), air-gap capable
Cost governance	Basic cost tracking	N/A	Observe-only token/latency	Real-time tracking + budget enforcement + policy-driven routing
Failover / resilience	Provider fallbacks	N/A	N/A	Retry with backoff on provider errors

And on the feature axis people actually adopt LiteLLM for — multi-provider routing and cost control — RuntimeAI's Secure LLM Router matches the convenience and adds the governance that was missing:

Routing / cost / failover feature	LiteLLM proxy	RuntimeAI Secure LLM Router
Multi-provider routing	Yes (OpenAI, Anthropic, Bedrock, Azure, Vertex, vLLM/Ollama)	Yes — 17 wired providers + self-hosted vLLM/Ollama, one OpenAI-compatible base URL (full list below)
Routing strategy	Cost / latency / load	Policy-driven (OPA/Rego), compliance- & latency-aware
Cost tracking	Yes	Real-time + budget enforcement
On-prem / sovereign target	Self-host models	First-class routing to Sovereign LLM Platform (air-gapped/regulated)
Auth & isolation	API key (bypassable)	JWT + tenant context + Bot-CA mTLS; multi-tenant RLS

How the Router Handles a New Model Drop

A fair question when you put a gateway in your critical path: what happens the day OpenAI or Anthropic ships a new model? With many setups the answer is “wait for an SDK bump or a code change.” With the Secure LLM Router, the answer is nothing.

The router is a governed passthrough. The model name in your request is forwarded to the provider you configured, so a new model from a provider you already use works the moment that provider exposes it — no redeploy, no SDK upgrade. Three governance layers wrap that passthrough without getting in its way:

Policy gate — per-tenant and per-agent allow/deny lists decide which models are usable. Default-open, so new models flow through; lock specific models down (or pin an allowlist) when governance requires it.
Cost attribution — register the new model’s price so spend and budgets stay accurate. Routing works either way; this just keeps the ledger complete.
Model-deprecation policy — schedule retirement of an old model with automatic migration to its replacement, so you are never pinned to a model a provider is sunsetting.

Wired providers today: OpenAI, Anthropic, Azure OpenAI, Google Gemini & Vertex, Mistral, Cohere, Groq, Together, Fireworks, Perplexity, DeepSeek, xAI, Cerebras, SambaNova, OpenRouter, and AI21 — plus any OpenAI-compatible self-hosted endpoint (vLLM, Ollama, TGI) and RuntimeAI’s on-prem Sovereign LLM Platform. Providers with non-standard auth get a dedicated adapter; OpenAI-compatible providers route through the unified path. The honest version of “works with everything” is this: any model from a provider you’ve connected works without a code change — and connecting a new OpenAI-compatible provider is a config entry, not a release.

It's Not Just These Three — Map for the Whole Middleware Layer

LiteLLM, LangChain/LangGraph, and Langfuse are the most visible names, but the same governance gap runs through the entire AI-middleware category. Whatever you run today, there is a governed RuntimeAI layer to swap it for — one at a time:

Middleware category	Common tools	RuntimeAI layer to switch to
AI gateway / model router	LiteLLM, Portkey, Kong AI Gateway, Cloudflare AI Gateway, OpenRouter, Bifrost	Secure LLM Router
Agent orchestration	LangChain, LangGraph, LlamaIndex	Keep it — front it with AI Firewall + PII Shield + Flow Enforcer
Observability / tracing	Langfuse, Helicone, LangSmith, Arize Phoenix	Agent Observability + Audit Black Box (tamper-proof)
Guardrails / safety	Guardrails AI, NeMo Guardrails, Lakera	AI Firewall (bidirectional DLP, prompt-injection, content policy)
MCP / tool routing	LiteLLM MCP, raw MCP servers	MCP Gateway (Bot-CA mTLS, tenant ACLs, audit)
LLM key storage	.env files, plaintext config, k8s secrets	QuantumVault / PQ TokenVault (PQC-enveloped, short-lived)
Agent memory	Ad-hoc vector stores	Memory Vault (governed writes, injection detection, TTL)

You Don't Have to Rip It Out — Switch One Layer at a Time

This is the part that matters most. You do not have to do a forklift migration to fix this. The AI middleware stack is modular, and so is the fix. Keep your application, your prompts, and your agents exactly as they are — and swap out the risky layer underneath them. Adopt a single RuntimeAI sub-product to close one specific CVE class, or adopt the whole platform for defence-in-depth. The migration is incremental and reversible.

The smallest possible change is a base-URL swap. The Secure LLM Router is OpenAI-compatible, so most apps move with one environment variable — and the provider key stops living in plaintext:

# Before — LiteLLM proxy: shared key in env, unauthenticated control plane
OPENAI_API_BASE=http://litellm-proxy:4000/v1
OPENAI_API_KEY=sk-litellm-master-key        # long-lived, sits in .env

# After — RuntimeAI Secure LLM Router (drop-in, OpenAI-compatible)
OPENAI_API_BASE=https://router.runtimeai.io/v1
OPENAI_API_KEY=${RUNTIMEAI_TOKEN}           # short-lived JWT, minted by QuantumVault
# every call now carries tenant context + Bot-CA mTLS; your routing
# and cost policy are unchanged — only the security model is.

Replace → LiteLLM proxy

RuntimeAI Secure LLM Router — drop-in OpenAI-compatible base URL

Point your existing OpenAI-compatible client at the RuntimeAI Secure LLM Router instead of the LiteLLM proxy. Same multi-provider routing and policy-driven strategies — but every call carries a JWT + tenant context, there is no Host-header auth-bypass class, and there is no "spawn a command from a test endpoint" path. Closes the CVE-2026-42271 + CVE-2026-48710 chain.

Front → LangChain orchestration

AI Firewall + PII Shield + Flow Enforcer — catch poisoned response fields

Keep LangChain/LangGraph. Put RuntimeAI's bidirectional AI Firewall, PII Shield, and Flow Enforcer in front of it: tokenize sensitive content before it reaches the LLM, and scan model output for injection patterns before it reaches your deserializer. The LangGrinch vector — a poisoned additional_kwargs / response_metadata field — is caught at the egress/ingress boundary, before it is reconstituted as a trusted object.

Replace → Langfuse

Agent Observability + Audit Black Box — tamper-proof, no key coupling

Swap Langfuse for RuntimeAI's Agent Observability (OTEL metrics and traces) plus the Audit Black Box — a Merkle-chained, PQ-Sign-timestamped audit trail. You get the dashboards and traces you relied on, without the plaintext-key coupling that let a gateway error path leak the observability layer's credentials.

Replace → LiteLLM MCP

RuntimeAI MCP Gateway — governed, Bot-CA mTLS

Move MCP server connectivity behind the RuntimeAI MCP Gateway. MCP servers run with tenant ACLs, rate limits, full audit, and Bot-CA mutual TLS; every tool call is validated before it reaches a server. There is no unauthenticated test endpoint that accepts a command and args to spawn — the entire CVE-2026-42271 attack surface simply does not exist.

Replace → plaintext env keys

QuantumVault / PQ TokenVault — PQC-enveloped, short-lived secrets

Stop keeping provider keys in plaintext environment variables where a credential stealer (the March 2026 supply-chain payload) or an error path (CVE-2025-0330) can read them. Provider keys live in QuantumVault / PQ TokenVault, PQC-enveloped and minted short-lived on demand. There is no static plaintext key for a stealer or a stack trace to leak.

Adopt incrementally, or all at once

Each card above is independently deployable. Worried only about the CVSS 10.0 RCE? Start with the Secure LLM Router and MCP Gateway. Worried about LangGrinch and prompt injection? Front your existing LangChain with the AI Firewall + Flow Enforcer. Want the whole defence-in-depth posture? Take the platform. The migration is modular by design — no rip-and-replace required.

RuntimeAI Take

Most Advanced AI Security Zero Trust · Defence in Depth

The middleware CVEs of 2025–2026 are not a string of unlucky bugs — they are the predictable result of running trust-everything convenience layers in the most sensitive position in the stack. RuntimeAI's answer is four independent control layers, each of which would have blunted these specific CVE classes:

Discovery / Inventory. Every model call, MCP server, and agent is registered and identity-bound through the Secure LLM Router and MCP Gateway. There is no anonymous caller and no shadow MCP test endpoint — the Host-header bypass and command-injection surface (CVE-2026-48710 / 42271) have nothing unauthenticated to reach.
Behavioural enforcement. Flow Enforcer applies OPA/Rego policy to every tool call and route. Shell-spawning configs, unsanitized parameters, and out-of-policy provider routes are rejected before execution — the exact decision LiteLLM's test endpoint never made.
Flow / egress control. The bidirectional AI Firewall + PII Shield scan inbound and outbound content, redact PII, and detect injection patterns. A poisoned LLM response field — the LangGrinch (CVE-2025-68664) vector — is caught before it reaches a vulnerable deserializer, and provider keys live in QuantumVault, not plaintext env, so a stealer or error path (CVE-2025-0330, the supply-chain payload) finds nothing to exfiltrate.
Immutable audit trail. The Audit Black Box records every call, route, and tool invocation in a Merkle-chained, PQ-Sign-timestamped log — the secure replacement for Langfuse-style observability, with no plaintext-key coupling. If a provider, agent, or credential is compromised, the Kill Switch revokes it instantly at L1/L2/L3.

The bottom line

Your AI gateway sits in the most credential-rich, most central path in your stack. When it is unauthenticated, secret-holding, and able to spawn commands, a single CVE becomes a CVSS 10.0 breach. The fix is not the next patch — it is a governed, authenticated, isolated router with DLP, a vaulted secret store, and a tamper-proof audit trail. Switch the risky layer, keep everything else, and the breach stops being yours to own.

LiteLLM LangChain Langfuse CVE-2026-42271 LangGrinch CISA KEV Prompt Injection AI Gateway Secure LLM Router RuntimeAI

Your AI middleware shouldn't be the breach.

RuntimeAI's Secure LLM Router, AI Firewall, MCP Gateway, and Audit Black Box let you switch the risky layer — one sub-product at a time — without ripping out your app. Get the AI Security Weekly briefing for the CVEs that matter.

Start Your Trial Secure LLM Router Docs

LiteLLM Is a CVSS 10.0 Liability. LangChain Leaks Your Secrets.
Langfuse Inherits Both.
Your AI gateway shouldn't be the breach. Here's how to switch — one sub-product at a time.