DEV Community
LiteLLM Vulnerability Chain Enables Full AI Gateway Takeover from Default Account
TL;DR

- what: Three chained vulnerabilities in the LiteLLM AI gateway allow default low-privilege users to bypass authorization, escalate to admin, and execute arbitrary code on the server.
- impact: Full compromise exposes every provider API key (OpenAI, Anthropic, Azure, etc.), database credentials, decryption secrets, and all prompts and responses passing through the gateway.
- fix: Upgrade immediately to LiteLLM v1.83.14-stable or later, which includes complete fixes for CVE-2026-47101, CVE-2026-47102, and CVE-2026-40217.
- who: Any organization running the LiteLLM proxy to broker AI model access, especially those with internal users or agents routing through the gateway.

A critical vulnerability chain in LiteLLM, a widely deployed open-source AI gateway, allows attackers starting from a default low-privilege account to achieve full server takeover and code execution. Obsidian Security researchers disclosed the three-bug chain rated CVSS 9.9, with maintainer BerriAI shipping complete fixes in version 1.83.14-stable on May 2, 2026.

LiteLLM brokers API calls to more than 100 AI model providers (OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, and others) behind a single OpenAI-compatible interface. Organizations deploy it as a central gateway to manage costs, enforce policies, and route requests across multiple backends. That centralized position makes it a high-value target: a compromised proxy exposes every provider key it holds, the secrets that decrypt stored credentials, and every prompt and response flowing through it.

The Three-Link Chain

The attack begins with CVE-2026-47101, an authorization bypass. When a regular internal_user creates a virtual API key, LiteLLM stores the caller-supplied allowed_routes field without validating it against the user's role. This field is intended to restrict what the key can access, but the proxy also treats it as a fallback authorization grant.
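
The missing check on allowed_routes can be illustrated with a minimal sketch. Everything named here is a hypothetical assumption made for illustration: the ROLE_ROUTES table, the route lists, and the validate_allowed_routes helper are not LiteLLM's code, and the article does not describe how the actual patches implement the check. The idea is only that a caller-supplied route list should be compared against what the caller's own role permits before it is stored.

```python
from fnmatch import fnmatchcase

# Hypothetical role-to-route policy. These roles and routes are illustrative
# assumptions, not LiteLLM's actual permission tables.
ROLE_ROUTES = {
    "internal_user": ["/chat/completions", "/key/generate", "/key/info"],
    "proxy_admin": ["/*"],
}


def validate_allowed_routes(role: str, requested: list[str]) -> list[str]:
    """Return the requested routes only if the role's policy covers each one."""
    permitted = ROLE_ROUTES.get(role, [])
    for route in requested:
        # The requested entry is treated as a literal name and matched against
        # the role's permitted patterns, so a low-privilege caller cannot
        # smuggle in a wildcard that is broader than its own policy.
        if not any(fnmatchcase(route, pattern) for pattern in permitted):
            raise PermissionError(f"{role} may not grant {route!r}")
    return requested


if __name__ == "__main__":
    print(validate_allowed_routes("internal_user", ["/key/info"]))
    try:
        validate_allowed_routes("internal_user", ["/*"])
    except PermissionError as exc:
        print("rejected:", exc)
```

Rejecting the request outright, rather than silently trimming the list, keeps the failure visible to both the caller and the audit log. A real deployment would also need the same check on every endpoint that writes this field, since the article notes the unchecked write appeared in several places.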
An attacker can mint a key with allowed_routes: ["/*"], a wildcard that opens every route, including admin-only endpoints. The unchecked write appears across multiple key-management endpoints, requiring three separate pull requests to eliminate.

With route restrictions bypassed, the attacker reaches handlers that assume authorization has already been enforced upstream. CVE-2026-47102 exploits the /user/update endpoint, which allows users to edit their own records but does not restrict which fields they can modify. Sending a self-update request with user_role: "proxy_admin" promotes the caller to full proxy admin. VulnCheck scores this privilege escalation 8.7 under CVSS 4.0 and 8.8 under CVSS 3.1. An org_admin can reach this endpoint through legitimate code paths; a default internal_user reaches it only after chaining through the first vulnerability.

The final link is CVE-2026-40217, a sandbox escape in LiteLLM's Custom Code Guardrail feature, which compiles and runs admin-supplied Python to enforce safety policies. Production endpoints executed this code through exec() with no source-level filtering. When exec() receives a globals dictionary without __builtins__, Python silently injects the full builtins module, handing the code access to __import__, open, and eval. A simple payload calling os.system is enough to pop a reverse shell. A separate vulnerability path on the /guardrails/test_custom_code playground endpoint, found independently by X41 D-Sec, defeated a regex deny-list through runtime bytecode rewriting. Both paths deliver server-side code execution.

⚠️ Blast Radius: A compromised LiteLLM proxy exposes the master key, the salt key that decrypts stored credentials, the database URL, and every configured provider key. Keys stored in config files or environment variables are plaintext; keys in the database are encrypted but recoverable with the salt key.
Every prompt, response, PII snippet, code fragment, internal ticket, and pasted secret that passed through the gateway becomes readable.

Response Injection: The Sharper Edge

The greater risk is not what an attacker reads but what they can rewrite. LiteLLM sits on the wire between AI agents and the model backend, so a compromise allows silent manipulation of responses in transit. Obsidian demonstrated this against Claude Code routed through a compromised proxy. This is not prompt injection: it does not persuade the model to misbehave. Instead, the attacker uses LiteLLM's built-in callback mechanism, an extension point that fires on every request and never appears in the admin UI. The callback swaps the model's genuine response for a forged tool call and rewrites the safety-check context so the action appears pre-approved.

In Obsidian's demonstration, a developer types a single word, hello, and the attacker delivers a reverse shell on the developer's machine. The model never sees the malicious command; the proxy injects it downstream. The developer sees reassuring context that suggests the action was verified. This attack surface is unique to gateways that mediate agentic workflows, where models call tools and execute code based on their own responses.

Beyond the Chain: Design-Level Risk

Separate from the patched chain, LiteLLM grants proxy_admin users an intentional code-execution path through its Model Context Protocol (MCP) support. Admins can register stdio MCP servers, which the proxy launches as local subprocesses. This is a design trade-off, not a bug, and the patches do not change it. Reaching admin effectively means reaching code execution. Obsidian reproduced a reverse shell on v1.88.0 using this mechanism.
A related vulnerability in the same stdio-MCP machinery, CVE-2026-42271, allowed callers to spawn subprocesses through LiteLLM's MCP preview endpoints; it was exploited in the wild and added to CISA's Known Exploited Vulnerabilities catalog earlier this month.

Context: LiteLLM's Turbulent 2026

This disclosure is the latest in a difficult year for LiteLLM. In March 2026, a supply-chain compromise backdoored two LiteLLM releases on PyPI. In April, a critical SQL injection vulnerability was exploited within 36 hours of public disclosure. Obsidian frames the current chain as a disclosed flaw with a working proof-of-concept, not as in-the-wild exploitation, but the proxy's central position in AI infrastructure continues to make it an attractive target.

Remediation Steps

Upgrade immediately to LiteLLM v1.83.14-stable or later. This is the first release containing the complete fix set for all three vulnerabilities. GitHub lists the release date as May 2, 2026. After upgrading, conduct a full audit of your deployment.

- Re-verify every account holding the proxy_admin role and treat that role as equivalent to host-level access.
- Review every Custom Code Guardrail configured on the proxy for unexpected or malicious logic.
- Check the callbacks loaded from config.yaml under litellm_settings.callbacks; these never appear in the web console and are exactly where a post-compromise attacker would inject persistence.
- Verify the integrity of the deployed code, not just configuration files.
- If you suspect exposure, rotate all provider API keys, database credentials, and any stored MCP or OAuth tokens.

Root Cause: Misplaced Trust

The chain succeeds because of misplaced trust at every layer. The route gate trusted a caller-supplied field to define its own authorization scope. Downstream handlers trusted that the route gate had already enforced access control. The sandbox trusted Python's exec() to isolate untrusted code without providing a proper restricted environment.
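
The exec() behavior behind CVE-2026-40217 can be reproduced in isolation with a few lines of standard Python. This is a minimal sketch of the documented language behavior, not LiteLLM's guardrail code: the untrusted_source string and the variable names are assumptions chosen for illustration, and the payload only reads the current working directory.

```python
# Stand-in for admin-supplied guardrail code (hypothetical, deliberately benign).
untrusted_source = "result = __import__('os').getcwd()"

# Case 1: the globals dict has no "__builtins__" key. exec() inserts a
# reference to the builtins module, so __import__ resolves and the code runs.
naive_globals = {}
exec(untrusted_source, naive_globals)
injected = "__builtins__" in naive_globals
ran_import = isinstance(naive_globals["result"], str)

# Case 2: the globals dict supplies an explicitly empty "__builtins__".
# Name lookup for __import__ now fails with NameError.
restricted_globals = {"__builtins__": {}}
try:
    exec(untrusted_source, restricted_globals)
    blocked = False
except NameError:
    blocked = True

print(injected, ran_import, blocked)
```

An empty __builtins__ mapping stops this particular payload, but it should not be read as a working sandbox. Python objects reachable from the executed code can still be used to recover dangerous functionality, which is consistent with the article's account of a deny-list being defeated on the playground endpoint. Isolating untrusted code generally calls for a process or OS-level boundary rather than a restricted globals dictionary.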
No layer validated the assumptions made by the layer above it. This is a textbook defense-in-depth failure: when the outermost control is bypassed, nothing behind it offers resistance.

AI gateways occupy a uniquely sensitive position in modern infrastructure. They hold the keys to every model provider, see every user interaction with those models, and, in agentic workflows, mediate the responses that drive code execution and tool use. A compromise does not just leak data; it allows silent, persistent manipulation of the intelligence layer that organizations are beginning to trust with autonomous decisions. The LiteLLM chain is a reminder that infrastructure built to manage AI risk can itself become the highest-risk component in the stack.

Originally published on RedEye Threat Intelligence.