OWASP ASI05 Explained: AI Agent RCE Patterns

TL;DR

  • OWASP ASI05 ("Unexpected Code Execution") is item 5 of the OWASP Top 10 for Agentic Applications 2026. It covers attacks where an AI agent executes attacker-influenced code, command-line invocations, or shell calls that the operator never intended to authorise.
  • ASI05 fires through three principal vectors: agent-generated code that runs without review, exec primitives in tool surfaces that take attacker-controllable inputs, and chained tool composition that produces command execution out of individually-safe primitives.
  • The 2025–2026 record contains at least 5 Critical-rated ASI05 CVEs in the MCP ecosystem alone: CVE-2025-49596 (MCP Inspector, 9.4), CVE-2025-6514 (mcp-remote, 9.6), CVE-2025-59528 (Flowise, 10.0), CVE-2026-0755 (gemini-mcp-tool, 9.8), CVE-2026-33032 (nginx-ui MCP, 9.8).
  • ASI05 is the threat class that turns "AI agent" from a productivity tool into "remote shell for the attacker who can influence its prompt context."
  • Mitigation requires structural separation between agent reasoning and code execution: sandboxed exec primitives, explicit human-in-the-loop for code generation that runs, and host-level isolation between the agent process and any privileged subsystem it can reach.

What ASI05 actually says

OWASP ASI05 covers the class of failures where an AI agent ends up executing code, commands, or scripts that the operator did not explicitly authorise. The "unexpected" framing matters: ASI05 is not about agents that have an exec tool and use it as designed — it is about exec paths the operator did not realise were exec paths, or that an attacker can drive the agent into reaching despite the agent being "safe" in normal use.
Three principal failure modes inside the category:
  • Agent-generated code that runs without review. The agent writes Python/JavaScript/shell to "solve" a task and the runtime executes it. Each input the agent processes is potential influence over what gets generated, and most production deployments do not gate generated-code execution on human review.
  • Exec primitives in tool surfaces. Tools that wrap shell commands, subprocess spawn, or interpreted-code execution take attacker-controllable inputs. Classical command injection re-emerges in every poorly-validated wrapper. The April 2026 Anthropic SDK configuration-channel injection is the largest-magnitude example, but the pattern recurs across many connectors.
  • Chained tool composition. Tools that individually do not execute code can compose into execution. A Git hook write tool plus a Git operation tool produces RCE through Git's hook mechanism. A file-write tool plus a tool that triggers a file-watch handler produces execution through the watcher's auto-action. ASI05 explicitly names compound execution paths as in-scope.
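The second failure mode can be illustrated with a minimal Python sketch. Both wrapper functions (`echo_via_shell`, `echo_via_argv`) are hypothetical, `echo` is a harmless stand-in for a real tool binary, and the sketch assumes a POSIX shell is available.

```python
import subprocess


def echo_via_shell(message: str) -> str:
    """Vulnerable: the input is interpolated into a string that a shell parses."""
    result = subprocess.run(
        f"echo {message}", shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout


def echo_via_argv(message: str) -> str:
    """Safer: the input is one argv element and never passes through a shell."""
    result = subprocess.run(
        ["echo", message], capture_output=True, text=True, check=True
    )
    return result.stdout


# Attacker-influenced text, e.g. lifted from a document the agent was asked to read.
payload = "hello; echo INJECTED"
```

With this payload the shell variant runs the second command, while the argv variant prints the payload as literal text. Passing an argument vector removes shell parsing but does not by itself stop option injection, so argument allowlisting is still needed.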

How ASI05 differs from adjacent OWASP items

ASI05 overlaps with several other items in the OWASP Top 10 for Agentic Applications. The cleanest separation:
  • ASI04 (Agentic Supply Chain) is about who shipped the tool. ASI05 is about what executes when the tool runs. Many CVEs fit both — a trojanised connector that produces RCE is ASI04 by origin and ASI05 by impact.
  • ASI02 (Tool Misuse) is about runtime use of intentional tool surfaces. ASI05 is about execution paths the operator did not realise existed. A sandbox escape from a tool that was supposed to be read-only is ASI05; misusing an explicit exec tool is ASI02.
  • ASI01 (Agent Goal Hijack) is about what the agent decides to do. ASI05 is about what the agent's environment lets that decision turn into execution. Goal-hijacking that does not reach an exec primitive is contained at ASI01; goal-hijacking that does is ASI01 + ASI05.

Real-world ASI05 CVEs

The disclosed-incident record from 2025–2026 contains at least five MCP-ecosystem CVEs that fit cleanly inside ASI05, each documented in the MCP Breach Index 2025–2026:

CVE-2025-49596 — MCP Inspector RCE (Critical 9.4)

Anthropic's official MCP Inspector contained an unauthenticated RCE in its proxy architecture. Documented in the MCP Inspector RCE writeup. Impact: compromise of the developer machine for anyone running an Inspector version earlier than 0.14.1.

CVE-2025-6514 — mcp-remote OAuth shell injection (Critical 9.6)

The mcp-remote npm package handled OAuth flows for remote MCP servers. A crafted authorization_endpoint value triggered OS command execution during the connection handshake. Pure ASI05: classical command injection in a path the operator did not realise was an exec path.
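One way to shrink this class of bug is to treat server-supplied OAuth metadata as untrusted and validate it before it gets near any OS-level handler. The sketch below is an illustration only, not the mcp-remote fix: `validate_authorization_endpoint`, the HTTPS-only rule, and the rejected character set are all assumptions.

```python
from urllib.parse import urlsplit

# Assumption: this deployment only talks to HTTPS identity providers.
ALLOWED_SCHEMES = {"https"}
# Assumption: characters a shell might interpret have no place in this URL.
SUSPICIOUS_CHARS = set("$`|;<>(){}\\\"'")


def validate_authorization_endpoint(value: str) -> str:
    """Return the URL unchanged if it looks like a plain HTTPS endpoint, else raise ValueError."""
    if any(ch.isspace() or ord(ch) < 0x20 for ch in value):
        raise ValueError("whitespace or control characters in endpoint")
    parts = urlsplit(value)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("endpoint has no host")
    if SUSPICIOUS_CHARS & set(value):
        raise ValueError("shell metacharacters in endpoint")
    return value
```

Validation of this kind is a second line of defence. The primary fix is to make sure the value is never interpolated into a command string in the first place.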

CVE-2025-59528 — Flowise CustomMCP node (Critical 10.0)

The CustomMCP node in Flowise 3.0.5 used the JavaScript Function() constructor over unsanitised input, producing direct RCE. The maximum-severity rating (10.0) reflects low attack complexity, no authentication requirement, and full impact.
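The same pattern exists in Python, where `eval()` plays the role of `Function()`. The sketch below is a Python analogue of the anti-pattern and of a data-only alternative; it is not Flowise's code, and both function names are hypothetical.

```python
import json


def parse_config_unsafe(text: str):
    """Anti-pattern: eval() runs arbitrary expressions, much like Function() in JavaScript."""
    return eval(text)  # shown only to name the anti-pattern; never use on untrusted input


def parse_config_safe(text: str) -> dict:
    """Parse configuration as data, never as code."""
    value = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on non-JSON input
    if not isinstance(value, dict):
        raise ValueError("configuration must be a JSON object")
    return value
```

A payload such as `__import__('os').getcwd()` is executed by the first function and rejected with a `ValueError` by the second, because it is not valid JSON.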

CVE-2026-0755 — gemini-mcp-tool command injection (Critical 9.8)

execAsync in the gemini-mcp-tool package passed user-supplied input directly into a shell command. Look-alike-package squatting amplified the exposure: a malicious clone could distribute the same primitive to typo-prone installers.

CVE-2026-33032 — nginx-ui MCP endpoint (Critical 9.8)

The nginx-ui MCP endpoint executed commands without authentication. Auth-bypass + exec primitive in one component is the cleanest possible ASI05 finding.
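The missing control here is an authentication check that runs before any tool dispatch. A minimal sketch, assuming a bearer-token scheme; `handle_tool_call`, the header layout, and the response shape are hypothetical and are not taken from nginx-ui.

```python
import hmac
from typing import Callable, Optional


def is_authorized(presented: Optional[str], expected: str) -> bool:
    """Constant-time comparison; a missing or empty token is never authorized."""
    if not presented or not expected:
        return False
    return hmac.compare_digest(presented.encode(), expected.encode())


def handle_tool_call(headers: dict, expected_token: str, run_tool: Callable[[], object]) -> dict:
    """Refuse to reach the tool, and any exec primitive behind it, without a valid token."""
    auth = headers.get("Authorization", "")
    token = auth[len("Bearer "):] if auth.startswith("Bearer ") else None
    if not is_authorized(token, expected_token):
        return {"status": 401, "body": "unauthorized"}
    return {"status": 200, "body": run_tool()}
```

The point of the structure is ordering: `run_tool` is only ever called after the token check has passed.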

April 2026 Anthropic SDK design flaw

Multiple CVEs (CVE-2025-65720, CVE-2026-30615, -30617, -30618, -30623, -30624, -30625, -33224, -26015) are all instances of configuration-channel injection in the official MCP SDKs, which Anthropic declared to be by design. Detailed in the April 2026 Anthropic MCP SDK writeup.

Why agentic systems amplify ASI05 risk

Three properties make agentic systems structurally more prone to unexpected code execution than classical software:
  • Agents reason about exec. A traditional program either calls exec or it doesn't, deterministically. An agent's reasoning step can decide to call exec (including in cases the operator did not explicitly authorise) based on prompt context that may include adversarial input. Every exec primitive the agent can reach is potentially reachable by any successful prompt injection.
  • Tool surfaces grow over time. A production agent typically connects multiple MCP servers, plugins, and runtime tools. Each connection extends the tool surface. The operator who approved the surface at install time rarely re-audits it after every update. New exec primitives can enter the surface without explicit operator review.
  • Compositional execution is invisible. The agent can chain tools across servers as part of a single planning step. Composition that produces execution is rarely modelled by the host. A Git hook write combined with a Git operation produces execution through a path neither tool individually authorised.
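The tool-surface drift described above can be made visible by fingerprinting each tool definition at approval time and diffing on every update. This is a sketch under assumptions: the tool dictionaries loosely follow the name/description/inputSchema shape of an MCP tool listing, and `fingerprint` and `surface_drift` are hypothetical helpers.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Stable hash of a tool definition (name, description, input schema)."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def surface_drift(approved: dict, current_tools: list) -> dict:
    """Compare the tools advertised now against the fingerprints approved earlier."""
    current = {tool["name"]: fingerprint(tool) for tool in current_tools}
    return {
        "added": sorted(set(current) - set(approved)),
        "removed": sorted(set(approved) - set(current)),
        "changed": sorted(
            name for name in current.keys() & approved.keys()
            if current[name] != approved[name]
        ),
    }
```

Any non-empty `added` or `changed` entry would be a reason to re-run the operator review before the agent is allowed to call the tool again.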

Detection and mitigation

Defending against ASI05 requires structural separation, not just input validation. The four operational controls below cover the disclosed-CVE record:

1. Sandbox every exec primitive

Every tool surface that calls a shell, spawns a subprocess, or invokes an interpreter (Function(), eval, exec()) must run in a sandbox at the kernel level — containers, seccomp, AppArmor/SELinux confinement, or platform-equivalent isolation. The goal: even successful execution against the sandbox does not compromise the host's credentials, network, or filesystem outside the sandbox.
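The following sketch shows the process-level hygiene that usually sits inside such a sandbox: no inherited environment, a scratch working directory, a wall-clock timeout, and a CPU limit. It is not a kernel-level sandbox by itself, it assumes a POSIX host, and `run_confined` is a hypothetical helper.

```python
import resource
import subprocess
import sys
import tempfile


def run_confined(code: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run Python source in a child with a scrubbed env, a scratch cwd, and a CPU cap."""

    def limit_cpu() -> None:
        # Runs in the child just before exec (POSIX only): cap CPU time at 2 seconds.
        resource.setrlimit(resource.RLIMIT_CPU, (2, 2))

    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores PYTHON* variables
            env={},               # do not pass the agent's API keys or tokens to the child
            cwd=scratch,          # the child starts in an empty temporary directory
            capture_output=True,
            text=True,
            timeout=timeout_s,
            preexec_fn=limit_cpu,
        )
```

The child can still read the host filesystem and open network connections, which is why containers, seccomp, or AppArmor/SELinux profiles are needed around it.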

2. Allowlist arguments before exec

Every input that flows into a spawn or shell call must be parsed against a strict allowlist before reaching the call. The allowlist defines exact executable paths, exact argument shapes, and exact environment-variable inheritance rules. Anything outside the allowlist is rejected before the exec primitive is reached.
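A minimal sketch of such an allowlist, assuming a policy keyed by exact executable path with one regular expression per argument position. The policy contents, `ALLOWED_ENV`, and `build_invocation` are hypothetical examples, not a recommended production policy.

```python
import re

# Hypothetical policy: exact executable path -> one regex per argument position.
ALLOWLIST = {
    "/usr/bin/git": (
        re.compile(r"log|status|diff"),
        re.compile(r"[A-Za-z0-9_][A-Za-z0-9_./-]*"),  # a ref or path; may not start with "-"
    ),
}
# Only these environment variables are passed on to the child process.
ALLOWED_ENV = ("PATH", "LANG")


def build_invocation(executable: str, args: list, env: dict):
    """Return (argv, env) for an allowed call, or raise PermissionError."""
    shapes = ALLOWLIST.get(executable)
    if shapes is None:
        raise PermissionError(f"executable not allowlisted: {executable}")
    if len(args) != len(shapes):
        raise PermissionError("unexpected number of arguments")
    for arg, shape in zip(args, shapes):
        if not shape.fullmatch(arg):
            raise PermissionError(f"argument rejected: {arg!r}")
    return [executable, *args], {key: env[key] for key in ALLOWED_ENV if key in env}
```

The returned argv is meant to be passed to a spawn call without a shell, so the allowlist and the absence of shell parsing reinforce each other.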

3. Human-in-the-loop for agent-generated code execution

Code the agent generates and intends to run should not auto-execute. Surface it to the user with the source, the inputs that produced it, and the expected effect. Wait for explicit approval. The cost is friction; the benefit is that prompt injection cannot drive the agent to execute attacker-chosen code without a human checkpoint.
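The review gate can be sketched as a function that refuses to execute unless an approval callback returns `True`. `ExecutionRequest`, `run_with_approval`, and the callback signatures are hypothetical; in a real host the `approve` callback would be a UI prompt and `execute` a sandboxed runner.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ExecutionRequest:
    source: str            # the generated code, verbatim
    influenced_by: tuple   # inputs that shaped it (documents, tool outputs, URLs)
    expected_effect: str   # the agent's own description of what running it will do


def run_with_approval(
    request: ExecutionRequest,
    approve: Callable[[ExecutionRequest], bool],
    execute: Callable[[str], object],
):
    """Execute generated code only after an explicit, affirmative human decision."""
    if approve(request) is not True:
        raise PermissionError("generated code was not approved")
    return execute(request.source)
```

Requiring the literal value `True` means a missing answer, a timeout that returns `None`, or any other truthy-looking value fails closed.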

4. Composition policy at the host level

The host should explicitly model which tool combinations are allowed in a single agent step. Tools that touch external content should not auto-compose with tools that have exec primitives. Cross-trust-boundary chains (untrusted-input → exec) should require explicit confirmation. The Git hook + Git operation chain would be blocked by a composition policy that flags any combination of "tool that can write hook directories" + "tool that triggers Git operations".
For Web3 and DeFi teams specifically, an additional rule applies: agents that hold transaction-signing authority or wallet access must run in a process boundary distinct from any tool surface that can reach an exec primitive. The composition is too dangerous to gate with prompt-level controls.
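The composition policy described in this section can be sketched as capability labels on tools plus a list of capability pairs that must not meet in one agent step. The tool names, labels, and `violations` function are hypothetical; a real host would derive the labels from its own tool registry.

```python
from typing import Iterable, List, Tuple

# Hypothetical capability labels attached to each tool at registration time.
TOOL_CAPABILITIES = {
    "fetch_url": {"untrusted_input"},
    "write_file": {"writes_hook_dirs"},
    "git_commit": {"triggers_git"},
    "run_shell": {"exec"},
    "read_calendar": set(),
}

# Capability pairs that may not appear in one agent step without explicit confirmation.
BLOCKED_PAIRS = [
    ("untrusted_input", "exec"),
    ("writes_hook_dirs", "triggers_git"),
]


def violations(step_tools: Iterable[str]) -> List[Tuple[str, str]]:
    """Return every blocked capability pair present in a single planned agent step."""
    present = set()
    for name in step_tools:
        if name not in TOOL_CAPABILITIES:
            # Fail closed: an unlabelled tool cannot be reasoned about.
            raise ValueError(f"tool has no capability labels: {name}")
        present |= TOOL_CAPABILITIES[name]
    return [pair for pair in BLOCKED_PAIRS if pair[0] in present and pair[1] in present]
```

A non-empty result would pause the step and ask the user for confirmation, which is how the Git hook + Git operation chain gets caught even though neither tool is an exec tool on its own.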


How Zealynx audits for ASI05

A Zealynx MCP Security Audit treats ASI05 as a structural-isolation audit. The five focused tests:
  1. Exec-primitive enumeration. For each connected tool, identify every path that could reach a shell, subprocess spawn, or interpreted-code execution. Most teams have never produced this list.
  2. Sandbox effectiveness test. For each identified exec primitive, verify it runs in a kernel-level sandbox. Compromise of the sandbox should not compromise the host.
  3. Argument-validation review. Trace every input that reaches an exec primitive through whatever validation layers exist. Flag any path where unvalidated input can influence the executed command.
  4. Generated-code review-gate test. Verify that agent-generated code does not auto-execute. Where it does, the generation flow is a confirmed ASI05 finding regardless of any other controls.
  5. Composition-policy verification. Test which tool combinations the host actually allows. Flag any combination of (untrusted-input source) + (exec primitive) that is not gated by explicit confirmation.
The deliverable maps each finding to ASI05 (and ASI04 / ASI02 where relevant) with prioritised remediation guidance.
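As an illustration of the first test, exec-primitive enumeration for Python-based tools can start with a syntactic scan. This is a rough first-pass sketch and not a description of Zealynx's tooling: the name lists are incomplete by design, and it misses aliased imports, dynamic dispatch, and anything not written in Python.

```python
import ast

# Assumption: a starter list of call names worth flagging, not an exhaustive one.
EXEC_NAMES = {"eval", "exec", "compile", "__import__"}
EXEC_ATTRS = {
    ("os", "system"), ("os", "popen"),
    ("subprocess", "run"), ("subprocess", "Popen"),
    ("subprocess", "call"), ("subprocess", "check_output"),
}


def find_exec_primitives(source: str, filename: str = "<tool>") -> list:
    """Return sorted (line, call) pairs for calls that look like exec primitives."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in EXEC_NAMES:
            findings.append((node.lineno, func.id))
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in EXEC_ATTRS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)
```

Each finding is a starting point for the argument-validation review: the next question is which inputs can reach that call.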

FAQ

1. What is OWASP ASI05 in one sentence?
OWASP ASI05 (Unexpected Code Execution) is item 5 of the OWASP Top 10 for Agentic Applications, covering attacks where an AI agent executes attacker-influenced code, command-line invocations, or shell calls that the operator never intended to authorise — through agent-generated code that runs without review, exec primitives in tool surfaces that take attacker-controllable inputs, or chained tool composition that produces command execution out of individually-safe primitives.
2. What real-world CVEs fit OWASP ASI05?
At least 5 Critical-rated MCP CVEs fit ASI05: CVE-2025-49596 (MCP Inspector unauthenticated RCE, CVSS 9.4), CVE-2025-6514 (mcp-remote OAuth shell injection, CVSS 9.6), CVE-2025-59528 (Flowise CustomMCP Function() over unsanitised input, CVSS 10.0), CVE-2026-0755 (gemini-mcp-tool command injection, CVSS 9.8), and CVE-2026-33032 (nginx-ui MCP auth-bypass to RCE, CVSS 9.8). The April 2026 Anthropic SDK configuration-channel injection cluster also fits.
3. How is ASI05 different from ASI04?
ASI04 (Agentic Supply Chain) is about who shipped the tool — provenance, signing, registry trust. ASI05 (Unexpected Code Execution) is about what executes when the tool runs — exec primitives, generated-code execution paths, composition-driven execution. Many CVEs fit both: a trojanised connector that produces RCE is ASI04 by origin and ASI05 by impact. The fixes differ — ASI04 requires supply-chain hygiene, ASI05 requires structural isolation.
4. Why are agentic systems more prone to RCE than classical software?
Agentic systems are more prone to RCE because the agent's reasoning step can decide to call exec primitives based on prompt context that may include adversarial input — every exec primitive the agent can reach is potentially reachable by any successful prompt injection. Tool surfaces grow over time as more connectors are added without re-audit. And tool composition can produce execution through paths neither individual tool authorised, making the exec attack surface compositional rather than enumerable.
5. How do I prevent ASI05 in my agent deployment?
Sandbox every exec primitive at the kernel level (containers, seccomp, AppArmor/SELinux), allowlist arguments before any spawn or shell call, require human-in-the-loop confirmation for agent-generated code execution, and enforce a host-level composition policy that blocks combinations of (untrusted-input source) + (exec primitive) without explicit user confirmation. For Web3 deployments, add the unconditional rule that agents holding signing authority must run in a process boundary distinct from any tool surface that can reach an exec primitive.
6. What is "agent-generated code execution"?
Agent-generated code execution is the pattern where an AI agent writes code (Python, JavaScript, shell, SQL) to "solve" a task and the agent runtime executes it without human review. Every input the agent processes — including documents, tool outputs, web pages — can influence what gets generated. Without a human-in-the-loop checkpoint between generation and execution, prompt injection in any input can drive the agent to execute attacker-chosen code, with the host's full authority.
7. Are MCP server CVEs always ASI05 findings?
Not always — many MCP CVEs are credential-leak, supply-chain, or access-control findings that don't involve unexpected code execution. But a substantial subset do: any CVE where the impact section reads "remote code execution," "command injection," "shell injection," or "RCE" is an ASI05 finding regardless of how the attacker reached the exec primitive. The 5 Critical-rated MCP CVEs cited above are all ASI05; a much larger set of High-rated CVEs also fit.
8. How does Zealynx audit for ASI05?
Zealynx's MCP Security Audit tests for ASI05 across five dimensions: exec-primitive enumeration (mapping every path that reaches a shell, subprocess, or interpreted-code execution), sandbox effectiveness (verifying kernel-level isolation around each exec primitive), argument-validation review (tracing every input that reaches exec through validation layers), generated-code review-gate verification (confirming agent-generated code does not auto-execute), and composition-policy testing (verifying the host blocks unsafe (untrusted-input + exec) combinations).

Glossary

Term | Definition
Exec Primitive | Any path inside an AI agent's tool surface that reaches a shell call, subprocess spawn, or interpreted-code execution (Function(), eval, exec()), including paths the operator did not explicitly model as exec.
Agent-Generated Code | Code (Python, JavaScript, shell, SQL) that an AI agent writes during a task and that the runtime can execute, with or without human review. The principal vector for OWASP ASI05 in agents that have not implemented a generation-to-execution review gate.
Sandbox Escape (Agentic) | An attack where code or commands intended to run inside a constrained sandbox (container, seccomp profile, restricted directory) reach execution outside the constraint, exfiltrating credentials, modifying host files, or pivoting to privileged subsystems.
