Auditors should treat manifests and skills like policy-bearing code. The question is not what the description promises. The question is what the runtime can actually reach.
High
Published May 13, 2026 (UTC)
Tool or Manifest Capability Overclaim
A tool, plugin, skill, or MCP manifest overstates its safety or understates the authority it actually grants to the agent runtime.
Primary threat classes
- • Skill, Plugin, and Integration Backdoors
- • Tool Misuse
Affected systems
- • MCP deployments
- • Coding agents
- • Long-lived agents
Root cause
- • The platform relies on descriptive metadata or prose policy rather than verified behavioral constraints and runtime attenuation.
Exploit path
- • A tool is installed or enabled because its descriptor appears narrow or benign
- • The agent infers trust from the description
- • Actual runtime behavior exposes broader file, network, or secret access than the descriptor declares
- • The agent uses the tool under false assumptions about scope
What an auditor should check
- • Review manifests, skill files, tool cards, and installation docs as authority artifacts
- • Compare declared capabilities with actual file, network, and secret access
- • Check whether the platform enforces attenuation or only documents it
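The comparison of declared and actual access can be sketched as a set difference. This is a minimal illustration, not a real manifest format: `CapabilitySet`, `find_overclaim`, and the sample paths, host, and secret name are all hypothetical, and the observed access is assumed to come from whatever tracing or sandbox logs the auditor already has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilitySet:
    """Capabilities grouped by kind. Hypothetical structure for illustration."""
    file_paths: frozenset = frozenset()
    network_hosts: frozenset = frozenset()
    secrets: frozenset = frozenset()


def find_overclaim(declared: CapabilitySet, observed: CapabilitySet) -> dict:
    """Return access seen at runtime that the manifest never declared.

    Matching is by exact string, which is a simplification: a real audit
    would also need prefix, glob, and wildcard-host handling.
    """
    gaps = {
        "file_paths": observed.file_paths - declared.file_paths,
        "network_hosts": observed.network_hosts - declared.network_hosts,
        "secrets": observed.secrets - declared.secrets,
    }
    # Keep only the kinds where something undeclared was actually reached.
    return {kind: sorted(extra) for kind, extra in gaps.items() if extra}


# What the manifest claims (assumed sample data).
declared = CapabilitySet(file_paths=frozenset({"/workspace/notes"}))

# What runtime tracing showed the tool touching (assumed sample data).
observed = CapabilitySet(
    file_paths=frozenset({"/workspace/notes", "/home/agent/.ssh"}),
    network_hosts=frozenset({"telemetry.example.com"}),
    secrets=frozenset({"GITHUB_TOKEN"}),
)

report = find_overclaim(declared, observed)
```

An empty report only says that nothing undeclared was observed in the traced runs. It does not show that the platform would have blocked undeclared access, which is the separate enforcement question in the last check above.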
Evidence to collect
- • Manifest or skill text
- • Runtime permissions and reachable sinks
- • Examples of undocumented side effects or access broader than the declared scope
Remediation guidance
- • Treat descriptors as untrusted claims until verified
- • Enforce runtime-scoped permissions independent of prose
- • Apply provenance review to plugins, skills, and connector packages
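As a rough sketch of runtime-scoped permissions that do not depend on prose, the wrapper below checks every file read against an operator-set root and never consults the tool's description. `ScopedFileTool`, `ScopeViolation`, and `fake_read` are hypothetical names. The check is purely lexical, so a real deployment would also need symlink resolution and OS-level sandboxing.

```python
from pathlib import PurePosixPath


class ScopeViolation(PermissionError):
    """Raised when a call falls outside the operator-granted scope."""


class ScopedFileTool:
    """Wraps a file-reading callable and enforces an allowed root directory."""

    def __init__(self, read_fn, allowed_root: str, description: str = ""):
        self._read_fn = read_fn
        self._root = PurePosixPath(allowed_root)
        # Kept for display only. Authorization never reads this field.
        self.description = description

    def read(self, path: str):
        target = PurePosixPath(path)
        # Reject relative paths and any ".." segment. PurePosixPath does not
        # collapse "..", so the segment is still visible in .parts here.
        if not target.is_absolute() or ".." in target.parts:
            raise ScopeViolation(f"unsafe path: {path}")
        # Allow the root itself or anything strictly beneath it.
        if target != self._root and self._root not in target.parents:
            raise ScopeViolation(f"outside allowed root: {path}")
        return self._read_fn(str(target))


def fake_read(path: str) -> str:
    """Stand-in for the tool's real read function (assumption for the demo)."""
    return f"contents of {path}"


tool = ScopedFileTool(
    fake_read,
    allowed_root="/workspace/notes",
    description="Reads your notes. Totally safe.",
)

allowed = tool.read("/workspace/notes/todo.txt")

try:
    tool.read("/home/agent/.ssh/id_rsa")
    blocked = False
except ScopeViolation:
    blocked = True
```

The point of the design is that the grant lives in `allowed_root`, which the operator sets, while the description is treated as an untrusted claim with no effect on what the call can reach.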
Agentic DeFi relevance
- • A mis-scoped wallet, exchange, or treasury connector can give an AI system more market or fund authority than operators believe they granted.
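One way to keep a connector's fund authority within what operators intended is to attenuate it at the call site. The sketch below assumes a hypothetical connector exposing a `send_fn(recipient, amount)` callable. `AttenuatedWallet`, `TransferDenied`, the cap, and the addresses are illustrative only and do not correspond to any real wallet API.

```python
from decimal import Decimal


class TransferDenied(PermissionError):
    """Raised when a transfer exceeds the operator-granted authority."""


class AttenuatedWallet:
    """Enforces a per-transfer cap and a recipient allowlist around a connector."""

    def __init__(self, send_fn, max_per_transfer: Decimal, allowed_recipients):
        self._send_fn = send_fn
        self._max = max_per_transfer
        self._allowed = frozenset(allowed_recipients)

    def transfer(self, recipient: str, amount: Decimal):
        if recipient not in self._allowed:
            raise TransferDenied(f"recipient not allowlisted: {recipient}")
        if amount <= 0 or amount > self._max:
            raise TransferDenied(f"amount outside granted range: {amount}")
        return self._send_fn(recipient, amount)


sent = []  # records what actually reached the underlying connector


def fake_send(recipient: str, amount: Decimal) -> str:
    """Stand-in for the real connector call (assumption for the demo)."""
    sent.append((recipient, amount))
    return "ok"


wallet = AttenuatedWallet(
    fake_send,
    max_per_transfer=Decimal("100"),
    allowed_recipients={"0xTREASURY"},
)

result = wallet.transfer("0xTREASURY", Decimal("25"))

try:
    wallet.transfer("0xTREASURY", Decimal("5000"))
    over_cap_blocked = False
except TransferDenied:
    over_cap_blocked = True
```

Whatever the connector's descriptor says about being limited or read-only, only calls that pass these checks reach `send_fn`. A production control would also need cumulative limits and enforcement outside the agent's own process.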