Memory injection
An attack where a malicious instruction is written into an AI agent's persistent memory store, causing it to survive across sessions and execute later as if it were the agent's own trusted context.
Memory injection is an attack technique targeting AI agents that maintain a persistent memory store across sessions. Rather than injecting a malicious instruction into the agent's current input — which is prompt injection — memory injection plants the instruction into the agent's stored memory, where it is later retrieved and treated as the agent's own prior knowledge rather than as an external command.
How it works
Modern AI agents persist context across sessions using vector databases, key-value stores, or structured conversation histories. This memory is fetched at the start of each new session or task and incorporated into the agent's context window alongside system instructions. Because the agent treats its own memory as trusted, pre-authenticated context, it does not subject retrieved memory to the same scrutiny applied to live user inputs.
The attack sequence:
- Plant: The attacker causes a malicious instruction to be written to the agent's memory store. This can happen through a direct message the agent stores, a tool result the agent caches, or any output from an external source the agent records.
- Survive: The instruction persists across session boundaries. The original attacker interaction may be long over.
- Trigger: On a subsequent session — potentially on a different platform — the agent retrieves the planted instruction as part of its context and executes it as if it were a legitimate prior directive.
Why it is more dangerous than prompt injection
Prompt injection attacks are bounded by session scope: the malicious instruction must be present in the current input and can be filtered at the input layer. Memory injection defeats this defense because:
- Persistence: The attack survives session resets, including restarts and re-deployments that clear runtime state
- Deferred execution: The instruction fires at a time and platform of the attacker's choosing, not at the moment of injection
- Cross-platform propagation: An instruction planted via a Discord message can trigger during an Ethereum transaction, a pattern demonstrated on live ElizaOS deployments in the Real AI Agents with Fake Memories research (2025)
- Bypass of input filters: Input sanitization checks the current turn's messages, not the stored memory the agent loads at session start
Research from 2025 (CrAIBench, 500+ test cases across 150+ blockchain tasks) found that LLM agents are significantly more vulnerable to memory injection than to prompt injection, and that prompt-based defenses — including well-crafted system prompts — are fundamentally inadequate because injected memory bypasses them entirely.
Impact in agentic DeFi
In the context of agentic DeFi, where AI agents hold signing authority over wallets or control session keys for treasury management and trading, memory injection represents a path to unauthorized transaction execution that does not require exploiting any smart contract code. The agent faithfully executes the transaction it was instructed to execute — the corruption happened earlier, at the memory layer, and the blockchain records a valid, signed transaction.
The Grok and Bankr drain of May 2026 combined privilege escalation via an inbound NFT with an encoded instruction bypass — a combination attack on multiple agent surfaces in sequence.
Defense
- Memory authentication: Cryptographically sign stored memory entries so the agent can detect tampering before loading them
- Memory segmentation: Isolate memory written during high-privilege sessions from memory written during untrusted-input sessions
- Scope-limited recall: Restrict which memory categories are loaded for which task types — a trading agent should not load memory from social media ingestion sessions
- Anomaly detection on retrieval: Flag memory entries that contain instruction-like patterns and quarantine them for human review before execution
- Architectural separation: Where possible, keep the instruction-generating LLM separate from the execution layer; the execution layer should not retrieve and act on natural-language memory without deterministic validation
Related reading
Articles Using This Term
Learn more about Memory injection in these articles:
Related Terms
Prompt Injection
Attack technique manipulating AI system inputs to bypass safety controls or extract unauthorized information.
Indirect Prompt Injection
Attack class where adversarial instructions are hidden inside external content (READMEs, tool descriptions, RPC responses, social media replies) that an AI agent ingests during normal operation, causing it to execute attacker-chosen actions without the user issuing the command.
AI Agent
Autonomous software system powered by a large language model that can perceive, reason, and execute actions — including signing blockchain transactions — without continuous human oversight.
Agentic AI
AI systems that autonomously take actions in the real world, including executing commands, managing files, and interacting with external services.
Off-Chain Injection
Pattern where compliance servers sign permission certificates off-chain that users pass into smart contracts for on-chain validation.
Need expert guidance on Memory injection?
Our team at Zealynx has deep expertise in blockchain security and DeFi protocols. Whether you need an audit or consultation, we're here to help.
Get a Quote