RAG Poisoning

An attack where adversarial content is placed into a retrieval-augmented generation corpus so future queries retrieving keyword-matching documents pull in the attacker's content; the retrieved content carries the same authority as any other retrieved document unless the runtime distinguishes provenance.

RAG Poisoning is an attack where adversarial content is placed into a retrieval-augmented generation (RAG) corpus so that future queries retrieving keyword-matching documents pull in the attacker's content as part of the LLM's context. The retrieved content carries the same authority as any other retrieved document because the LLM does not distinguish between attacker-authored and operator-authored sources unless the runtime explicitly enforces provenance. RAG poisoning is the dominant vector inside OWASP ASI06 (Memory and Context Poisoning).

The attack works because the standard RAG pipeline — embed query, retrieve top-k documents, prepend retrieved content to the prompt — treats every document in the corpus as equally authoritative once it has been ingested. An attacker who can write to the corpus (through document upload, public-ingest endpoints, write-access compromise, or insider access) places adversarial content that will be retrieved on future queries containing the targeted keywords. Every subsequent query that retrieves the poisoned document inherits the attacker's instructions.

Why RAG Poisoning Is Persistent and Hard to Detect

RAG poisoning differs from session-bound indirect prompt injection in two key ways. It is persistent. The poisoned document lives in the corpus until it is detected and removed. Every retrieval that matches its embedding pulls it in for as long as it remains. It scales automatically across users. Every user querying the same RAG corpus is exposed to the same poison without further attacker action.

Detection is hard because the corpus is supposed to contain user-uploaded or externally-ingested content. The poisoned document looks like every other document. Without ingestion-time provenance metadata or runtime adversarial-content scanning, there is no signal that distinguishes the poison from legitimate content.

Defensive Patterns

Effective RAG poisoning defence operates at three layers. Ingestion-time provenance records who authored each document, when, and with what authority. Retrieval-time filtering weights or filters results by provenance, refusing to incorporate content from unverified sources for high-stakes queries. Cross-session corpus audits periodically scan the corpus for adversarial content (instruction-shaped tokens, recent-write spikes, suspicious authority claims) and quarantine matches for human review.

For Web3 deployments specifically, RAG corpora that influence transaction advice (verified contracts, approved tokens, audited DEX routes) should be treated as high-stakes — every retrieval-influenced transaction should require explicit human confirmation regardless of what the corpus advises. For deeper guidance, see the OWASP ASI06 explainer.

Articles Using This Term

Learn more about RAG Poisoning in these articles:

OWASP ASI06 Explained: AI Memory & Context Poisoning

OWASP ASI06 (Memory and Context Poisoning) explained: RAG corruption, vector store attacks, persistent context bias. How to defend AI agent memory layers.

Jun 16, 2026•11 min read

→

Related Terms

Memory Poisoning

An attack where adversaries corrupt entries in an AI agent's persistent memory store (preferences, summaries, learned facts) to bias future reasoning across sessions. The corruption persists until detected, biasing every retrieval that touches the poisoned entries.

Context-Window Saturation

An attack where adversarial content with high relevance and high volume displaces legitimate instructions or system prompts from the agent's finite context window, reducing model adherence and increasing susceptibility to subsequent injection.

Indirect Prompt Injection

Attack class where adversarial instructions are hidden inside external content (READMEs, tool descriptions, RPC responses, social media replies) that an AI agent ingests during normal operation, causing it to execute attacker-chosen actions without the user issuing the command.

Agent Goal Hijack

The threat class OWASP ASI01 covers: any attack that redirects an AI agent's current task or planning objective through adversarial content in the prompt context, regardless of which input channel the content arrives through.

Need expert guidance on RAG Poisoning?

Our team at Zealynx has deep expertise in blockchain security and DeFi protocols. Whether you need an audit or consultation, we're here to help.

Get a Quote