

How to Harden an MCP Server Before It Becomes a Master Key to Your Infrastructure
MCP is being called the "USB-C of AI agents." That analogy is accurate in ways the community hasn't fully internalized yet — including the part where a single universal port gives attackers a single universal entry point.
The Model Context Protocol standardizes how LLMs talk to external tools: databases, APIs, file systems, code execution environments. One interface, any backend. The productivity gains are real. So is the problem: you've now placed a non-deterministic reasoning engine between your users and your most privileged infrastructure, connected by a protocol that most teams are deploying with the security posture of a weekend hackathon project.
This guide breaks down what's actually going wrong in MCP deployments, why traditional API security assumptions don't transfer, and what to implement at each layer — identity, transport, runtime, and observability — to close the gaps.
If you're looking for a quick-reference version, see our MCP security checklist: 24 critical checks for AI agents or the interactive MCP checklist.
What "secure" looks like in a hardened MCP deployment
A hardened MCP deployment enforces these properties simultaneously:
- Every tool invocation is treated as an untrusted transaction requiring cryptographic proof
- Identity is verified at every node in the execution chain, not just at the perimeter
- Untrusted data retrieved by tools is sanitized deterministically before it enters the context window
- Execution environments are immutable, minimal, and sandboxed at the kernel level
- Every action is traceable end-to-end through structured, correlated audit logs
If any one of these is missing, you have a vulnerability. Most production MCP deployments are missing all five.
The problem is architectural, not configurational

Traditional APIs operate within deterministic boundaries. You define endpoints, validate inputs against schemas, and enforce authorization with middleware. The control flow is predictable.
MCP servers sit between privileged infrastructure and a non-deterministic reasoning engine. The LLM decides which tools to call, what parameters to pass, and how to interpret the results. When it acts through an MCP server, it inherits delegated user permissions — often with far broader scope than any single API call would grant.
The architecture has three components:
- Host: The application where the AI runs (IDE, chat interface, automation platform)
- Client: Parses requests, manages context, communicates with the server
- Server: Provides tools, resources, and prompt templates the AI can invoke
Interactions flow continuously across these trust boundaries. A security model that only validates at the perimeter — or that assumes the LLM will "follow instructions" — fails by design.
How the ecosystem is actually failing
The credentials crisis

A 2025 analysis by Astrix Security evaluated over 5,200 open-source MCP server implementations. The numbers are bad:
- 88% required credentials to function
- 53%+ relied entirely on static, long-lived API keys or Personal Access Tokens stored as plaintext environment variables
- Only 8.5% used OAuth 2.0
This is early-cloud-era security practice applied to an attack surface that's orders of magnitude larger.
Real vulnerabilities, not hypotheticals
NeighborJack: Hundreds of MCP servers bound to 0.0.0.0 by default. Any device on the same local network could connect and execute tools with zero authentication.
mcp-server-git CVEs (January 2026): Three CVEs in Anthropic's own reference implementation — path traversal, arbitrary file deletion, and chained RCE when paired with a filesystem server. Because teams clone reference architectures as templates, these vulnerabilities propagated across the ecosystem immediately.
Asana cross-tenant contamination (May 2025): A tenant isolation flaw affected ~1,000 enterprise customers. Multi-tenant MCP deployments with shared servers introduced internet-facing attack surfaces that traditional isolation couldn't contain.
AI Engine WordPress plugin (June 2025): Privilege escalation via insecure MCP tool configurations let low-privileged users execute admin functions. Over 100,000 sites affected.
Supabase MCP + Cursor: Prompt injection delivered through malicious support ticket data exploited overprivileged tools in the IDE, exposing private database tables to external attackers.
Lock down identity and authorization
Prevent the confused deputy attack
The interaction between MCP clients, proxy servers, and third-party APIs creates a delegated trust chain that's vulnerable to the Confused Deputy problem. Here's the specific failure mode:
- An MCP proxy connects to a third-party API using a fixed OAuth 2.0 client ID
- MCP clients dynamically register and receive their own client IDs
- The third-party auth server uses a consent cookie after initial user authorization
- An attacker leverages dynamic client registration + existing consent cookies to silently obtain authorization codes without user consent
The proxy acts on behalf of the attacker using its elevated privileges.
Required mitigations:
- Implement a per-client consent flow before initiating the third-party OAuth flow
- Maintain a local registry of approved `client_id` values per user; verify all requests against it
- The consent UI must explicitly identify the requesting client by name, display requested scopes, and implement CSRF protection
- Prevent iframing with `frame-ancestors` CSP directives or `X-Frame-Options: DENY`
- Prefix consent cookies with `__Host-` and set `Secure`, `HttpOnly`, and `SameSite=Lax`
- Validate `redirect_uri` with strict string matching (no pattern or wildcard matching)
- Generate a cryptographically secure, single-use `state` parameter with a short TTL (≤10 minutes) for every OAuth request
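The last two requirements fit in a few lines. This is a minimal sketch, assuming an in-memory store (production would use Redis or similar); all names are illustrative:

```python
import secrets
import time

_pending_states = {}          # state -> expiry; hypothetical in-memory store
STATE_TTL_SECONDS = 600       # <= 10 minutes, per the guidance above

def issue_state() -> str:
    """Generate a cryptographically secure, single-use state parameter."""
    state = secrets.token_urlsafe(32)
    _pending_states[state] = time.monotonic() + STATE_TTL_SECONDS
    return state

def consume_state(state: str) -> bool:
    """Accept a state exactly once, and only while it is unexpired."""
    expires_at = _pending_states.pop(state, None)   # pop => single use
    return expires_at is not None and time.monotonic() < expires_at

def redirect_uri_allowed(candidate: str, registered: list) -> bool:
    """Strict string matching only; no prefixes, patterns, or wildcards."""
    return candidate in registered
```

The strict-equality check is deliberate: any pattern matching on `redirect_uri` reopens the authorization-code interception paths the OAuth security guidance warns about.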
Kill token passthrough
Token passthrough — where the MCP server accepts auth tokens from a client and uses them directly against downstream APIs — is explicitly forbidden in the MCP security spec, but remains common in practice.
Why it's dangerous:
- Bypasses application-level rate limiting and request validation
- Breaks monitoring that depends on accurate token audience claims
- Fractures the audit trail — you can't distinguish server-initiated actions from client-forged requests
The fix: MCP servers must authenticate with downstream services using their own cryptographically scoped credentials, independently of client tokens.
Prevent session hijacking
If you use persistent session IDs for client-server communication and an attacker intercepts one, they bypass initial authentication entirely.
Implementation requirements:
- Avoid stateful sessions for authentication when possible
- When session IDs are necessary, generate them with a CSPRNG (UUIDv4)
- Rotate frequently
- Format session keys as `<user_id>:<session_id>` so a hijacked session can't impersonate a different user
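A minimal sketch of the last two items, using the standard library (function names are illustrative):

```python
import hmac
import uuid

def new_session_id() -> str:
    # uuid4() draws from os.urandom, a CSPRNG
    return str(uuid.uuid4())

def session_key(user_id: str, session_id: str) -> str:
    # Binding the user ID into the stored key means a hijacked
    # session ID alone cannot be replayed as a different user.
    return f"{user_id}:{session_id}"

def session_matches(stored_key: str, user_id: str, session_id: str) -> bool:
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(stored_key, session_key(user_id, session_id))
```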
Replace static keys with cryptographic workload identity
Stop using raw API keys. Adopt cryptographic workload identity standards:
- SPIFFE/SPIRE: Short-lived, automatically rotated, cryptographically verifiable identities for microservices and MCP servers
- Token Exchange (RFC 8693): Swap user OAuth tokens for properly scoped delegation tokens instead of passing user tokens directly
- DPoP (RFC 9449): Bind access tokens to a specific client's public key, preventing replay after interception
- Rich Authorization Requests (RFC 9396): Define granular access rules with centralized policy engines (Open Policy Agent, Amazon Cedar, OpenFGA) decoupled from server code
Defend against indirect prompt injection
Understand the attack model

Direct prompt injection ("ignore previous instructions") targets the model via the chat interface. Indirect Prompt Injection (XPIA) is more dangerous: malicious instructions embedded in untrusted data that the agent retrieves through legitimate tool calls.
The core problem: LLMs process tool results as a flat text stream. They cannot architecturally distinguish between data context and embedded commands. To the model, a hidden instruction in a fetched email body is indistinguishable from a system message.
Tool Poisoning is the variant where attackers embed malicious instructions in MCP tool metadata — tool names and descriptions — manipulating which tools the model selects and how it formats parameters. For a deeper dive into how LLMs process adversarial inputs, see our cognitive foundations of LLM security analysis.
Real XPIA vectors across MCP integrations
| MCP Integration | Injection mechanism | Consequence |
|---|---|---|
| Gmail | Hidden HTML (<div style="display:none">) in email body | Silent exfiltration of 30 days of inbox |
| Salesforce | Manipulated text in public CRM fields | Agent creates zero-value deals, bypasses billing |
| GitHub | Invisible HTML comments in PR descriptions | Unreviewed code merged to main, CI skipped |
| Slack | Messages from compromised accounts in shared channels | Exfiltration of #finance and #deals history |
| Zendesk | Payloads in inbound support tickets | Agent leaks list of active user accounts |
| Google Drive | 1pt white text in shared documents | Disclosure of proprietary pricing from other contracts |
| Web search | Instructions in HTML comments on competitor pages | Agent generates competitive analysis with fabricated legal claims |
| Gong | Instructions spoken aloud during recorded calls | Agent forwards internal account history to external prospect |
| Jira | Instructions in "steps to reproduce" field | Agent posts production credentials to public tracker |
| Notion | Hidden instructions in collapsed toggle blocks | Agent commits company to unreleased product features |
Build deterministic XPIA defenses
System prompts telling the LLM to "ignore external instructions" don't work. The model processes the protective prompt and the malicious payload through the same neural pathways. There's no privilege boundary. Attackers use semantic tricks, obfuscation, and multilingual commands to bypass prompt-based guardrails trivially.
Defense must happen at the data layer, before untrusted content reaches the context window. This is a core principle of defense in depth.
Tier 1 — Synchronous regex-based sanitization (<1ms):
- Apply Unicode normalization to prevent homoglyph attacks (e.g., visually similar Cyrillic characters to ASCII)
- Strip injected role markers (
SYSTEM,ASSISTANT,<|im_start|>) - Neutralize payloads hidden in Base64 or URL-encoded formats
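A Tier 1 pass can be sketched as below. One caveat worth noting: NFKC normalization folds compatibility characters (fullwidth letters, ligatures), but mapping true homoglyphs such as Cyrillic lookalikes to ASCII additionally requires a confusables table. The patterns and thresholds here are illustrative starting points, not a complete filter:

```python
import base64
import re
import unicodedata

ROLE_MARKERS = re.compile(r"(?i)(<\|im_start\|>|\bSYSTEM\s*:|\bASSISTANT\s*:)")

def tier1_sanitize(text: str) -> str:
    # NFKC folds compatibility characters (fullwidth letters, ligatures);
    # true homoglyph defense needs a confusables map on top of this.
    text = unicodedata.normalize("NFKC", text)
    # Strip injected role markers so retrieved data can't fake a system turn.
    return ROLE_MARKERS.sub("[stripped-role-marker]", text)

def looks_like_base64_payload(token: str, min_len: int = 24) -> bool:
    # Cheap heuristic for spotting Base64-hidden payloads; tune min_len.
    if len(token) < min_len or not re.fullmatch(r"[A-Za-z0-9+/=]+", token):
        return False
    try:
        base64.b64decode(token, validate=True)
        return True
    except Exception:
        return False
```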
Tier 2 — ML-based sentence-level classification (~10ms):
Regex can't catch novel, creatively rephrased attacks. A lightweight MLP classifier handles this tier. The StackOne Defender framework, for example, uses a 22 MB ONNX model based on MiniLM-L6-v2 that runs on standard CPUs.
The critical design choice is sentence-level classification. A poisoned response often buries a single malicious sentence inside hundreds of words of legitimate content. Classifying the aggregate dilutes the signal. Split the text into sentences, score each from 0.0 (safe) to 1.0 (injection), and quarantine the entire response if any sentence exceeds the threshold. This approach achieves a 90.8% F1 score, outperforming larger models like DistilBERT and Meta Prompt Guard v1.
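The quarantine logic itself is simple once a classifier exists. The sketch below assumes a `score_sentence` callable (a stand-in for whatever model you deploy) and a naive regex sentence splitter; a production pipeline would use a proper sentencizer:

```python
import re

def quarantine_if_injected(text, score_sentence, threshold=0.8):
    """Score each sentence individually; quarantine the whole response if
    any sentence crosses the threshold. score_sentence is whatever
    classifier you deploy, returning 0.0 (safe) to 1.0 (injection)."""
    # Naive splitter on sentence-ending punctuation; a stand-in only.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scored = [(s, score_sentence(s)) for s in sentences]
    flagged = [(s, sc) for s, sc in scored if sc >= threshold]
    if flagged:
        return None, flagged   # quarantined: never reaches the context window
    return text, []            # clean: safe to forward
```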
Tool-aware risk scoring: Automatically assign risk profiles based on data source trust. Tools interacting with unauthenticated external data (`gmail_*`, `email_*`) get "Very High Risk" profiles with lower classification thresholds. Internal database queries get lower risk profiles.
Boundary annotations: Wrap sanitized tool results in cryptographic boundary tags (e.g., `<tool_output>...</tool_output>`) and instruct the system prompt to treat content within those boundaries as inert data only.
Harden the network layer
Enforce mutual TLS
Standard TLS only authenticates the server. For MCP, that's insufficient — you need the server to verify the client's identity before processing any JSON-RPC payload. mTLS is mandatory for all remote deployments.
Implementation checklist:
- Protect certificate files and private keys with `chmod 600`, readable only by the server process
- Never commit certificates to version control
- Use TLS 1.3 with forward-secret cipher suites
- Enforce hostname validation with proper Subject Alternative Names (SANs)
- Automate certificate rotation with short validity periods (days/weeks) using cert-manager, ACME, or internal PKI
- Enable continuous revocation checks via CRL or OCSP
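In Python's standard `ssl` module, the mTLS-relevant settings look roughly like this. Certificate paths are deployment-specific and the function name is illustrative:

```python
import ssl

def build_mtls_server_context(certfile=None, keyfile=None, client_ca=None):
    """Sketch of an mTLS-enforcing server context."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # TLS 1.3: every suite is forward-secret
    ctx.verify_mode = ssl.CERT_REQUIRED           # the "mutual" part: clients must present a cert
    if certfile and keyfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)  # keep keyfile at mode 600
    if client_ca:
        ctx.load_verify_locations(cafile=client_ca)  # CA that signs approved client certs
    return ctx
```

`CERT_REQUIRED` is the line that turns one-way TLS into mutual TLS: without it, any client that trusts your server certificate can speak JSON-RPC to the server.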
Isolate the network

Binding: Never bind to 0.0.0.0. For local-only servers, use stdio transport (which restricts access to the parent MCP client process), bind to 127.0.0.1, or use Unix domain sockets. For remote deployments, use Streamable HTTP behind strict network controls.

| Configuration | Requirement | Rationale |
|---|---|---|
| Default policy | Set INPUT, FORWARD, OUTPUT chains to DROP | Default-deny; only explicitly allowed traffic passes |
| Stateful tracking | Allow conntrack --ctstate RELATED,ESTABLISHED | Don't block return traffic for legitimate outbound requests |
| Traffic redirection | NAT redirect HTTP/HTTPS (80/443) to API gateway port (e.g., 8080) | Force all traffic through gateway for rate limiting and payload inspection |
| Egress / SSRF prevention | Block outbound to 169.254.169.254; allowlist only required downstream domains | Prevent SSRF attacks extracting IAM credentials or traversing internal networks |
| Segmentation | Place MCP servers in private subnets unreachable from public internet | External clients route through authenticated ingress controllers |
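The egress rules above can be backstopped at the application layer as well (as a complement to the firewall, not a replacement). A sketch, with a hypothetical allowlist:

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical allowlist; populate with your actual downstream domains.
ALLOWED_EGRESS_HOSTS = {"api.example-downstream.com"}

def egress_allowed(url: str, resolved_ip: str) -> bool:
    """Checking the resolved IP (not just the hostname) also catches
    DNS rebinding toward internal or link-local ranges."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_EGRESS_HOSTS:
        return False
    ip = ipaddress.ip_address(resolved_ip)
    # Reject the cloud metadata endpoint and anything private/loopback/link-local.
    return not (ip.is_link_local or ip.is_private or ip.is_loopback)
```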
Disable all debug, admin, and status endpoints in production. They leak diagnostic information.
Constrain the runtime environment
Containerize properly
Running an MCP server as a raw script on a host machine is a critical security failure. But a basic Docker container with a full Ubuntu image is false security.
Use hardened minimal images: Alpine Linux or Google's distroless images. Distroless images lack package managers, debugging tools, and shells (`bash`, `sh`). Even if an attacker triggers RCE in the MCP code, they can't establish persistence, download payloads, or escalate privileges without these utilities.
Additional container requirements:
- Run exclusively as non-root
- Enforce global read-only filesystems; use ephemeral memory-backed volumes only for temporary processing
- Set CPU and memory quotas at the orchestration layer to prevent infinite loops, excessive API calls, or DoS from a malfunctioning agent
Manage the tool budget
Exposing hundreds of tools to an agent degrades reasoning performance, bloats the context window, and inflates the attack surface.
Don't map every downstream API endpoint to a separate MCP tool. That 1:1 mapping is an anti-pattern. Design tools around consolidated use cases and use MCP "prompts" as macros to guide LLM behavior. Expose only what the agent's defined role strictly requires.
Add kernel-level sandboxing
Augment containerization with:
- AppArmor / Seccomp / SELinux: Custom restrictive profiles denying unexpected syscalls (kernel parameter modification, raw network sockets)
- Kata Containers / gVisor: For high-security environments (financial transactions, PII processing, LLM-generated code execution) — lightweight VMs that intercept syscalls in userspace, preventing kernel exploits from reaching the host
- Trusted Execution Environments with remote attestation: For mathematically guaranteeing code and memory integrity before data processing
Lifecycle enforcement: Forcefully terminate all MCP background processes and temporary execution environments when the client session closes. No persistent footholds.
Build observability from day one
Structure your audit logs
Standard web logs are inadequate for MCP. When something goes wrong, you need to reconstruct the LLM's entire decision chain: what it reasoned, which tools it called, what parameters it passed, what data it retrieved, and what it did with the results.
Every interaction must include:
- ISO 8601 timestamps
- Correlation/Trace ID (UUIDv4) persisting across client request, gateway routing, LLM reasoning, tool invocation, and response delivery
- Server ID, User ID, Team ID for attribution
- Method called, execution duration, response size, outcome/error state
- Security events: quarantined prompt counts, policy violations, unauthorized access attempts, context payload anomalies
Use OpenTelemetry for distributed tracing across multi-tool, multi-server chains. If you're building an incident response plan around MCP, structured logs are not optional — they're your primary forensic data source.
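A minimal shape for such a record might look like the following. Field names are illustrative; align them with whatever schema your SIEM expects:

```python
import json
import sys
import uuid
from datetime import datetime, timezone

def log_tool_invocation(user_id, server_id, method, duration_ms, outcome,
                        correlation_id=None, stream=sys.stdout):
    """Emit one structured JSON record per tool call."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),           # ISO 8601, UTC
        "correlation_id": correlation_id or str(uuid.uuid4()),  # UUIDv4 trace ID
        "user_id": user_id,
        "server_id": server_id,
        "method": method,
        "duration_ms": duration_ms,
        "outcome": outcome,
    }
    stream.write(json.dumps(record) + "\n")
    return record["correlation_id"]  # reuse downstream to correlate the chain
```

Returning the correlation ID lets each subsequent hop (gateway routing, LLM reasoning, downstream API call) log under the same trace, which is what makes post-incident reconstruction possible.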
Sanitize log output
Logs must never contain credentials, API keys, environment variables, or PII processed by the LLM. Implement masking logic at the application layer to redact authorization headers, token payloads, and sensitive prompt data before writing to disk or transmitting to SIEM.
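A masking pass can be as simple as a list of compiled patterns applied before every write. The patterns below are illustrative, not exhaustive; extend them with your own credential formats:

```python
import re

# Illustrative redaction patterns; add your own token and key formats.
REDACTIONS = [
    (re.compile(r"(?i)(authorization:\s*bearer\s+)\S+"), r"\1[REDACTED]"),
    (re.compile(r"(?i)(api[_-]?key[\"']?\s*[:=]\s*)[\"']?[\w-]+"), r"\1[REDACTED]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[REDACTED-EMAIL]"),
]

def redact(line: str) -> str:
    """Apply masking before the line hits disk or the SIEM."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line
```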
Use async logging (e.g., Pino), batching, and buffering so the observability stack doesn't become a latency bottleneck or DoS vector.
Centralize through an MCP gateway
A dedicated gateway provides a unified control plane for monitoring LLM token generation, agent routing, and tool usage patterns across the organization. It maintains server inventories tracking lifecycle data, approval status, health metrics, and quarantined prompt counts per server.
Secure the supply chain
Verify everything cryptographically
All deployed MCP servers require mandatory code signing and integrity verification before execution. Use Sigstore to sign container images and build provenances automatically.
Maintain an AI Bill of Materials (AIBOM) tracking the lineage of all models, datasets, and MCP dependencies. Run private, vetted internal registries of approved servers — don't let developers install directly from public repositories. For more on how supply chain attacks propagate in practice, see our glossary.
Run protocol-specific vulnerability scanners
Standard SAST tools miss the semantic risks in LLM tool definitions and prompt templates. Integrate specialized scanners into your CI/CD pipeline:
- MCPSafetyScanner (AI Assurance Lab): Multi-agent tool that simulates prompt injection attacks against your server's tool registry, cross-references findings against security knowledge bases, and generates remediation reports
- Nova Proximity: Deep parameter analysis detecting injection vectors, jailbreak patterns, and suspicious code. Supports latest MCP specs (2025-11-25), analyzes Streamable HTTP transport and session management
- MCP-Scan (Invariant): Detects tool poisoning and "MCP Rug Pulls" — where an initially benign tool definition is maliciously altered post-approval
- Enkrypt MCPScan: Holistic analysis covering IDOR, DoS, least-privilege audits, timeout settings, and MCP-specific network security
- Dockyard / MCP Sentinel: GitHub Actions that build, scan, sign (Sigstore), and publish container images for MCP servers
Run these after every significant code change and continuously against production configurations. If you want a professional assessment of your MCP deployment, Zealynx offers a dedicated MCP security audit service.
Watch for domain-specific amplification
The blast radius of an MCP vulnerability depends heavily on what it's connected to.
CRM / Marketing automation: A compromised agent operating nurture campaigns through Salesforce, HubSpot, or Klaviyo can read customer communications, extract competitive intelligence, and send unauthorized external emails. Use tightly scoped service principals (e.g., `DATABRICKS_SERVICE_PRINCIPAL_ID`, not a global token). Enforce human-in-the-loop authorization for all state-changing actions: the AI drafts, a human approves.
Development environments / UI generation: MCP servers that manipulate codebases (e.g., Unity Editor integration) or generate interactive UI components (e.g., Shopify MCP UI) grant access to proprietary source code and create XSS/phishing surfaces through AI-generated interface elements.
Cryptocurrency / DeFi: The highest-stakes context. Private key or mnemonic exposure through an MCP-connected wallet tool is unrecoverable. Watch for "Multi-MCP Function Priority Hijacking" (malicious plugin hijacks function execution priority) and "Cross-MCP Triggering" (malicious server returns prompts designed to trigger operations in other enabled plugins). Use Scrypt for key protection at rest. Enforce local, offline LLM execution — third-party model providers must never access wallet data or transaction signatures. For DeFi teams, our AI red teaming guide covers how to stress-test these scenarios systematically.
What to do next

Pick the layer where your deployment is weakest and start there:
- Audit your credentials. If you're running static API keys in environment variables, migrate to SPIFFE/SPIRE or at minimum implement OAuth 2.0 with short-lived tokens and token exchange (RFC 8693).
- Scan your servers. Run MCP-Scan or MCPSafetyScanner against your current tool definitions. You'll likely find tool poisoning vectors you didn't know existed.
- Containerize properly. Switch to distroless base images, enforce non-root execution, and add Seccomp/AppArmor profiles.
- Implement XPIA sanitization. Deploy a two-tier defense pipeline (regex + ML classifier) on all tool outputs before they enter the LLM context window.
- Instrument everything. Add correlation IDs and structured JSON logging to every tool invocation. If you can't reconstruct what happened after an incident, you can't fix it.
None of these are optional. MCP collapses the distance between "the AI can read my email" and "the AI just forwarded my entire inbox to an external address" to a single indirect prompt injection in an inbound message. The protocol is powerful precisely because it's universal — and it's dangerous for exactly the same reason.
If you need help threat modeling your MCP deployment or want a professional AI security audit, we're here to help.
Get in touch
At Zealynx, we specialize in AI security audits and MCP security assessments. Whether you're deploying your first MCP server or hardening an existing fleet, our team can identify the gaps that automated scanners miss — from confused deputy vulnerabilities to indirect prompt injection in your tool chain.
FAQ: MCP server hardening
1. What is the Model Context Protocol (MCP) and why does it matter for security?
The Model Context Protocol is an open standard that defines how AI agents (LLMs) communicate with external tools and services — databases, APIs, file systems, and code execution environments. It matters for security because it creates a single, standardized interface between a non-deterministic reasoning engine and your most privileged infrastructure. Unlike traditional APIs where control flow is predictable, MCP lets the AI decide which tools to call, what parameters to pass, and how to interpret results — all with delegated user permissions. A vulnerability in one MCP server can cascade across every connected system. For a foundational overview, see our Model Context Protocol glossary entry.
2. What is indirect prompt injection and how does it differ from direct prompt injection?
Direct prompt injection targets the AI model through its chat interface — a user types "ignore previous instructions" directly. Indirect prompt injection (XPIA) is far more dangerous: an attacker embeds malicious instructions inside data that the AI retrieves through legitimate tool calls. For example, hidden text in a Google Doc, an invisible HTML comment in a GitHub PR, or a crafted support ticket in Zendesk. The AI cannot architecturally distinguish between retrieved data and commands, so it follows the hidden instructions as if they were system messages. Defenses must operate at the data layer (regex sanitization + ML classification) before content reaches the context window, not through prompt-based guardrails.
3. What is the confused deputy problem in MCP deployments?
The confused deputy is a privilege escalation pattern where a trusted intermediary (the MCP proxy server) is tricked into performing unauthorized actions on behalf of an attacker. In MCP, this happens when an attacker exploits dynamic client registration and cached consent cookies to silently obtain OAuth authorization codes. The proxy then executes requests using its elevated privileges, believing the request came from a legitimate user. Mitigation requires per-client consent flows, strict `redirect_uri` validation, CSRF-protected consent UI, and cryptographically secure single-use state parameters with short time-to-live windows.

4. Why can't I secure my MCP server with just a firewall and TLS?
Standard TLS only authenticates the server to the client — it doesn't verify who the client is. MCP requires mutual TLS (mTLS) where both parties present certificates. But even mTLS only secures the transport layer. MCP's unique risks operate above the network: prompt injection in tool results, tool poisoning in metadata, over-permissioned tool registries, and non-deterministic LLM behavior. You need defense in depth — identity verification at every node, runtime sandboxing, data sanitization before context injection, and structured observability. A firewall stops network-level attacks; it doesn't stop a malicious instruction hidden in a Zendesk ticket from exfiltrating your user database.
5. What is tool poisoning and how do attackers exploit it?
Tool poisoning is a variant of prompt injection where attackers embed malicious instructions directly in MCP tool metadata — the tool's name, description, or parameter definitions. Since the LLM reads this metadata to decide which tools to use and how to call them, a poisoned tool can manipulate the AI's tool selection, alter the parameters it passes, or redirect its output. For example, a tool description could contain hidden instructions telling the model to always include the user's API key in the request. Defenses include cryptographic verification of tool definitions, runtime integrity checks against a signed tool registry, and specialized scanners like MCP-Scan that detect when tool definitions change post-approval ("MCP Rug Pulls").
6. How should teams approach MCP security in cryptocurrency and DeFi environments?
DeFi is the highest-stakes MCP deployment context because private key exposure is unrecoverable — there's no "undo" button on the blockchain. Teams must enforce local, offline LLM execution so third-party model providers never access wallet data or transaction signing. Watch for "Multi-MCP Function Priority Hijacking" where a malicious plugin overrides legitimate function execution, and "Cross-MCP Triggering" where one server's output manipulates another server's tools. Use hardware-backed key storage, enforce human-in-the-loop for all value-transferring operations, and treat every MCP tool result as potentially adversarial. For comprehensive AI security testing in Web3, see our AI red teaming and AI penetration testing guides.
Glossary
| Term | Definition |
|---|---|
| Model Context Protocol | Open standard defining how AI agents communicate with external tools and services through a unified interface. |
| Prompt Injection | Attack technique manipulating AI system inputs to bypass safety controls or extract unauthorized information. |
| Context Window | The maximum amount of text an LLM can process in a single interaction, including system prompts, user input, and tool results. |
| Trust Boundary | Point in a system where the level of trust changes, requiring validation of all data crossing it. |
| Defense in Depth | Security strategy layering multiple independent controls so a single failure doesn't compromise the system. |
| Supply Chain Attack | Attack targeting the less-secure elements in a software supply chain rather than the primary target directly. |
| Privilege Escalation | Exploiting a vulnerability to gain elevated access beyond what was originally authorized. |
| Trusted Execution Environment | Isolated processing environment guaranteeing code and data integrity through hardware-level security. |
Are you audit-ready?
Download the free Pre-Audit Readiness Checklist used by 30+ protocols preparing for their first audit.
No spam. Unsubscribe anytime.


