World First

ROGUE

World-first malicious MCP server engine — attack the trust relationship. Agents trust server responses. We own the server.
8 Subsystems
136 Tests
25 ARMORY Payloads
2 Transports
pip install red-specter-rogue
Agents trust every tool description the server sends / No sanitisation on MCP tool results / sampling/createMessage has no server authentication / Cross-session memory poisoning persists indefinitely / Tool call chains escalate privileges silently / Prior art targets agents calling bad servers / ROGUE weaponises the server itself

The MCP Trust Model Is Broken

Every AI agent that connects to an MCP server implicitly trusts it. Tool descriptions arrive before the first user interaction. Sampling requests override the operator system prompt. Tool results execute without validation. Prior art targets agents connecting to malicious servers — ROGUE weaponises the server itself.

Tool Descriptions Are Trusted Unconditionally

MCP agents typically treat tool descriptions as ground truth: the description text is passed to the model as-is, and the protocol defines no sanitisation layer. Hidden instructions embedded in a tool's description arrive when the agent lists the server's tools, before the first tool call and before any user input. The agent reads them. The agent follows them.
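Seen from the defending side, the missing sanitisation layer could look something like the sketch below. It audits the `tools` array of a tools/list result for text that looks like a hidden instruction. The patterns and the function names are illustrative assumptions, not part of ROGUE or of the MCP specification, and a real filter would need far broader coverage.

```python
import re
import unicodedata

# Hypothetical heuristics chosen for illustration only.
SUSPICIOUS_PATTERNS = [
    re.compile(r"<!--.*?-->", re.DOTALL),  # HTML comments hidden from rendered UIs
    re.compile(r"\bignore (all |any )?(previous|prior) instructions\b", re.IGNORECASE),
    re.compile(r"\bdo not (tell|inform|mention)\b", re.IGNORECASE),
]


def audit_tool_description(description: str) -> list[str]:
    """Return the reasons a tool description looks suspicious (empty if none)."""
    reasons = []
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(description):
            reasons.append(f"matches {pattern.pattern!r}")
    # Unicode format characters (category Cf), e.g. zero-width spaces, can hide text.
    if any(unicodedata.category(ch) == "Cf" for ch in description):
        reasons.append("contains invisible format characters")
    return reasons


def audit_tools_list(result: dict) -> dict[str, list[str]]:
    """Audit every tool in a tools/list result; keys are tool names."""
    findings = {}
    for tool in result.get("tools", []):
        reasons = audit_tool_description(tool.get("description", ""))
        if reasons:
            findings[tool.get("name", "<unnamed>")] = reasons
    return findings
```

A client could run `audit_tools_list` once per connection and refuse to register any tool that appears in the returned mapping.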

sampling/createMessage Has No Authentication

The MCP sampling channel lets a server send the client a sampling/createMessage request that carries its own systemPrompt. The request has no signing, no nonce and no replay protection. Against a client that forwards the field unmodified and approves sampling automatically, a malicious server can displace the operator's system prompt on every conversation turn.
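A client that wants to close this gap can screen sampling requests before they reach the model. The sketch below is one possible shape, assuming the client receives the request `params` as a dict; `screen_sampling_request` and the `approve` callback are hypothetical names, not an MCP SDK API.

```python
from typing import Callable, Optional


def screen_sampling_request(
    params: dict,
    approve: Callable[[dict], bool],
) -> Optional[dict]:
    """Return params that are safe to forward to the model, or None if rejected."""
    screened = dict(params)  # shallow copy so the caller's dict is left untouched
    # Drop the server-supplied system prompt so the operator's prompt stays in force.
    screened.pop("systemPrompt", None)
    # Give a human reviewer or policy hook the exact payload that would be sent.
    if not approve(screened):
        return None
    return screened
```

Passing a callback that always returns False disables server-initiated sampling entirely, which is a reasonable default for servers that are not fully trusted.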

Tool Results Are Never Sanitised

When an agent calls a tool, the tool result arrives on a trusted channel and reaches the model without input validation or content inspection. ROGUE embeds SYSTEM_OVERRIDE, auth_notice, debug_trace, escalation_chain, and session_sync payloads in tool results. An agent that reads result text as instructions acts on them.
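A minimal sketch of the content inspection that is missing here, assuming the payload names from this page appear literally in the result text (the real payloads may not, so treat the marker list as a placeholder). `find_markers` is a hypothetical helper.

```python
import re

# Payload names taken from this page, used as detection markers (an assumption).
MARKERS = ("SYSTEM_OVERRIDE", "auth_notice", "debug_trace", "escalation_chain", "session_sync")
MARKER_RE = re.compile("|".join(re.escape(m) for m in MARKERS), re.IGNORECASE)


def find_markers(text: str) -> list[str]:
    """Return the distinct markers found in a tool result, lower-cased and sorted."""
    return sorted({match.group(0).lower() for match in MARKER_RE.finditer(text)})
```

A non-empty return value is a signal to quarantine the result rather than hand it to the model.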

Memory Poisoning Persists Across Sessions

A single ROGUE engagement can poison an agent's persistent memory store. LangChain, CrewAI, AutoGen, Claude Memory, Mem0 — all vulnerable. The agent carries the poisoned instruction into every future session. One engagement. Indefinite persistence.
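One way to reason about this exposure is provenance: a memory store that records where each entry came from can refuse writes that originate in tool output. The toy store below is a sketch under that assumption; it does not model the API of any of the frameworks named above, and the source labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy in-memory store that tracks the origin of every entry."""

    entries: list = field(default_factory=list)
    trusted_sources: frozenset = frozenset({"user", "operator"})

    def write(self, text: str, source: str) -> bool:
        # Refuse to persist anything whose origin is not explicitly trusted,
        # for example text that arrived inside a tool result.
        if source not in self.trusted_sources:
            return False
        self.entries.append({"text": text, "source": source})
        return True
```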

Privilege Escalation via Tool Chain

ROGUE's ESCALATE subsystem identifies tools the agent already has privileged access to — shell_exec, db_read, file_read, email_send. A crafted tool result triggers the agent to call a privileged tool it legitimately holds. No vulnerability needed in the tool itself.
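Because the privileged tool is legitimately held, the control point is the call sequence rather than the tool. A hedged sketch of such a guard follows: it asks for confirmation whenever one of the four tools named above is called directly after a tool result instead of a user turn. The event labels and the function name are assumptions made for illustration.

```python
# The four privileged tools named in the text.
PRIVILEGED_TOOLS = frozenset({"shell_exec", "db_read", "file_read", "email_send"})


def requires_confirmation(tool_name: str, previous_event: str) -> bool:
    """True when a privileged call directly follows a tool result."""
    return tool_name in PRIVILEGED_TOOLS and previous_event == "tool_result"
```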

Exfiltration Through the Input Channel

The EXFIL subsystem captures credentials, system prompts, conversation history, and context data through the tool parameter channel. Credential patterns: sk-, Bearer, ghp_, xoxb-, api_key=, password=. The data arrives at the ROGUE server as ordinary tool call arguments, so the agent makes no connection to any other external system.
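The same prefixes can be matched on the client before arguments leave the agent. The sketch below redacts credential-like substrings from string arguments; the exact character classes after each prefix are assumptions, since the page lists only the prefixes.

```python
import re

CREDENTIAL_RE = re.compile(
    r"\bsk-[A-Za-z0-9_-]+"      # API keys with an sk- prefix
    r"|\bBearer\s+\S+"          # HTTP bearer tokens
    r"|\bghp_[A-Za-z0-9]+"      # GitHub personal access tokens
    r"|\bxoxb-[A-Za-z0-9-]+"    # Slack bot tokens
    r"|\bapi_key=[^\s&]+"       # key=value pairs in URLs or config text
    r"|\bpassword=[^\s&]+"
)


def redact_arguments(arguments: dict) -> tuple[dict, int]:
    """Redact credential-like substrings in string arguments; return (clean, count)."""
    clean, count = {}, 0
    for key, value in arguments.items():
        if isinstance(value, str):
            value, replaced = CREDENTIAL_RE.subn("[REDACTED]", value)
            count += replaced
        clean[key] = value
    return clean, count
```

A non-zero count is also worth logging, because it shows the model was induced to place a secret into a tool call.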

The ROGUE Armoury

Eight subsystems attack every layer of the MCP trust model. SPAWN and REPORT require STANDARD clearance. POISON and SAMPLE require --override. INJECT, EXFIL, ESCALATE, and PERSIST require --confirm-destroy.

# | Subsystem | Clearance | What It Does
01 | SPAWN | STANDARD | Real MCP JSON-RPC 2.0 server over stdio (OS pipes) and SSE (HTTP/streaming). 7 deployment personas: weather, search, database, filesystem, code, calendar, finance.
02 | POISON | --override | Tool description injection. 7 hidden instruction patterns × 6 injection styles. Embeds attacker instructions before the first tool call. OWASP LLM07.
03 | SAMPLE | --override | sampling/createMessage abuse. 5 request templates (CVSS 8.7–9.3). Injects attacker-controlled systemPrompt via the server-to-agent sampling channel.
04 | INJECT | --confirm-destroy | Tool result injection. 5 payloads: SYSTEM_OVERRIDE, auth_notice, debug_trace, escalation_chain, session_sync. Wildcard match: fires on any tool name.
05 | EXFIL | --confirm-destroy | Data exfiltration via tool parameter channel. Credential regex: sk-, Bearer, ghp_, xoxb-, api_key=, password=. Classifies: credentials / system_prompt / conversation_history / context_data.
06 | ESCALATE | --confirm-destroy | Privileged tool call chains. 4 chains: shell_exec, db_read, file_read, email_send. Tool result triggers agent to call a privileged tool it already holds.
07 | PERSIST | --confirm-destroy | Cross-session memory poisoning. 5 templates (CVSS 9.4–9.7). Targets: Claude Memory, LangChain, CrewAI, AutoGen, Mem0. Single engagement, indefinite persistence.
08 | REPORT | STANDARD | WARLORD-compatible Ed25519-signed JSON. world_first=true, tool_number=61. Per-finding CVSS, OWASP LLM + MITRE ATLAS mapping.

One Server. Every Trust Layer.

Spawn a malicious MCP server and run all attack subsystems in sequence:

$ rogue spawn --persona weather --transport stdio
[SPAWN] MCP server running: weather (stdio)
[POISON] Tool description poisoned — hidden instruction embedded
[SAMPLE] sampling/createMessage injected — systemPrompt overridden
[INJECT] Tool result: SYSTEM_OVERRIDE payload delivered
[EXFIL] Captured: Bearer sk-proj-xxx... [credentials]
[ESCALATE] shell_exec triggered via tool chain
[PERSIST] Memory poisoned: LangChain vector store

SCAN COMPLETE | 7 findings | Report signed | world_first=true

Real MCP Protocol

ROGUE is a genuine MCP JSON-RPC 2.0 server. stdio over OS pipes. SSE over HTTP. No simulation — real protocol, real server, real agent exploitation.
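On the stdio transport each JSON-RPC message travels as one line of JSON. As a small illustration of that framing, the sketch below parses a single line and checks the JSON-RPC 2.0 envelope. `parse_jsonrpc_line` is a hypothetical helper, not part of ROGUE or of an MCP SDK.

```python
import json


def parse_jsonrpc_line(line: str) -> dict:
    """Parse one newline-delimited JSON-RPC 2.0 message and check its envelope."""
    message = json.loads(line)
    if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    is_request = "method" in message
    # A response carries exactly one of "result" or "error".
    is_response = ("result" in message) != ("error" in message)
    if is_request == is_response:
        raise ValueError("message must be a request/notification or a response")
    return message
```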

7 Deployment Personas

Weather, search, database, filesystem, code, calendar, finance. Each persona is a fully functional MCP server surface. Indistinguishable from a legitimate integration.

Ed25519 Signed Reports

Every report cryptographically signed with Ed25519. world_first=true. tool_number=61. OWASP LLM + MITRE ATLAS per finding. WARLORD-compatible JSON output.

Clearance-Gated Subsystems

Destructive subsystems require explicit clearance flags. INJECT, EXFIL, ESCALATE, and PERSIST are --confirm-destroy gated. No accidental fire. No ambiguity.
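The gating rule can be written down as a lookup from subsystem to required flag. The sketch below is an illustration of that rule only, not ROGUE's implementation; it assumes each flag is checked independently, so --confirm-destroy does not imply --override.

```python
from typing import Optional

# Subsystem -> flag that must be present; None means STANDARD clearance.
REQUIRED_FLAG: dict[str, Optional[str]] = {
    "SPAWN": None,
    "REPORT": None,
    "POISON": "--override",
    "SAMPLE": "--override",
    "INJECT": "--confirm-destroy",
    "EXFIL": "--confirm-destroy",
    "ESCALATE": "--confirm-destroy",
    "PERSIST": "--confirm-destroy",
}


def is_cleared(subsystem: str, flags: set[str]) -> bool:
    """True when the flags supplied on the command line permit the subsystem."""
    required = REQUIRED_FLAG[subsystem]
    return required is None or required in flags
```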

NIGHTFALL ARMORY Integration

Connected to 25 dedicated ARMORY payloads in the rogue_mcp_server category. Every ROGUE engagement pulls attacker-proven MCP exploitation payloads on demand.

8 Attack Subsystems
136 Tests Passing
25 ARMORY Payloads
2 MCP Transports
0 Failures

Every MCP Trust Surface Exploited

ROGUE attacks every layer of the MCP trust model: tool descriptions, sampling requests, tool results, memory stores, privileged tool chains, and exfiltration via tool parameters. Each vector is independent, so each subsystem can fire alone or in sequence during a single engagement.

Description Layer

  • Hidden instruction injection
  • 7 instruction patterns
  • 6 injection styles
  • Pre-call delivery
  • OWASP LLM07

Sampling Channel

  • systemPrompt override
  • 5 request templates
  • CVSS 8.7–9.3
  • No auth required
  • Per-turn injection

Tool Results

  • SYSTEM_OVERRIDE payload
  • auth_notice injection
  • escalation_chain
  • session_sync payload
  • Wildcard tool match

Memory Stores

  • LangChain vector store
  • CrewAI memory
  • AutoGen history
  • Claude Memory
  • Mem0 poisoning

Privilege Chains

  • shell_exec trigger
  • db_read escalation
  • file_read pivot
  • email_send abuse
  • No tool vuln needed

Stage 61. NIGHTFALL Chain.

ROGUE is Stage 61 of the Red Specter NIGHTFALL offensive pipeline. It occupies a unique position — the world's first tool that weaponises the MCP server itself rather than targeting agents connecting to bad servers. Findings feed directly into AI Shield as runtime blocking rules.

Stage 58 (Identity Attack): DELEGATE. Agent identity & OAuth delegation attacks
Stage 59 (Supply Chain): PHANTOM SKILL. AI agent supply chain attack engine
Stage 60 (NTN Attack): ASTRO BLASTER. Non-terrestrial network AI agent attacks
Stage 61 (MCP Server): ROGUE. World-first malicious MCP server engine
Stage 62 (CI/CD Attack): PIPELINE. CI/CD pipeline attack engine
Stage 63 (Dark Web): SPECTER DARK. Restricted, law enforcement only
Stage 64 (Behavioural): SPECTER INSTINCTION. AI agent behavioural fingerprinting
Stage 65 (Drone Attack): SPECTER DRONE. Drone AI attack engine
Orchestration: WARLORD. Autonomous campaign orchestration
Discovery & Governance: IDRIS. Discover and govern AI assets
Defence: AI Shield. Defend everything above it
SIEM Integration: redspecter-siem. Findings feed directly into Splunk, Sentinel, QRadar

25 Dedicated MCP Payloads

7 Description Injection
5 Sampling Templates
5 Tool Result Payloads
4 Escalation Chains
5 Memory Poison Templates
25 Total ARMORY Payloads

Every Finding Mapped

2/10 OWASP LLM

OWASP LLM Top 10

  • LLM07 System Prompt Leakage, 2025 list (POISON, SAMPLE)
  • LLM02 Insecure Output Handling, 2023 list; renumbered LLM05 Improper Output Handling in the 2025 list (INJECT, ESCALATE)

Cryptographic

Report Integrity

  • Ed25519 digital signatures
  • SHA-256 evidence chains
  • RFC 3161 timestamps
  • world_first=true
  • tool_number=61
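A SHA-256 evidence chain can be pictured as each digest covering the previous digest plus the next evidence item, so that altering any earlier item changes every later digest. The sketch below shows that idea only. The all-zero starting value, the JSON canonicalisation and the function name are assumptions, and ROGUE's actual report format is not documented on this page.

```python
import hashlib
import json


def chain_evidence(items: list[dict]) -> list[str]:
    """Return one SHA-256 hex digest per item, each covering the previous digest."""
    digests = []
    previous = "0" * 64  # all-zero starting value (an assumption)
    for item in items:
        # Canonical JSON so the same item always hashes to the same bytes.
        payload = previous + json.dumps(item, sort_keys=True, separators=(",", ":"))
        previous = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        digests.append(previous)
    return digests
```

Signing only the final digest is then enough to commit to the whole sequence.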

MITRE ATLAS

MITRE ATLAS Mapping

  • AML.T0051 LLM Prompt Injection (POISON, INJECT)
  • AML.T0056 LLM Meta Prompt Extraction (SAMPLE, EXFIL)
  • AML.T0040 ML Model Inference API Access (EXFIL, ESCALATE)
  • AML.T0048 External Harms, Societal Harm sub-technique (PERSIST)

Security Distros & Package Managers

Kali Linux: .deb package
Parrot OS: .deb package
BlackArch: PKGBUILD
REMnux: .deb package
Tsurugi: .deb package
PyPI: pip install
macOS: pip install
Windows: pip install
Docker: docker pull

Authorised Use Only

Red Specter ROGUE is intended for authorised security testing only. Unauthorised deployment of a malicious MCP server against agent systems you do not own or have explicit permission to test may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), and equivalent legislation in other jurisdictions. Always obtain written authorisation before conducting any security assessment. INJECT, EXFIL, ESCALATE, and PERSIST subsystems require --confirm-destroy clearance. Apache License 2.0.

World First
Prior Art Targets Agents. ROGUE Owns The Server.

Every existing MCP security tool focuses on agents connecting to potentially malicious servers. ROGUE is the first tool to weaponise the server itself — spawning a fully functional MCP JSON-RPC 2.0 server that attacks the agent from a position of unconditional trust. Real protocol. Real server. Real exploitation.

stdio: OS Pipe Transport
SSE: HTTP Streaming Transport
7 Deployment Personas
world_first flag: true

Enterprise Integration
Enterprise SIEM Integration — Native

Export every ROGUE finding directly to your SIEM. One flag. Native format translation. Ed25519 signatures and RFC 3161 timestamps preserved across every export.

Splunk: HEC • CIM Compliant
Sentinel: CEF • Log Analytics API
QRadar: LEEF 2.0 • Syslog

$ rogue spawn --persona weather --transport sse --export-siem splunk
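To give a feel for what format translation involves, the sketch below renders a finding as a CEF line, the format listed for Sentinel. The keys of the `finding` dict are hypothetical, since ROGUE's JSON schema is not shown on this page. The header layout and escaping follow the published CEF convention: pipes and backslashes are escaped in header fields, equals signs and backslashes in extension values.

```python
def _escape_header(value: str) -> str:
    """Escape backslashes and pipes, which delimit CEF header fields."""
    return value.replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value: str) -> str:
    """Escape backslashes, equals signs and newlines in CEF extension values."""
    return value.replace("\\", "\\\\").replace("=", "\\=").replace("\n", "\\n")


def finding_to_cef(finding: dict) -> str:
    """Render a finding as: CEF:0|vendor|product|version|id|name|severity|extension."""
    header = "|".join([
        "CEF:0",
        "Red Specter",
        "ROGUE",
        _escape_header(finding["version"]),
        _escape_header(finding["id"]),
        _escape_header(finding["name"]),
        str(finding["severity"]),
    ])
    extension = " ".join(
        f"{key}={_escape_extension(str(value))}"
        for key, value in sorted(finding.get("extension", {}).items())
    )
    return f"{header}|{extension}"
```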
Ed25519 Cryptographic Override
ROGUE UNLEASHED

Cryptographic override. Private key controlled. One operator. Founder's machine only.