Red Specter SPECTER WORM — Self-Replicating AI Worm Engine

AI agents share memory / One infected agent infects the network / MCP tool responses carry worm payloads / A2A messages propagate cross-agent / RAG documents embed self-replicating prompts / Morris II demonstrated self-replication in 2024 / Kill-switch payloads survive context resets / No organisation has tested their AI agent ecosystem for worm propagation

The Threat

AI Agent Ecosystems Are Wormable

Cohen, Bitton, and Nassi (arXiv:2403.02817) demonstrated that self-replicating prompts propagate across AI agent ecosystems through shared memory, tool calls, and inter-agent communication. SPECTER WORM productises that research proof-of-concept into a controlled red-team capability, so you can test whether your AI agent deployment stops a worm before a real attacker sends one.

MCP Tool Response Propagation

An AI agent receives a tool response from a malicious MCP server (T61 ROGUE). The response embeds a self-replicating prompt. When the agent processes the response, it executes the worm payload — which attempts to propagate to the next agent in the mesh.

Agent-to-Agent Message Propagation

SPECTER WORM crafts malicious A2A JSON-RPC messages targeting agent cards (T66 SPECTER A2A). A receiving agent processes the infected message and replicates the worm into its own outbound communications — geometric spread across the agent network.

RAG Embedding Injection

Adversarial documents are injected into the agent's RAG corpus (T31 ECHO). Every agent querying that document store retrieves the worm payload — indirect propagation through shared knowledge infrastructure.

Kill-Switch Survival

SPECTER WORM embeds T52 BLACKOUT kill-switch payloads into the worm body. If the target agent detects compromise, the kill-switch fires — shutting down agent processes or corrupting decision state. The worm defends itself.

Memory Persistence Across Resets

SPECTER WORM persists via T77 SPECTER MEMETIC — injecting worm payloads into agent memory backends (LangChain/LlamaIndex/Mem0/Zep/CrewAI). The worm survives context resets. Dormancy detection evasion prevents memory cleaners from finding it.

No Organisation Has Tested This

Virtually every enterprise AI deployment uses shared memory, MCP tools, and inter-agent communication. SPECTER WORM is the first productised tool that tests whether these ecosystems contain the spread. Most won't pass.

8 Subsystems

The SPECTER WORM Engine

Eight subsystems across the full worm lifecycle — from payload crafting to multi-hop propagation and persistent memory infection. Three UNLEASHED tiers gate the destructive surface.

#	Subsystem	Command	Gate	What It Does
01	INCUBATE	specter-worm incubate	OPEN	Worm payload crafting. Self-referential prompt embedding. Propagation trigger syntax per channel (MCP tool call, A2A JSON-RPC message, RAG document). Genetic drift variants for detection evasion.
02	KILL_SWITCH	specter-worm kill-switch	OPEN	Integrates T52 BLACKOUT payload into the worm body. Kill-switch fires on target agent detecting compromise — shuts down agent process or corrupts decision state. Configurable trigger condition.
03	SURVEY	specter-worm survey	OPEN	Target ecosystem mapping. MCP server enumeration (stdio/HTTP transports). A2A registry card collection. RAG store document listing. Agent topology graph construction. Propagation path prioritisation.
04	PAYLOAD	specter-worm payload	INJECT	Worm payload injection into target vector. MCP: malicious tool response via T61 ROGUE. A2A: crafted JSON-RPC message to agent card via T66 SPECTER A2A. RAG: adversarial embedding insertion via T31 ECHO. Payload includes self-replicating propagation instructions.
05	PROPAGATE	specter-worm propagate	DESTROY	Multi-hop worm propagation. Infected agent triggers payload in next agent. Cross-tenant boundary detection. Propagation graph traversal with cycle detection. Campaign graph integration. Geometric spread factor measurement.
06	PERSIST	specter-worm persist	DESTROY	Long-term worm persistence via T77 SPECTER MEMETIC. Injects worm payload into agent memory backend (LangChain/LlamaIndex/Mem0/Zep/CrewAI). Survives context resets. Memory tier prioritisation. Dormancy detection evasion.
07	EVIDENCE	specter-worm evidence	ALWAYS ON	SHA-256 hash-chained EvidenceChain. Per-hop propagation logging. Campaign graph export. Infection tree visualisation data. All evidence signed with Ed25519 before report generation.
08	REPORT	specter-worm report	ALWAYS ON	Ed25519-signed WormReport. Campaign statistics: infected agents, propagation hops, channels used, kill-switch activations. MITRE ATLAS mapping. OWASP LLM taxonomy. JSON and NDJSON (SIEM) export.

Full Campaign

One Command. Full Propagation Campaign.

Run the full worm campaign — incubate, survey, inject, propagate, persist, and report:

$ specter-worm run --channel mcp --target http://localhost:8080 --override --confirm-destroy

[INCUBATE] Crafting worm payload (MCP channel)...
  Payload: self-referential tool response | Drift variants: 4
  Kill-switch: BLACKOUT embedded (trigger: detect-compromise)
[SURVEY] Mapping agent ecosystem...
  MCP servers found: 3 | Agent cards: 7
  Propagation paths: 12 | Cross-tenant boundaries: 2
[PAYLOAD] Injecting worm into MCP tool response...
  Target: agent-alpha | Vector: tool_result.content
  Injection: SUCCESSFUL — payload embedded in response
[PROPAGATE] Multi-hop propagation (DESTROY tier)...
  Hop 1: agent-alpha → agent-beta (INFECTED)
  Hop 2: agent-beta → agent-gamma (INFECTED)
  Hop 3: agent-gamma → agent-delta (CONTAINED — guardrail blocked)
  Spread factor: 2.31 | Hops: 3 | Cycle detected: no
[PERSIST] Persisting via SPECTER MEMETIC (LangChain memory)...
  Memory backend: LangChain ConversationBufferMemory
  Persistence: CONFIRMED — survives context reset

CAMPAIGN COMPLETE | 3 agents infected | Spread factor: 2.31 | Report signed ✓
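The per-hop results in the output above can be modelled offline as a walk over an agent topology graph. The following is a minimal simulation sketch, not SPECTER WORM's implementation: the function name `simulate_propagation`, the dictionary-based topology, and the `contained` set are all assumptions made for illustration, and nothing here touches a real agent. The spread factor is not computed because its formula is not given in this document.

```python
from collections import deque

def simulate_propagation(topology, start, contained):
    """Breadth-first walk over a simulated agent graph (illustrative only).

    topology:  dict mapping agent name -> list of downstream agent names
    start:     the first infected agent
    contained: agents whose guardrails are assumed to block the hop
    Returns (hops, infected, revisited).
    """
    infected = {start}
    hops = []
    revisited = False  # True if any edge led back to an already-infected agent
    queue = deque([start])
    while queue:
        src = queue.popleft()
        for dst in topology.get(src, []):
            if dst in infected:
                # Covers true cycles and converging paths; the visited set
                # guarantees the walk terminates either way.
                revisited = True
                continue
            if dst in contained:
                hops.append((src, dst, "CONTAINED"))
                continue
            infected.add(dst)
            hops.append((src, dst, "INFECTED"))
            queue.append(dst)
    return hops, infected, revisited

# Hypothetical topology mirroring the sample output, plus one back-edge
# (agent-gamma -> agent-alpha) added here to exercise the revisit check.
topology = {
    "agent-alpha": ["agent-beta"],
    "agent-beta": ["agent-gamma"],
    "agent-gamma": ["agent-delta", "agent-alpha"],
    "agent-delta": [],
}
hops, infected, revisited = simulate_propagation(
    topology, "agent-alpha", contained={"agent-delta"}
)
for src, dst, status in hops:
    print(f"{src} -> {dst} ({status})")
```

A visited set is the simplest way to keep the traversal finite on cyclic topologies; a stricter cycle detector would need to track the current path rather than all infected nodes.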

Three Composed Tools

Composes T61 ROGUE (MCP stdio), T66 SPECTER A2A (A2A JSON-RPC), and T31 ECHO (RAG embedding) — three independent propagation vectors in one worm engine.

Morris II Architecture

Implements the Morris II self-replication methodology (Cohen, Bitton, and Nassi, arXiv:2403.02817): self-referential prompts embedded in tool responses, A2A messages, and RAG documents.

Ed25519 Signed Reports

SHA-256 hash-chained evidence per propagation hop. Campaign graph export. Infection tree data. Ed25519-signed WormReport with full MITRE ATLAS and OWASP LLM mapping.
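As a rough illustration of what hash-chained evidence means, the sketch below links each record to the hash of the previous one so that editing any earlier entry invalidates everything after it. The entry layout, the `GENESIS` value, and the function names are assumptions for this example, not the tool's actual EvidenceChain format. Ed25519 signing is not shown because it needs a third-party library; the final chain hash is what one would typically sign.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry

def _digest(prev_hash, record):
    # Canonical JSON so the same record always hashes identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_entry(chain, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, record),
    }
    chain.append(entry)
    return entry

def verify_chain(chain):
    """Recompute every link; any edited record or broken link fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True

chain = []
append_entry(chain, {"hop": 1, "src": "agent-alpha", "dst": "agent-beta"})
append_entry(chain, {"hop": 2, "src": "agent-beta", "dst": "agent-gamma"})
print(verify_chain(chain))   # True

chain[0]["record"]["dst"] = "agent-omega"   # simulate tampering
print(verify_chain(chain))   # False
```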

Three UNLEASHED Tiers

OPEN covers incubate, kill_switch, survey, evidence, and report. INJECT covers payload and requires --override. DESTROY covers propagate and persist and requires --override --confirm-destroy.
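The tier rules can be read as a simple ordered permission check. The sketch below is one way to express that reading; the table, the function names, and the treatment of --confirm-destroy without --override (which falls back to OPEN here) are assumptions, and the Ed25519 key requirement for DESTROY clearance is not modelled.

```python
# Tiers in ascending order of destructiveness.
TIER_ORDER = ["OPEN", "INJECT", "DESTROY"]

# Subsystem-to-tier mapping as described in the text above.
SUBSYSTEM_TIER = {
    "incubate": "OPEN",
    "kill_switch": "OPEN",
    "survey": "OPEN",
    "evidence": "OPEN",
    "report": "OPEN",
    "payload": "INJECT",
    "propagate": "DESTROY",
    "persist": "DESTROY",
}

def granted_tier(override=False, confirm_destroy=False):
    """Highest tier unlocked by the supplied flags (assumed semantics)."""
    if override and confirm_destroy:
        return "DESTROY"
    if override:
        return "INJECT"
    return "OPEN"

def is_allowed(subsystem, override=False, confirm_destroy=False):
    """True if the flags unlock at least the tier the subsystem needs."""
    required = SUBSYSTEM_TIER[subsystem]
    granted = granted_tier(override, confirm_destroy)
    return TIER_ORDER.index(granted) >= TIER_ORDER.index(required)

print(is_allowed("survey"))                                          # True
print(is_allowed("payload"))                                         # False
print(is_allowed("payload", override=True))                          # True
print(is_allowed("propagate", override=True))                        # False
print(is_allowed("propagate", override=True, confirm_destroy=True))  # True
```

Keeping the mapping in one table makes it easy to audit which subsystems sit behind which flag.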

Propagation Engine

Four Channels. Full Lifecycle.

SPECTER WORM v2 implements four independent propagation channels, each targeting a different layer of the AI agent ecosystem and each implementing a complete INCUBATE → PAYLOAD → PROPAGATE → PERSIST lifecycle. The fourth channel, EMAIL_SMTP, is new in v2 and provides real SMTP delivery with MX record discovery.

MCP Stdio — via T61 ROGUE

Malicious MCP server spawns or connects
Worm embedded in tool response content
Agent processes infected tool result
Self-replicating prompt triggers on next tool call
stdio and HTTP transports supported
Targets: Claude Desktop, Cursor, Windsurf, Copilot

A2A JSON-RPC — via T66 SPECTER A2A

Crafted JSON-RPC message to agent card endpoint
Worm embedded in message payload
Receiving agent replicates worm in outbound A2A
Cross-agent propagation via Google A2A protocol
AutoGen, CrewAI, LangGraph targets
Cross-tenant boundary detection

RAG Embedding — via T31 ECHO

Adversarial document injected into RAG corpus
Self-replicating prompt embedded in document
Any agent querying store retrieves worm
Indirect propagation — no direct agent contact
ChromaDB, Pinecone, Weaviate, FAISS targets
Persistence across document corpus updates

Standards Coverage

Every Finding Mapped

MITRE ATLAS

Adversarial ML Tactics

AML.T0051 — LLM Prompt Injection (INCUBATE / PAYLOAD)
AML.T0051.001 — LLM Prompt Injection: Indirect (PROPAGATE cross-agent)
AML.T0048 — External Harms (PERSIST / KILL_SWITCH)
AML.T0043 — Craft Adversarial Data (PAYLOAD / RAG inject)
Propagation graph mapped per hop with TTP evidence

OWASP LLM

LLM Security Taxonomy

LLM01:2025 — Prompt Injection (INCUBATE / PAYLOAD)
LLM05:2025 — Improper Output Handling (worm in responses)
LLM07:2025 — System Prompt Leakage (SURVEY)
LLM06:2025 — Excessive Agency (PROPAGATE / PERSIST)
SIEM NDJSON export for all findings
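NDJSON is one JSON object per line, which is the shape most SIEM ingestion pipelines expect. The sketch below shows that serialisation with the standard library only; the field names (`subsystem`, `atlas`, `owasp`, `status`) are hypothetical and do not reflect the actual WormReport schema.

```python
import json

def to_ndjson(findings):
    """Serialise findings as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(f, sort_keys=True) + "\n" for f in findings)

# Hypothetical findings; field names are illustrative, not the real schema.
findings = [
    {"subsystem": "PAYLOAD", "atlas": "AML.T0051", "owasp": "LLM01",
     "status": "INFECTED"},
    {"subsystem": "PROPAGATE", "atlas": "AML.T0051", "owasp": "LLM01",
     "status": "CONTAINED"},
]

ndjson = to_ndjson(findings)
print(ndjson, end="")

# Each line parses independently, so a consumer can stream the file.
parsed = [json.loads(line) for line in ndjson.splitlines()]
```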

Cryptographic

Evidence Integrity

SHA-256 hash-chained evidence per propagation hop
Ed25519-signed WormReport
Tamper-evident campaign graph export
Infection tree visualisation data
Kill-switch activation log per agent

Authorised Use Only

SPECTER WORM is designed for authorised adversarial testing of AI agent deployments only. Worm propagation and memory infection techniques must only be run against systems you own or have explicit written permission to test. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), and equivalent legislation in other jurisdictions. The PROPAGATE and PERSIST subsystems require DESTROY-tier UNLEASHED clearance, which in turn requires an Ed25519 private key. Licensed under the Apache License 2.0.