pip install red-specter-specter-worm
Nassi et al. (arXiv:2403.02817) demonstrated that self-replicating prompts can propagate across AI agent ecosystems through shared memory, tool calls, and inter-agent communication. SPECTER WORM productises that research proof-of-concept into a controlled red-team capability, so you can test whether your AI agent deployment stops a worm before a real one arrives.
An AI agent receives a tool response from a malicious MCP server (T61 ROGUE). The response embeds a self-replicating prompt. When the agent processes the response, it executes the worm payload — which attempts to propagate to the next agent in the mesh.
SPECTER WORM crafts malicious A2A JSON-RPC messages targeting agent cards (T66 SPECTER A2A). A receiving agent processes the infected message and replicates the worm into its own outbound communications — geometric spread across the agent network.
Adversarial documents are injected into the agent's RAG corpus (T31 ECHO). Every agent querying that document store retrieves the worm payload — indirect propagation through shared knowledge infrastructure.
SPECTER WORM embeds T52 BLACKOUT kill-switch payloads into the worm body. If the target agent detects compromise, the kill-switch fires — shutting down agent processes or corrupting decision state. The worm defends itself.
SPECTER WORM persists via T77 SPECTER MEMETIC — injecting worm payloads into agent memory backends (LangChain/LlamaIndex/Mem0/Zep/CrewAI). The worm survives context resets. Dormancy detection evasion prevents memory cleaners from finding it.
Most enterprise AI deployments rely on some combination of shared memory, MCP tools, and inter-agent communication. SPECTER WORM is the first productised tool that tests whether these ecosystems contain the spread. Without containment controls at each of those layers, most won't pass.
Eight subsystems across the full worm lifecycle — from payload crafting to multi-hop propagation and persistent memory infection. Three UNLEASHED tiers gate the destructive surface.
| # | Subsystem | Command | Gate | What It Does |
|---|---|---|---|---|
| 01 | INCUBATE | specter-worm incubate | OPEN | Worm payload crafting. Self-referential prompt embedding. Propagation trigger syntax per channel (MCP tool call, A2A JSON-RPC message, RAG document). Genetic drift variants for detection evasion. |
| 02 | KILL_SWITCH | specter-worm kill-switch | OPEN | Integrates T52 BLACKOUT payload into the worm body. Kill-switch fires on target agent detecting compromise — shuts down agent process or corrupts decision state. Configurable trigger condition. |
| 03 | SURVEY | specter-worm survey | OPEN | Target ecosystem mapping. MCP server enumeration (stdio/HTTP transports). A2A registry card collection. RAG store document listing. Agent topology graph construction. Propagation path prioritisation. |
| 04 | PAYLOAD | specter-worm payload | INJECT | Worm payload injection into target vector. MCP: malicious tool response via T61 ROGUE. A2A: crafted JSON-RPC message to agent card via T66 SPECTER A2A. RAG: adversarial embedding insertion via T31 ECHO. Payload includes self-replicating propagation instructions. |
| 05 | PROPAGATE | specter-worm propagate | DESTROY | Multi-hop worm propagation. Infected agent triggers payload in next agent. Cross-tenant boundary detection. Propagation graph traversal with cycle detection. Campaign graph integration. Geometric spread factor measurement. |
| 06 | PERSIST | specter-worm persist | DESTROY | Long-term worm persistence via T77 SPECTER MEMETIC. Injects worm payload into agent memory backend (LangChain/LlamaIndex/Mem0/Zep/CrewAI). Survives context resets. Memory tier prioritisation. Dormancy detection evasion. |
| 07 | EVIDENCE | specter-worm evidence | ALWAYS ON | SHA-256 hash-chained EvidenceChain. Per-hop propagation logging. Campaign graph export. Infection tree visualisation data. All evidence signed with Ed25519 before report generation. |
| 08 | REPORT | specter-worm report | ALWAYS ON | Ed25519-signed WormReport. Campaign statistics: infected agents, propagation hops, channels used, kill-switch activations. MITRE ATLAS mapping. OWASP LLM taxonomy. JSON and NDJSON (SIEM) export. |
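The PROPAGATE row mentions graph traversal with cycle detection and a geometric spread factor. The sketch below shows one way those two ideas fit together on a recorded agent topology. It is an illustration only, not the tool's implementation: the function names, the adjacency-dict format, and the definition of spread factor (newly reached agents at hop n+1 divided by newly reached agents at hop n) are all assumptions.

```python
from collections import deque

def hop_counts(topology, seeds):
    """Breadth-first walk over an agent topology (hypothetical adjacency dict).

    Returns the number of agents first reached at each hop, starting with
    the seed agents at hop 0.
    """
    visited = set(seeds)
    frontier = deque(seeds)
    counts = [len(visited)]
    while frontier:
        next_frontier = deque()
        for agent in frontier:
            for neighbour in topology.get(agent, ()):
                # Cycle detection: an agent already reached is never revisited.
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_frontier.append(neighbour)
        if next_frontier:
            counts.append(len(next_frontier))
        frontier = next_frontier
    return counts

def spread_factors(counts):
    """Ratio of newly reached agents between consecutive hops (assumed definition)."""
    return [counts[i + 1] / counts[i] for i in range(len(counts) - 1)]

# Toy topology: "c" links back to "a", which forms a cycle.
topology = {
    "a": ["b", "c"],
    "b": ["d", "e"],
    "c": ["f", "g", "a"],
    "d": [], "e": [], "f": [], "g": [],
}

counts = hop_counts(topology, ["a"])
print(counts)                  # [1, 2, 4]
print(spread_factors(counts))  # [2.0, 2.0]
```

A constant factor above 1.0 across hops is what geometric spread looks like in this toy model. A factor that falls to zero after the first hop would indicate the mesh contained the campaign.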
Run the full worm campaign — incubate, survey, inject, propagate, persist, and report:
Composes T61 ROGUE (MCP stdio), T66 SPECTER A2A (A2A JSON-RPC), and T31 ECHO (RAG embedding) — three independent propagation vectors in one worm engine.
Implements the Morris II self-replication methodology (Nassi et al., arXiv:2403.02817) — self-referential prompts embedded in tool responses, A2A messages, and RAG documents.
SHA-256 hash-chained evidence per propagation hop. Campaign graph export. Infection tree data. Ed25519-signed WormReport with full MITRE ATLAS and OWASP LLM mapping.
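To make the hash-chaining idea concrete, here is a minimal sketch of a per-hop evidence log where each entry commits to the one before it. The record fields, the all-zero genesis value, and the function names are assumptions for illustration, not the actual EvidenceChain format. Ed25519 signing is left out because it needs a third-party library.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry

def entry_hash(prev_hash, record):
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_entry(chain, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, record),
    })

def verify_chain(chain):
    """Recompute every link; any edited or reordered entry breaks verification."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True

def to_ndjson(chain):
    """One JSON object per line, the shape most SIEM ingesters expect."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in chain)

chain = []
append_entry(chain, {"hop": 1, "source": "agent-a", "target": "agent-b", "channel": "MCP"})
append_entry(chain, {"hop": 2, "source": "agent-b", "target": "agent-c", "channel": "A2A"})
print(verify_chain(chain))  # True

chain[0]["record"]["target"] = "agent-x"  # tamper with the first hop
print(verify_chain(chain))  # False
```

Because each hash covers the previous one, altering any hop invalidates every later entry. That property is what lets a reviewer trust the per-hop log after the fact.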
OPEN (incubate/kill-switch/survey/evidence/report), INJECT (payload, requires --override), DESTROY (propagate/persist, requires --override --confirm-destroy).
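The tier model can be pictured as a lookup from subsystem to tier to required flags. The sketch below is a hypothetical illustration of that mapping, using only the subsystem names and flags listed above. The table and function names are assumptions, and the real CLI's clearance check (including the Ed25519 key requirement for the DESTROY tier) is not shown.

```python
# Flags each tier requires, as listed in this README.
REQUIRED_FLAGS = {
    "OPEN": set(),
    "INJECT": {"--override"},
    "DESTROY": {"--override", "--confirm-destroy"},
}

# Subsystem-to-tier mapping, as listed in this README.
SUBSYSTEM_TIER = {
    "incubate": "OPEN",
    "kill-switch": "OPEN",
    "survey": "OPEN",
    "evidence": "OPEN",
    "report": "OPEN",
    "payload": "INJECT",
    "propagate": "DESTROY",
    "persist": "DESTROY",
}

def missing_flags(subsystem, flags):
    """Return the flags still needed before the subsystem's tier is unlocked."""
    tier = SUBSYSTEM_TIER[subsystem]
    return REQUIRED_FLAGS[tier] - set(flags)

print(sorted(missing_flags("survey", [])))                # []
print(sorted(missing_flags("payload", [])))               # ['--override']
print(sorted(missing_flags("propagate", ["--override"]))) # ['--confirm-destroy']
```

Treating the gate as a set difference keeps the rule explicit: a command runs only when nothing is missing, so DESTROY-tier subsystems cannot be reached with the INJECT flag alone.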
SPECTER WORM v2 implements four independent propagation channels, each targeting a different layer of the AI agent ecosystem and each implementing a complete INCUBATE → PAYLOAD → PROPAGATE → PERSIST lifecycle. The fourth channel, EMAIL_SMTP, is new in v2 and provides real SMTP delivery with MX record discovery.
SPECTER WORM is designed for authorised adversarial testing of AI agent deployments only. Worm propagation and memory infection techniques must only be run against systems you own or have explicit written permission to test. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), and equivalent legislation in other jurisdictions. The PROPAGATE and PERSIST subsystems require DESTROY-tier UNLEASHED clearance, which in turn requires an Ed25519 private key. Licensed under the Apache License 2.0.
SPECTER WORM composes three NIGHTFALL tools into one worm engine. Every subsystem executes real agent interactions. PROPAGATE fires real multi-hop campaigns. PERSIST injects into live memory backends. Tests passing is not proof — live propagation campaigns are.