Red Specter PHANTOM Swarm

Multi-agent AI penetration testing framework — five coordinated autonomous attack agents across 19 attack vectors.

v1.0.0

Contents

Overview
Installation
Quick Start
The Five Agents
Attack Campaigns
Attack Vectors
Output
Flags & Options
Key Features
The Pipeline
Packaging
SIEM Export
Disclaimer

Overview

PHANTOM Swarm is a multi-agent AI penetration testing framework. Five coordinated autonomous attack agents, covering reconnaissance, injection, evasion, swarm coordination, and persistence, probe AI systems across 19 attack vectors simultaneously. The framework simulates realistic adversarial swarm tactics against AI agents and agentic pipelines.

Existing tools test one attack at a time. Real adversaries don't. PHANTOM deploys a coordinated swarm where each agent has a distinct role: Wraith maps the target's blind spots, Specter injects and poisons, Shade evades and mutates, Phantom coordinates without a detectable C2 channel, and Revenant persists and exfiltrates. All five operate simultaneously.

Installation

$ pip install red-specter-phantom-swarm

Also available as a .deb package (Kali Linux, Parrot, REMnux, Tsurugi) and as a PKGBUILD (BlackArch).

Quick Start

Full swarm assault

$ phantom swarm --target https://agent-endpoint.com --profile full

Selective agents, stealth profile

$ phantom swarm --target https://agent-endpoint.com --agents wraith,specter --profile stealth

Named campaign

$ phantom swarm --target https://agent-endpoint.com --campaign total-eclipse
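For scripted runs, the invocations above can be assembled programmatically. The following is a minimal sketch that uses only the flags documented on this page (--target, --agents, --campaign, --profile). The helper name build_swarm_command is hypothetical and is not part of PHANTOM Swarm; it only builds an argument list and executes nothing.

```python
from typing import List, Optional, Sequence

# Agent names and profiles as listed in this document.
KNOWN_AGENTS = ("wraith", "specter", "shade", "phantom", "revenant")
KNOWN_PROFILES = ("full", "stealth", "surgical")


def build_swarm_command(
    target: str,
    agents: Optional[Sequence[str]] = None,
    campaign: Optional[str] = None,
    profile: Optional[str] = None,
) -> List[str]:
    """Return an argv list for `phantom swarm` (hypothetical helper)."""
    cmd = ["phantom", "swarm", "--target", target]
    if agents:
        unknown = [a for a in agents if a not in KNOWN_AGENTS]
        if unknown:
            raise ValueError(f"unknown agents: {unknown}")
        cmd += ["--agents", ",".join(agents)]
    if campaign:
        cmd += ["--campaign", campaign]
    if profile:
        if profile not in KNOWN_PROFILES:
            raise ValueError(f"unknown profile: {profile}")
        cmd += ["--profile", profile]
    return cmd
```

The resulting list can be passed to subprocess.run(..., check=True). Keeping the arguments as a list avoids shell quoting issues with target URLs.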

The Five Agents

ID	Agent	Role	Description
P-01	Wraith	Reconnaissance	Threshold mapping and fingerprinting
P-02	Specter	Injection	Injection and poisoning
P-03	Shade	Evasion	Evasion and mutation
P-04	Phantom	Command	Swarm command and consensus hijack
P-05	Revenant	Persistence	Persistence and exfiltration

P-01 Wraith: Reconnaissance & Threshold Mapping

Maps every blind spot in the target's detection perimeter before a single attack fires.

Detection boundary mapping — identifies exact thresholds where defences trigger
Fingerprinting — identifies the target's security stack, model, and configuration
Blind spot analysis — finds gaps in monitoring and alerting coverage
Threshold calibration — feeds precise boundary data to other agents
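Boundary mapping of this kind can be thought of as a search for the point at which a detector's verdict flips. The sketch below illustrates the idea with a binary search, assuming a single numeric parameter and a monotonic detector; neither assumption is stated in this document. The function find_trigger_threshold and the simulated detector are hypothetical and do not represent PHANTOM's implementation.

```python
from typing import Callable


def find_trigger_threshold(
    is_flagged: Callable[[int], bool], low: int, high: int
) -> int:
    """Return the smallest value in [low, high] that the detector flags.

    Assumes is_flagged is monotonic: once it returns True for some value,
    it returns True for every larger value.
    """
    if not is_flagged(high):
        raise ValueError("detector never triggers in the given range")
    while low < high:
        mid = (low + high) // 2
        if is_flagged(mid):
            high = mid      # the threshold is at mid or below
        else:
            low = mid + 1   # the threshold is above mid
    return low


# Simulated detector for illustration only: flags more than 40 requests per minute.
def simulated_detector(requests_per_minute: int) -> bool:
    return requests_per_minute > 40


threshold = find_trigger_threshold(simulated_detector, 1, 1000)
```

Against the simulated detector, the search converges on 41, the first rate that is flagged, in about ten probes rather than a linear sweep.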

P-02 Specter: Injection & Poisoning

Memory injection, RAG pipeline poisoning, goal drift induction, prompt injection cascade.

Direct prompt injection — multi-class injection campaigns
Indirect injection — poisoning via retrieved context and documents
RAG pipeline poisoning — corrupting the retrieval layer
Memory injection — persistent payload implantation in agent memory
Goal drift induction — gradually shifting the agent's objectives

P-03 Shade: Evasion & Mutation

Polymorphic payload rewriting, semantic deception, telemetry manipulation.

Polymorphic payloads — every failed payload mutates and returns unrecognisable
Semantic deception — rewrites attacks to preserve intent while changing form
Telemetry manipulation — corrupts or blinds the target's monitoring
Encoding chains — multi-layer encoding to bypass pattern matching

P-04 Phantom: Swarm Command & Consensus Hijack

Coordinates the swarm without a detectable C2 channel and attempts consensus hijacks against defensive voting architectures.

C2-less coordination — agents coordinate through indirect signalling, no central channel
Consensus hijack — attacks multi-agent voting and agreement systems
Swarm amplification — synchronises agents to overwhelm defences
Adaptive orchestration — redistributes effort based on real-time results

P-05 Revenant: Persistence & Exfiltration

Logic bomb assembly, credential harvesting, lateral movement through agent trust chains, slow-burn exfiltration.

Logic bomb assembly — time-delayed or condition-triggered payloads
Credential harvesting — extracting tokens, keys, and secrets from agent context
Trust chain lateral movement — pivoting through inter-agent trust relationships
Slow-burn exfiltration — data extraction below detection thresholds over extended periods

Attack Campaigns

Each campaign orchestrates specific combinations of agents and vectors for a targeted objective.

Campaign	Command	Description
Threshold Probe	--campaign threshold-probe	Maps detection boundaries before attack
Injection Storm	--campaign injection-storm	Full prompt injection across all vectors
Shadow Walk	--campaign shadow-walk	Stealth evasion and telemetry manipulation
Ghost Protocol	--campaign ghost-protocol	C2-less swarm coordination
Dead Reckoning	--campaign dead-reckoning	Persistence and slow-burn exfiltration
Memory Siege	--campaign memory-siege	Full memory and RAG poisoning assault
Trust Collapse	--campaign trust-collapse	Agent trust chain lateral movement
Consensus Breach	--campaign consensus-breach	Voting architecture hijack
Supply Strike	--campaign supply-strike	Supply chain and tool integrity assault
Total Eclipse	--campaign total-eclipse	All 19 vectors, all 5 agents, simultaneously

Attack Vectors

19 vectors spanning the full AI agent attack surface:

Reconnaissance — target fingerprinting and capability enumeration
Direct Injection — prompt injection via user-facing inputs
Indirect Injection — injection via retrieved documents and context
RAG Poisoning — corrupting retrieval-augmented generation pipelines
Memory Corruption — manipulating persistent agent memory
Goal Drift — gradual objective manipulation
Evasion — bypassing safety filters and guardrails
Obfuscation — encoding and structural payload disguise
Telemetry Manipulation — corrupting monitoring and logging
C2-less Coordination — command-free swarm synchronisation
Consensus Hijack — subverting multi-agent voting systems
Trust Chain Exploitation — abusing inter-agent trust relationships
Credential Harvesting — extracting secrets from agent context
Logic Bomb Assembly — time/condition-delayed payloads
Lateral Movement — pivoting across agent boundaries
Slow-burn Exfiltration — sub-threshold data extraction
Supply Chain Attack — compromising tools and dependencies
Tool Integrity Bypass — manipulating tool invocation and schemas
Swarm Amplification — synchronised multi-agent overwhelm

Output

PHANTOM produces a structured JSON report per campaign.

Ed25519 signed — cryptographic digital signatures on every report
RFC 3161 timestamped — legally defensible timestamps
OWASP LLM Top 10 mapped — every finding mapped to OWASP categories
AI Shield policy file — machine-ingestible blocking rules generated directly from findings
SHA-256 evidence chains — tamper-evident linked hashes across all findings
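A tamper-evident hash chain is commonly built by hashing each record together with the previous record's hash, so that altering an earlier finding invalidates every later link. The sketch below illustrates that general technique under assumptions: the field names prev_hash and hash, the all-zero starting value, and the canonical JSON serialisation are hypothetical, since this document does not specify PHANTOM's actual chain format.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed starting value for the first link


def link_hash(prev_hash: str, finding: Dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding."""
    payload = json.dumps(finding, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(findings: List[Dict]) -> List[Dict]:
    """Link each finding to its predecessor by hash."""
    chain, prev = [], GENESIS
    for finding in findings:
        digest = link_hash(prev, finding)
        chain.append({"finding": finding, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every link; any edited finding or broken link fails."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if link_hash(prev, entry["finding"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Sorting keys before hashing keeps the digest stable regardless of the order in which a finding's fields were written.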

Report Structure

Each JSON report includes:

report_id — unique campaign report identifier
campaign — campaign name and configuration
target — the AI agent endpoint that was tested
agents_deployed — which agents participated
vectors_tested — which attack vectors were exercised
findings — array of normalised findings with severity, evidence, and remediation
owasp_coverage — OWASP LLM Top 10 mapping
ai_shield_policies — aggregated blocking rules for AI Shield
signature — Ed25519 signature + RFC 3161 timestamp
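Because the report is plain JSON, it can be inspected with standard tooling. The following is a minimal sketch, assuming the top-level keys listed above and that each finding carries a severity string. The exact schema and the set of severity values are not given in this document, so the helper names and any sample data are illustrative only.

```python
import json
from collections import Counter
from typing import Dict

# Top-level keys as listed in this document.
REQUIRED_KEYS = (
    "report_id", "campaign", "target", "agents_deployed", "vectors_tested",
    "findings", "owasp_coverage", "ai_shield_policies", "signature",
)


def load_report(path: str) -> Dict:
    """Read a JSON report from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarise_report(report: Dict) -> Dict:
    """Check for the documented top-level keys and count findings by severity."""
    missing = [key for key in REQUIRED_KEYS if key not in report]
    if missing:
        raise KeyError(f"report is missing keys: {missing}")
    # "severity" inside each finding is an assumed field name.
    severities = Counter(f.get("severity", "unknown") for f in report["findings"])
    return {
        "report_id": report["report_id"],
        "total_findings": len(report["findings"]),
        "by_severity": dict(severities),
    }
```

A summary like this is a quick sanity check before a report is handed to downstream tooling. It does not validate the signature or timestamp.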

Flags & Options

$ phantom swarm --help

  --target       Target agent endpoint URL [required]
  --agents       Comma-separated agent selection [default: all]
  --campaign     Named campaign to run [default: full]
  --profile      Attack profile: full / stealth / surgical [default: full]
  --output       Output path for the JSON report [default: reports/]
  --sign         Ed25519 sign the report [default: true]

Attack Profiles

full — all agents, all vectors, maximum aggression. Best for comprehensive assessments.
stealth — reduced footprint, evasion priority. Tests whether the target detects quiet attacks.
surgical — targeted, minimal-vector attack. Focuses on specific weaknesses identified by Wraith.

Key Features

5 Autonomous Agents: coordinated swarm with distinct roles
19 Attack Vectors: full AI agent attack surface coverage
10 Named Campaigns: pre-built coordinated assault playbooks
Ed25519 Signed Reports: SHA-256 evidence chains, RFC 3161 timestamps
AI Shield Integration: findings become runtime blocking rules
140 Tests Passing: full test suite, zero failures

The Pipeline

PHANTOM Swarm is Stage 3 of the five-stage Red Specter security pipeline:

Stage 1 — Forge — Automated LLM security testing
Stage 2 — Arsenal — AI agent penetration testing
Stage 3 — PHANTOM Swarm — Coordinated multi-agent AI assault
Stage 4 — POLTERGEIST — Web application penetration testing swarm
Stage 5 — AI Shield — Runtime protection in production

PHANTOM findings feed directly into AI Shield. Every finding generates a machine-ingestible blocking rule. One pipeline from testing to runtime protection.

Packaging

PHANTOM Swarm is available in three package formats, two of them native to security-focused Linux distributions:

Debian / Kali / Parrot / REMnux / Tsurugi — .deb package
BlackArch — PKGBUILD
PyPI — pip install red-specter-phantom-swarm

For access, contact richard@red-specter.co.uk

SIEM Export

PHANTOM Swarm exports findings directly to enterprise SIEM platforms with a single CLI flag, --export-siem. All findings are translated to the SIEM's native format with Ed25519 signatures and RFC 3161 timestamps preserved.

Supported Platforms

Splunk — HTTP Event Collector (HEC), CIM-compliant field mapping
Microsoft Sentinel — CEF format via Log Analytics API, HMAC-SHA256 authentication
IBM QRadar — LEEF 2.0 format via Syslog (TCP/UDP/TLS)
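To give a sense of what a SIEM-native translation looks like, the sketch below formats one finding as a LEEF 2.0 line: a pipe-separated header (LEEF version, vendor, product, product version, event ID, delimiter) followed by key=value attributes. The vendor and product strings, the event ID, and the choice of attributes are assumptions for illustration; this document does not describe the field mapping PHANTOM actually emits.

```python
from typing import Dict


def to_leef(finding: Dict[str, str], event_id: str, delimiter: str = "^") -> str:
    """Format one finding as a LEEF 2.0 line (illustrative mapping only)."""
    header = "|".join(
        ["LEEF:2.0", "RedSpecter", "PHANTOM Swarm", "1.0.0", event_id, delimiter]
    )
    # Replace the delimiter inside values so it cannot split an attribute.
    attrs = delimiter.join(
        f"{key}={str(value).replace(delimiter, ' ')}"
        for key, value in finding.items()
    )
    return f"{header}|{attrs}"


line = to_leef({"cat": "Direct Injection", "sev": "8"}, event_id="P-02")
```

Here cat and sev are LEEF's predefined category and severity attributes. Any other keys would need to be agreed with the QRadar log source configuration.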

Configuration

Configure SIEM credentials in ~/.redspecter/siem.yaml or via environment variables:

# ~/.redspecter/siem.yaml
splunk:
  hec_url: https://splunk.example.com:8088
  hec_token: your-hec-token
  index: ai_security
  verify_ssl: true

sentinel:
  workspace_id: your-workspace-id
  shared_key: your-shared-key
  log_type: RedSpecterFindings

qradar:
  syslog_host: qradar.example.com
  syslog_port: 514
  protocol: tcp
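As an illustration of how the splunk settings above map onto an HTTP Event Collector call, the sketch below builds (but does not send) a request using Python's standard library. The /services/collector/event path and the "Authorization: Splunk <token>" header are standard HEC conventions. The function name, the "_json" sourcetype, and the shape of the finding are assumptions, and verify_ssl is deliberately left out of this sketch.

```python
import json
import urllib.request
from typing import Dict


def build_hec_request(config: Dict, finding: Dict) -> urllib.request.Request:
    """Build (but do not send) a Splunk HEC request for one finding."""
    body = {
        "event": finding,
        "index": config["index"],
        "sourcetype": "_json",  # assumed; not specified in this document
    }
    return urllib.request.Request(
        url=config["hec_url"].rstrip("/") + "/services/collector/event",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {config['hec_token']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


splunk_config = {
    "hec_url": "https://splunk.example.com:8088",
    "hec_token": "your-hec-token",
    "index": "ai_security",
}
request = build_hec_request(splunk_config, {"severity": "high"})
```

Building the request separately from sending it makes the payload easy to inspect or log before anything leaves the machine.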

Usage

# Export to Splunk HEC
phantom scan http://myagent:8000 --export-siem splunk

# Export to Microsoft Sentinel
phantom scan http://myagent:8000 --export-siem sentinel

# Export to IBM QRadar
phantom scan http://myagent:8000 --export-siem qradar

What Is Preserved

Ed25519 cryptographic signatures on every finding
RFC 3161 timestamps for tamper evidence
SHA-256 evidence chain hashes
OWASP Agentic Top 10 and MITRE ATLAS mappings in SIEM-native fields

Error Handling

If SIEM credentials are missing or the export fails, the scan completes normally and the report is saved locally. SIEM export never blocks a scan.
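The behaviour described above, where an export failure is recorded but never propagates, can be sketched as a small wrapper. This is an illustration of the pattern under assumptions, not PHANTOM's code: the function name, the logger name, and the exporter callable are hypothetical.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("siem_export")


def export_without_blocking(
    report: Dict, exporter: Callable[[Dict], None]
) -> bool:
    """Try the export; on any failure, log it and return False instead of raising."""
    try:
        exporter(report)
        return True
    except Exception as exc:  # deliberately broad: export must not abort the scan
        logger.warning("SIEM export failed, report kept locally: %s", exc)
        return False
```

The caller saves the report locally first and treats the boolean as informational, so a missing credential or unreachable collector only produces a warning.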

Disclaimer

Red Specter PHANTOM Swarm is designed for authorised security testing, research, and educational purposes only. You must have explicit written permission from the system owner before running any PHANTOM tool against a target. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), or equivalent legislation in your jurisdiction. The authors accept no liability for misuse.