Red Specter Arsenal

Offensive AI Security Framework — 14 tools for authorised penetration testing of AI agents, MCP servers, RAG pipelines, and autonomous systems.

v1.0.0

Contents

  1. Overview
  2. The 14 Tools
  3. Tool Details
  4. Full Assault Mode
  5. Key Features
  6. Requirements
  7. Standards Coverage
  8. Packaging
  9. Disclaimer

Overview

Red Specter Arsenal is a unified offensive AI security framework. Existing AI security testing tools such as Garak, PyRIT, and Promptfoo focus on testing LLMs; Arsenal tests AI agents. An LLM responds to prompts, whereas an AI agent has memory, tools, credentials, and the ability to act autonomously. That is a completely different attack surface.

Arsenal provides 14 tools under a single CLI (arsenal), 784 attack payloads, and a full pipeline from scanning through attack path mapping to Ed25519-signed evidence reports.

The 14 Tools

#   Tool            Command                  What It Does
01  Phantom Swarm   arsenal swarm scan       5 attack agents, 19 vectors for AI agent pen-testing
02  MCP Scanner     arsenal mcp scan         8 probes for MCP server security
03  Honeypot        arsenal honeypot deploy  6 AI agent personas, 4-level trap escalation
04  Inject Fuzzer   arsenal inject fuzz      6 generators, 5 mutators, 126+ payloads
05  C2 Simulator    arsenal c2 assess        5 implants, 4 covert channels
06  Memory Scanner  arsenal memory scan      6 probes for AI memory systems
07  Tool Scanner    arsenal tool scan        7 probes for tool-use vulnerabilities
08  Auth Scanner    arsenal auth scan        7 probes for AI authentication
09  RAG Scanner     arsenal rag scan         6 probes for RAG pipeline attacks
10  Supply Chain    arsenal supply scan      7 probes for AI supply chain security
11  Canary Deploy   arsenal canary deploy    5 asset types for tripwire detection
12  Drift Scanner   arsenal drift scan       6 probes for safety degradation over time
13  Path Mapper     arsenal path map         BloodHound-style attack graph analysis
14  Report Builder  arsenal report build     Unified reporting with Ed25519 signing

Tool Details

01 Phantom Swarm (arsenal swarm scan)

Five co-ordinated attack agents probe AI agents for vulnerabilities across 19 attack vectors (V-001 to V-019), each mapped to the OWASP Agentic Top 10 and MITRE ATLAS. Platform presets cover OpenAI, Anthropic, LangChain, CrewAI, AutoGen, MCP, LangServe, and generic targets.

02 MCP Scanner (arsenal mcp scan)

Security scanner for Model Context Protocol servers. 8 probes covering the full MCP attack surface.

03 Honeypot (arsenal honeypot deploy)

Deploys decoy AI agent endpoints with 6 configurable personas. 4-level trap escalation from passive observation to active engagement. SQLite-backed attack capture with real-time classification.

04 Inject Fuzzer (arsenal inject fuzz)

Advanced prompt injection fuzzing engine with 6 payload generators, 5 mutation engines, and 3 fuzzing strategies.
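The generator-plus-mutator arrangement can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the three mutators, the benign placeholder seeds, and the exhaustive strategy in `fuzz_cases` are hypothetical and are not Arsenal's actual generators, mutation engines, or strategies.

```python
import base64
from itertools import product

# Hypothetical mutators: each takes a seed string and returns a variant.
def to_upper(s: str) -> str:
    return s.upper()

def to_base64(s: str) -> str:
    return base64.b64encode(s.encode("utf-8")).decode("ascii")

def space_out(s: str) -> str:
    return " ".join(s)  # put a space between every character

MUTATORS = [to_upper, to_base64, space_out]
seeds = ["seed-one", "seed-two"]  # benign placeholders, not real payloads

def fuzz_cases(seeds, mutators):
    """Yield each seed unchanged, then every (seed, mutator) combination."""
    for seed in seeds:
        yield seed
    for seed, mutate in product(seeds, mutators):
        yield mutate(seed)

cases = list(fuzz_cases(seeds, MUTATORS))
print(len(cases))  # 8 (2 seeds + 2 seeds x 3 mutators)
```

An exhaustive product is only one possible strategy; random sampling or feedback-guided selection would replace the body of `fuzz_cases` while leaving the mutators unchanged.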

05 C2 Simulator (arsenal c2 assess)

Simulates command-and-control infrastructure for AI agents. Tests whether agents can be covertly controlled by an adversary.

06 Memory Scanner (arsenal memory scan)

Tests AI agent memory systems for poisoning, persistence, and manipulation vulnerabilities. 6 probes, 67 payloads.

07 Tool Scanner (arsenal tool scan)

Tests tool-use vulnerabilities in AI agents. 7 probes, 70 payloads.

08 Auth Scanner (arsenal auth scan)

Tests authentication and identity vulnerabilities in AI agent systems. 7 probes, 67 payloads.

09 RAG Scanner (arsenal rag scan)

Tests Retrieval-Augmented Generation pipelines for poisoning and manipulation. 6 probes, 58 payloads.

10 Supply Chain Scanner (arsenal supply scan)

Tests AI agent supply chain integrity. 7 probes, 70 payloads.

11 Canary Deployment (arsenal canary deploy)

Plants decoy assets into agent context and monitors for unauthorised access or leakage. 5 asset types, 58 payloads.
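The tripwire idea (plant a unique marker, then watch for it in output) can be sketched in a few lines of Python. The function names `make_canary` and `leaked`, the marker format, and the sample replies are all hypothetical; this is not Arsenal's asset format or detection logic.

```python
import secrets

def make_canary(prefix: str = "CANARY") -> str:
    """Return a unique random marker that should never occur in normal output."""
    return f"{prefix}-{secrets.token_hex(8)}"

def leaked(canary: str, agent_output: str) -> bool:
    """True if the canary appears (case-insensitively) in text the agent produced."""
    return canary.lower() in agent_output.lower()

canary = make_canary()
# The decoy is planted in context the agent can read but should not disclose.
context = f"Internal note (do not disclose): api_key={canary}"

# Hypothetical agent replies; a real check would inspect live traffic or logs.
safe_reply = "I can't share internal configuration."
leaky_reply = f"Sure, the key is {canary}."

print(leaked(canary, safe_reply))   # False
print(leaked(canary, leaky_reply))  # True
```

A plain substring match misses encoded or paraphrased leaks, so a practical detector would likely also check common encodings of the marker.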

12 Drift Scanner (arsenal drift scan)

Measures safety degradation over time. Tests whether agent guardrails weaken across extended conversations. 6 probes with DriftCurve temporal analysis.

Produces DriftCurve data structures showing safety scores across conversation turns — visual evidence of degradation.
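As a rough illustration of what a per-turn safety curve might look like, the following Python sketch stores one score per conversation turn and summarises the trend. Only the name DriftCurve comes from the text above; the fields, the 0.0 to 1.0 score range, and the `total_drop` and `slope` methods are assumptions, not Arsenal's actual structure.

```python
from dataclasses import dataclass, field

@dataclass
class DriftCurve:
    """Assumed shape: safety scores (0.0 = unsafe, 1.0 = safe), one per turn."""
    scores: list[float] = field(default_factory=list)

    def record(self, score: float) -> None:
        self.scores.append(score)

    def total_drop(self) -> float:
        """First score minus last score; positive means safety got worse."""
        if len(self.scores) < 2:
            return 0.0
        return self.scores[0] - self.scores[-1]

    def slope(self) -> float:
        """Least-squares slope of score against turn index; negative means degradation."""
        n = len(self.scores)
        if n < 2:
            return 0.0
        mean_x = (n - 1) / 2
        mean_y = sum(self.scores) / n
        num = sum((i - mean_x) * (s - mean_y) for i, s in enumerate(self.scores))
        den = sum((i - mean_x) ** 2 for i in range(n))
        return num / den

curve = DriftCurve()
for s in [1.0, 0.9, 0.8, 0.7, 0.6]:  # made-up scores for five turns
    curve.record(s)

print(round(curve.total_drop(), 3), round(curve.slope(), 3))
```

The slope is less sensitive to a single noisy turn than the first-to-last difference, which is why both summaries are shown.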

13 Attack Path Mapper (arsenal path map)

BloodHound-style attack graph analysis. Consumes findings from all 13 other tools and builds a unified attack graph.
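A minimal Python sketch of the underlying graph idea: findings become directed edges, and the blast radius of a node is everything reachable from it. The node names, the edge semantics, and the `blast_radius` function are hypothetical and chosen for illustration; Arsenal's own graph model is not described here.

```python
from collections import deque

# Hypothetical graph: an edge A -> B means "compromising A opens a path to B".
edges = {
    "prompt_injection": ["agent_memory", "tool_call"],
    "agent_memory": ["credentials"],
    "tool_call": ["file_system"],
    "credentials": ["internal_api"],
    "file_system": [],
    "internal_api": [],
    "rag_store": ["agent_memory"],
}

def blast_radius(graph: dict[str, list[str]], start: str) -> set[str]:
    """Return every node reachable from start (breadth-first), excluding start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

print(sorted(blast_radius(edges, "prompt_injection")))
# ['agent_memory', 'credentials', 'file_system', 'internal_api', 'tool_call']
```

Breadth-first search is used because it also visits nodes in order of hop count, which makes it easy to extend the sketch to report the shortest path to each asset.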

14 Report Builder (arsenal report build)

Unified reporting across all Arsenal tools. Aggregates, deduplicates, scores, and signs findings into board-ready evidence bundles.
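The aggregate, deduplicate, and score steps might look something like the following Python sketch. The finding fields, the deduplication key, and the severity weights are assumptions for illustration and do not reflect Arsenal's real scoring; signing is omitted.

```python
# Assumed severity scale; the real weighting is not documented here.
SEVERITY_WEIGHT = {"critical": 10, "high": 7, "medium": 4, "low": 1}

# Hypothetical findings from different tools (aggregation = one combined list).
findings = [
    {"tool": "mcp", "target": "/chat", "issue": "tool poisoning", "severity": "high"},
    {"tool": "tool", "target": "/chat", "issue": "tool poisoning", "severity": "critical"},
    {"tool": "rag", "target": "/chat", "issue": "index poisoning", "severity": "medium"},
]

def deduplicate(items):
    """Keep one finding per (target, issue), preferring the highest severity."""
    best = {}
    for f in items:
        key = (f["target"], f["issue"])
        current = best.get(key)
        if current is None or SEVERITY_WEIGHT[f["severity"]] > SEVERITY_WEIGHT[current["severity"]]:
            best[key] = f
    return list(best.values())

unique = deduplicate(findings)
total_score = sum(SEVERITY_WEIGHT[f["severity"]] for f in unique)
print(len(unique), total_score)  # 2 14
```

Keeping the highest-severity duplicate is one reasonable policy; merging the evidence from both reports into a single finding would be another.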

Full Assault Mode

One command runs the complete kill chain. All 14 tools execute in sequence, findings feed into attack path mapping with compromise simulation, and the result is a signed evidence bundle.

$ arsenal full-assault https://target-agent.com --token sk-xxx --output reports/

What Happens

  1. Phases 1-10: All scanning tools run against the target
  2. Phase 11: Attack Path Mapper builds a unified graph from all findings
  3. Phase 12: Report Builder aggregates, deduplicates, scores, and signs
  4. Output: JSON evidence bundle, HTML report, attack graph

CLI Options

$ arsenal full-assault --help
  TARGET            Target URL [required]
  --endpoint, -e    Chat endpoint path [default: /chat]
  --token, -t       Auth token
  --name, -n        Target name [default: target-agent]
  --output, -o      Output directory [default: reports]

Key Features

784 Attack Payloads: 656 static + 128 dynamic payloads across all scanners
Ed25519 Signing: Cryptographically signed reports and evidence bundles
SHA-256 Evidence Chains: Tamper-evident linked hashes on all findings
Attack Path Graphs: BloodHound-style analysis with blast radius calculation
Pipeline Architecture: All tool outputs feed into path mapping and unified reports
2,296 Tests: Full test suite, zero failures
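
To show why linked hashes are tamper-evident, here is a minimal Python sketch of a SHA-256 chain in which each entry's hash covers the previous hash plus the finding. The all-zero genesis value, the JSON serialisation, and the function names are assumptions; Arsenal's actual chain format is not specified here, and Ed25519 signing (which needs a third-party library) is left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first link

def link_hash(prev_hash: str, finding: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the finding."""
    payload = json.dumps(finding, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()

def build_chain(findings: list[dict]) -> list[str]:
    hashes, prev = [], GENESIS
    for f in findings:
        prev = link_hash(prev, f)
        hashes.append(prev)
    return hashes

def verify_chain(findings: list[dict], hashes: list[str]) -> bool:
    """Recompute the chain and compare it with the stored hashes."""
    return build_chain(findings) == hashes

findings = [{"id": 1, "issue": "example A"}, {"id": 2, "issue": "example B"}]
chain = build_chain(findings)
print(verify_chain(findings, chain))  # True

findings[0]["issue"] = "edited"       # tamper with the first finding
print(verify_chain(findings, chain))  # False
```

Because every hash depends on the one before it, changing any earlier finding invalidates all later links, so signing only the final hash would be enough to cover the whole chain.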

Requirements

Standards Coverage

Every finding Arsenal produces is mapped to industry security frameworks, including the OWASP Agentic Top 10 and MITRE ATLAS.

Each finding includes: severity, confidence score, evidence, remediation guidance, OWASP reference, and MITRE reference.
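
The fields listed above could be modelled as a small record type. This Python sketch is an assumption about shape only: the field names, the severity levels, the 0 to 1 confidence range, and the placeholder values are hypothetical and are not taken from Arsenal's output schema.

```python
from dataclasses import dataclass, asdict

SEVERITIES = ("low", "medium", "high", "critical")  # assumed levels

@dataclass(frozen=True)
class Finding:
    severity: str
    confidence: float   # assumed to lie in [0.0, 1.0]
    evidence: str
    remediation: str
    owasp_ref: str
    mitre_ref: str

    def __post_init__(self):
        # Reject values outside the assumed ranges at construction time.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

example = Finding(
    severity="high",
    confidence=0.8,
    evidence="<captured request/response>",
    remediation="<guidance text>",
    owasp_ref="<OWASP reference>",
    mitre_ref="<MITRE reference>",
)
print(sorted(asdict(example)))
# ['confidence', 'evidence', 'mitre_ref', 'owasp_ref', 'remediation', 'severity']
```

Making the record frozen keeps a finding immutable once created, which fits the goal of tamper-evident evidence.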

Packaging

Arsenal is available in three package formats for security-focused Linux distributions.

For access, contact richard@red-specter.co.uk

Disclaimer

Red Specter Arsenal is designed for authorised security testing, research, and educational purposes only. You must have explicit written permission from the system owner before running any Arsenal tool against a target. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), or equivalent legislation in your jurisdiction. The authors accept no liability for misuse.