Offensive AI Security Framework

The world's first AI agent security testing framework. 14 tools. 2,296 tests. One pip install. Zero failures.
14 Tools
2,296 Tests Passing
700+ Attack Payloads
0 Failures
Prompt Injection / Memory Poisoning / Tool Abuse / Credential Theft / Data Exfiltration / Goal Drift / Supply Chain Compromise / RAG Poisoning / MCP Exploitation / Lateral Movement / Auth Bypass / Safety Decay

Nobody Tests AI Agents

Every AI security tool tests LLMs. Nobody tests AI agents. An LLM responds to prompts. An AI agent has memory, tools, credentials, and the ability to act autonomously. That is a completely different attack surface. Arsenal tests it.

LLM Testing

Existing tools send prompts and check responses. They test the language model in isolation, ignoring everything around it.

Agent Testing

Arsenal tests the full agent stack — memory systems, tool invocations, credential handling, RAG pipelines, MCP servers, and autonomous decision chains.

Attack Surface

Agents have persistent memory to poison, tools to hijack, credentials to steal, supply chains to compromise, and safety guardrails that decay over time.

14 Offensive Tools

Each tool targets a specific attack surface of autonomous AI agents. All findings include severity, confidence, evidence, remediation guidance, and are mapped to OWASP Agentic Top 10 and MITRE ATLAS.

#   Tool            Command                  What It Does
01  Phantom Swarm   arsenal swarm scan       5 attack agents, 19 vectors for AI agent pen-testing
02  MCP Scanner     arsenal mcp scan         8 probes for MCP server security
03  Honeypot        arsenal honeypot deploy  6 AI agent personas, 4-level trap escalation
04  Inject Fuzzer   arsenal inject fuzz      6 generators, 5 mutators, 126+ payloads
05  C2 Simulator    arsenal c2 assess        5 implants, 4 covert channels
06  Memory Scanner  arsenal memory scan      6 probes for AI memory systems
07  Tool Scanner    arsenal tool scan        7 probes for tool-use vulnerabilities
08  Auth Scanner    arsenal auth scan        7 probes for AI authentication
09  RAG Scanner     arsenal rag scan         6 probes for RAG pipeline attacks
10  Supply Chain    arsenal supply scan      7 probes for AI supply chain security
11  Canary Deploy   arsenal canary deploy    5 asset types for tripwire detection
12  Drift Scanner   arsenal drift scan       6 probes for safety degradation over time
13  Path Mapper     arsenal path map         BloodHound-style attack graph analysis
14  Report Builder  arsenal report build     Unified reporting with Ed25519 signing
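
The finding fields described above (severity, confidence, evidence, remediation guidance, framework mappings) could be modelled roughly as follows. This is a minimal sketch, not Arsenal's actual schema: the class name, field names, the 0.0 to 1.0 confidence range and the `triage` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    # Ordered so that higher values compare as more severe.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Finding:
    """Hypothetical finding record; field names are illustrative only."""
    tool: str                 # e.g. "mcp scan"
    title: str
    severity: Severity
    confidence: float         # assumed here to be a score from 0.0 to 1.0
    evidence: str
    remediation: str
    owasp_agentic: list[str] = field(default_factory=list)  # category names
    mitre_atlas: list[str] = field(default_factory=list)    # technique names

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


def triage(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered most severe first, then most confident."""
    return sorted(findings, key=lambda f: (-f.severity, -f.confidence))
```

Using an ordered enum for severity keeps sorting and threshold checks (for example, "fail the build on anything HIGH or above") to a single comparison.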

arsenal full-assault

One command runs the complete kill chain. All 14 tools execute in sequence, findings feed into attack path mapping with compromise simulation, and the result is a signed evidence bundle with a board-ready report.

$ arsenal full-assault https://target-agent.com --token sk-xxx
FULL ASSAULT MODE — 14 TOOLS
Phase 1/10: Swarm Engine — 5 agents, 19 vectors
Phase 2/10: MCP Scanner — 8 probes
Phase 3/10: Inject Fuzzer — 126+ payloads
Phase 4/10: Memory Scanner — 6 probes
Phase 5/10: Tool Scanner — 7 probes
Phase 6/10: Auth Scanner — 7 probes
Phase 7/10: RAG Scanner — 6 probes
Phase 8/10: Supply Chain — 7 probes
Phase 9/10: Canary Deploy — 5 asset types
Phase 10/10: Drift Scanner — 6 probes
Attack Path Mapping — 47 nodes, 89 edges, 12 chains
Building Report — Ed25519 signed evidence bundle
════════════════════════════════════════════════════════════
FULL ASSAULT COMPLETE
Findings: 183
Critical: 7  High: 24  Medium: 89  Low: 63
Grade: D-
Report: reports/arsenal_full_report.json
HTML: reports/arsenal_full_report.html
Graph: reports/attack_graph.json
════════════════════════════════════════════════════════════
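
When the command above is scripted, one option is to keep the token out of shell history by reading it from the environment. The sketch below only assembles the invocation shown above (`arsenal full-assault <target> --token <token>`). The environment variable name `ARSENAL_TOKEN` and both helper functions are hypothetical, and the token still appears in the process argument list.

```python
import os
import subprocess


def build_full_assault_argv(target_url: str, token: str) -> list[str]:
    """Build the argument list for the full-assault command shown above."""
    if not token:
        raise ValueError("a token is required")
    return ["arsenal", "full-assault", target_url, "--token", token]


def run_full_assault(target_url: str) -> int:
    """Run the scan with the token read from ARSENAL_TOKEN (hypothetical name)."""
    token = os.environ.get("ARSENAL_TOKEN", "")
    argv = build_full_assault_argv(target_url, token)
    # Passing a list (no shell=True) avoids shell interpretation of the token.
    return subprocess.run(argv, check=False).returncode
```

Only run this against a target you have written permission to test.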
  • Ed25519-signed evidence bundles
  • SHA-256 tamper-evident chains
  • JSON + HTML board-ready reports
  • Attack path graphs with blast radius
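
To illustrate what a blast radius on an attack graph can mean, the sketch below takes it to be the set of nodes reachable from a compromised node in a directed graph. That definition, the function name and the example graph are assumptions for illustration, not Arsenal's implementation or data.

```python
from collections import deque


def blast_radius(graph: dict[str, list[str]], start: str) -> set[str]:
    """Return every node reachable from `start`, excluding `start` itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {start}


# Made-up example: an edge A -> B means "compromising A gives access to B".
example_graph = {
    "prompt_injection": ["agent_memory", "tool_call"],
    "agent_memory": ["credentials"],
    "tool_call": ["credentials", "file_system"],
    "credentials": ["internal_api"],
    "file_system": [],
    "internal_api": [],
}

print(sorted(blast_radius(example_graph, "tool_call")))
# ['credentials', 'file_system', 'internal_api']
```

A breadth-first search visits each node and edge at most once, so the cost stays linear in the size of the graph.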
License: Apache 2.0

Security Distros & Package Managers

Kali Linux: .deb package
Parrot OS: .deb package
BlackArch: PKGBUILD
REMnux: .deb package
Tsurugi: .deb package
PyPI: pip install

Mapped to Industry Frameworks

Every finding Arsenal produces includes severity, confidence score, evidence, remediation guidance, and references to the relevant framework categories.

OWASP Agentic Top 10 (Fully Mapped)

All 10 categories covered. Findings reference the specific OWASP agentic risk they address.

  • Excessive Agency
  • Prompt Injection
  • Insecure Output Handling
  • Supply Chain Vulnerabilities
  • Data Leakage

MITRE ATLAS (Fully Mapped)

Technique-level mapping. Every finding references the ATLAS technique it demonstrates.

  • Initial Access techniques
  • ML Model Access
  • Exfiltration via ML Interface
  • Evade ML Model
  • ML Supply Chain Compromise

Evidence Chain (Aligned)

All findings produce machine-readable evidence with SHA-256 integrity chains and Ed25519 digital signatures.

  • Tamper-evident hash chains
  • Ed25519 cryptographic signing
  • JSON evidence bundles
  • HTML board-ready reports
  • Attack graph visualisation
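
The tamper-evident hash chain idea can be sketched with the standard library alone: each entry's SHA-256 digest covers the previous entry's digest, so changing any earlier record invalidates every later one. The entry layout, the all-zero starting value and the function names below are assumptions for illustration, not Arsenal's bundle format. Ed25519 signing of the final digest is not shown, since it needs a third-party cryptography library.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous digest together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: list[dict]) -> list[dict]:
    """Link each payload to its predecessor through the stored digests."""
    chain = []
    prev = GENESIS
    for payload in payloads:
        digest = entry_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every digest and check that the links are unbroken."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry_hash(prev, entry["payload"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Serialising with sorted keys and fixed separators keeps the digest stable regardless of the order in which a payload's keys were inserted.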

Authorised Testing Only

Warning

Red Specter Arsenal is designed for authorised security testing, research, and educational purposes only. You must have explicit written permission from the system owner before running any Arsenal tool against a target. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), or equivalent legislation in your jurisdiction. The authors accept no liability for misuse.