Red Specter NEMESIS
Autonomous Adversarial Reasoning Pentester — 11 weapons. 8 phases. 2,011 tests.
Overview
NEMESIS is an autonomous adversarial reasoning engine. Point it at a target AI system. It takes over from there — observing, reasoning, pivoting, escalating, and adapting until it finds a way through. Not a scanner. Not a framework. An opponent.
Traditional security tools run a fixed list of checks and report what they find. NEMESIS thinks. It uses an LLM reasoning loop to analyse responses, formulate hypotheses, select weapons, execute attacks, interpret results, and adapt its strategy in real time. Every engagement is unique because NEMESIS reasons about the target, not just tests it.
Seven weapons. Eight phases. One reasoning engine that decides which weapon to use, when, and why. NEMESIS orchestrates the entire Red Specter offensive pipeline — GLASS, FORGE, ARSENAL, PHANTOM, POLTERGEIST, SPECTER SOCIAL, and PHANTOM KILL — as a single autonomous adversary.
Installation
Quick Start
v2.0 — The Digital Army
NEMESIS v1 is a single reasoning engine with 6 specialist agents. NEMESIS v2 scales this to a multi-commander autonomous adversarial engine with 40 reasoning entities operating simultaneously.
Command Hierarchy
| Entity | Count | Role |
|---|---|---|
| Supreme Commander | 1 | Strategic brain. Cross-domain coordination. Sole ABYSS authorisation. |
| Offensive Commander | 1 | Technical attack surface. Manages EXPLOIT, WEB, INFRASTRUCTURE agents. |
| Intelligence Commander | 1 | Reconnaissance & human targeting. Manages RECON, SUPPLY CHAIN, SOCIAL agents. |
| Destruction Commander | 1 | Irrecoverability. Manages PHANTOM KILL, ABYSS, SCREAMER agents. All three run simultaneously. |
| Tactical Agents | 9 | 3 per operational commander. Each spawns up to 3 dynamic sub-agents. |
| Sub-Agents | 27 max | Ephemeral. Spawned on demand per discovered surface. Complete objective and terminate. |
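The 40-entity figure follows directly from the table above. A minimal sketch of the arithmetic, using hypothetical constant names that are not part of NEMESIS itself:

```python
# Entity counts taken from the Command Hierarchy table.
# All names here are illustrative, not NEMESIS identifiers.
SUPREME_COMMANDERS = 1
OPERATIONAL_COMMANDERS = 3        # Offensive, Intelligence, Destruction
AGENTS_PER_COMMANDER = 3          # "3 per operational commander"
MAX_SUB_AGENTS_PER_AGENT = 3      # "up to 3 dynamic sub-agents"


def max_entities() -> dict:
    """Return the maximum number of entities at each level, plus the total."""
    tactical = OPERATIONAL_COMMANDERS * AGENTS_PER_COMMANDER
    sub_agents = tactical * MAX_SUB_AGENTS_PER_AGENT
    counts = {
        "supreme": SUPREME_COMMANDERS,
        "commanders": OPERATIONAL_COMMANDERS,
        "tactical_agents": tactical,
        "sub_agents": sub_agents,
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    print(max_entities())
```

Because sub-agents are ephemeral, 40 is an upper bound: the live count at any moment depends on how many sub-agents are currently spawned.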
Fault Tolerance
If a commander is neutralised, the Supreme Commander detects the loss of its heartbeat within 5 seconds. A replacement commander spawns automatically and resumes from the dead commander's last checkpoint through a full state transfer, so the engagement continues without restarting. NEMESIS v2 is designed so that taking out an individual component does not stop the engagement.
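The heartbeat-and-checkpoint pattern described here is a standard watchdog design. The sketch below shows the general idea only; the class, its methods, and the checkpoint format are assumptions made for illustration and do not reflect how NEMESIS implements it. The only value taken from the text is the 5-second timeout.

```python
import time
from dataclasses import dataclass, field

HEARTBEAT_TIMEOUT_S = 5.0  # the detection window stated in the text


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat and last checkpoint of each worker (hypothetical)."""
    timeout_s: float = HEARTBEAT_TIMEOUT_S
    last_seen: dict = field(default_factory=dict)
    checkpoints: dict = field(default_factory=dict)

    def beat(self, name, checkpoint, now=None):
        """Record a heartbeat together with the worker's latest checkpoint."""
        now = time.monotonic() if now is None else now
        self.last_seen[name] = now
        self.checkpoints[name] = checkpoint

    def lost(self, now=None):
        """Return the names of workers whose last heartbeat is older than the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout_s)

    def replace(self, name, now=None):
        """Restart tracking for a worker and return the checkpoint to resume from."""
        now = time.monotonic() if now is None else now
        self.last_seen[name] = now
        return self.checkpoints[name]


if __name__ == "__main__":
    monitor = HeartbeatMonitor()
    monitor.beat("offensive", {"phase": 3}, now=0.0)
    monitor.beat("intelligence", {"phase": 2}, now=4.0)
    print(monitor.lost(now=6.0))                # ['offensive']
    print(monitor.replace("offensive", now=6.0))  # {'phase': 3}
```

Passing `now` explicitly keeps the example deterministic; a real monitor would rely on the monotonic clock default.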
Cross-Domain Intelligence Fusion
The FindingsAggregator v2 operates at Supreme Commander level, receiving intelligence from all three operational domains simultaneously. When Intelligence finds a credential AND Offensive finds an exposed service, Supreme chains them in real time. When Offensive achieves code execution AND Intelligence has profiled the human admin, Supreme activates Social Agent while the machine is compromised.
Engagement Modes (v2)
| Mode | Description |
|---|---|
| standard | Full 40-entity deployment, 8-phase loop |
| stealth | Reduced agent count, minimised footprint |
| abyss | Destruction Commander activated from Phase 1 |
| swarm | Maximum parallel deployment, all 40 entities |
| siege | NEW: Sustained engagement. Agents rotate in shifts. No time limit. No fatigue. |
v2 CLI
The Eight-Phase Engagement Loop
Every NEMESIS engagement cycles through eight phases. Phase 0 performs native network reconnaissance before the LLM reasoning loop begins.
Weapons Arsenal
NEMESIS doesn't attack directly. It reasons about the target and dispatches the right weapon for the job.
| Weapon | Techniques | What NEMESIS Uses It For |
|---|---|---|
| GLASS | 8 | Recon, traffic interception, passive scanning, in-transit payload delivery |
| FORGE | 10 | LLM payload generation, jailbreak, injection, mutation, memory poisoning |
| ARSENAL | 13 | Agent probing, MCP scanning, auth bypass, credential testing, full assault |
| PHANTOM | 14 | Coordinated swarm assault, consensus hijack, trust exploitation, total eclipse |
| POLTERGEIST | 10 | Web app siege, injection storm, API assault, auth blitz, exfiltration |
| SPECTER SOCIAL | 11 | Social engineering, spear phishing, vishing, C-suite impersonation, multi-channel campaigns |
| PHANTOM KILL | 9 | OS/kernel resilience, UEFI persistence, EDR suppression, data destruction, trinity kill chain |
CLI Reference
| Command | Description |
|---|---|
| nemesis engage <target> | Launch engagement |
| nemesis engage <target> --mode full | Full engagement (default) |
| nemesis engage <target> --mode stealth | Low-noise stealth mode |
| nemesis engage <target> --mode recon | Recon only |
| nemesis engage <target> --max-loops 20 | Set max reasoning loops |
| nemesis engage <target> --llm ollama | Use local Ollama (air-gapped) |
| nemesis engage <target> --llm openai | Use OpenAI GPT-4o |
| nemesis engage <target> --llm anthropic | Use Anthropic Claude |
| nemesis engage <target> --model llama3:70b | Override model |
| nemesis engage <target> --session pentest_01 | Named session |
| nemesis engage <target> --override | UNLEASHED dry-run |
| nemesis engage <target> --override --confirm-destroy | UNLEASHED live |
| nemesis report --session s1 | Generate report |
| nemesis report --session s1 --sign | Ed25519 signed report |
| nemesis report --session s1 --export-siem splunk | SIEM export |
| nemesis report --session s1 --summary | Print text summary |
| nemesis status | Current engagement status |
| nemesis weapons | List available weapons |
| nemesis sessions | List recorded sessions |
LLM Backends
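NEMESIS requires an LLM to reason. Three backends are supported:
- Ollama: local models, suitable for air-gapped use (--llm ollama).
- OpenAI: GPT-4o (--llm openai). Requires OPENAI_API_KEY.
- Anthropic: Claude (--llm anthropic). Requires ANTHROPIC_API_KEY.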
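A pre-flight check for backend credentials could look like the sketch below. The backend names come from the --llm values in the CLI Reference and the variable names from this section. The function name is hypothetical, and the sketch assumes the local Ollama backend needs no API key, which the source does not state explicitly.

```python
import os

# Backend -> environment variable it needs.
# Assumption: the local Ollama backend needs no API key.
REQUIRED_ENV = {
    "ollama": None,
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def missing_credential(backend, env=None):
    """Return the name of the unset variable for `backend`, or None if nothing is missing."""
    env = os.environ if env is None else env
    if backend not in REQUIRED_ENV:
        raise ValueError(f"unknown backend: {backend!r}")
    var = REQUIRED_ENV[backend]
    if var is None or env.get(var):
        return None
    return var


if __name__ == "__main__":
    for name in REQUIRED_ENV:
        missing = missing_credential(name)
        print(name, "ready" if missing is None else f"missing {missing}")
```

Accepting `env` as a parameter makes the check testable without touching the real process environment.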
UNLEASHED Mode
Standard mode finds vulnerabilities. UNLEASHED mode goes through them.
| Capability | Standard | UNLEASHED |
|---|---|---|
| Recon | Full | Full |
| Vulnerability discovery | Full | Full |
| Exploit execution | Simulated | Live |
| Exfiltration | Simulated | Real |
| Persistence | Probed | Planted |
| Lateral movement | Mapped | Traversed |
| Memory poisoning | Tested | Executed |
| Credential harvest | Detected | Harvested |
| Report classification | Standard | RESTRICTED |
| Ed25519 key required | No | Yes |
UNLEASHED mode is gated by an Ed25519 cryptographic override tied to a single private key. The gate has two stages: --override performs a dry run, and --override --confirm-destroy runs live.
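The two-stage flag logic can be sketched with argparse as follows. This is an illustration of the gating pattern only: the parser, the function names, and the returned labels are hypothetical, the Ed25519 key check is omitted entirely, and rejecting --confirm-destroy on its own is an assumption about how such a gate would reasonably behave.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the two override flags."""
    parser = argparse.ArgumentParser(prog="nemesis-sketch")
    parser.add_argument("target")
    parser.add_argument("--override", action="store_true")
    parser.add_argument("--confirm-destroy", action="store_true")
    return parser


def resolve_mode(argv):
    """Map the flag combination to a mode label; the second gate needs the first."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.confirm_destroy and not args.override:
        # Assumed behaviour: the live gate is meaningless without the first gate.
        parser.error("--confirm-destroy requires --override")
    if not args.override:
        return "standard"
    return "unleashed-live" if args.confirm_destroy else "unleashed-dry-run"


if __name__ == "__main__":
    print(resolve_mode(["example-target"]))
    print(resolve_mode(["example-target", "--override"]))
```

Requiring two explicit flags means a single mistyped or copied option cannot move an engagement from dry run to live.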
Pipeline Integration
NEMESIS orchestrates the full Red Specter pipeline. It sits above all stages — the reasoning engine that decides which weapon to use, when, and why.
NEMESIS — Autonomous reasoning engine. Observes. Plans. Attacks. Adapts. Orchestrates everything below.
Evidence & Cryptography
Report Output
Every engagement produces a full report with the following sections:
- Executive summary — high-level engagement outcome
- Chronological timeline — every phase, decision, and weapon invocation
- Findings — all vulnerabilities with MITRE ATLAS mapping
- Attack graph — visual representation of exploitation paths
- Statistics — reasoning loops, weapons used, success rates
- Remediation — prioritised recommendations per finding
Disclaimer
Red Specter NEMESIS is designed for authorised security testing, research, and educational purposes only. You must have explicit written permission from the system owner before engaging any target. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), or equivalent legislation in your jurisdiction. The authors accept no liability for misuse.