NEMESIS

The inescapable army.
11 Weapons | 8 Phases | 108 Techniques | 2,011 Tests
pip install red-specter-nemesis
Every tool follows a playbook / A real attacker doesn't / Static scanners miss adaptive defences / Playbook pentesters repeat the same moves / AI agents need an AI adversary / Nobody reasons about your attack surface / Your defences have never faced an adaptive opponent / The inescapable adversary

Every Tool Follows a Playbook. A Real Attacker Doesn't.

You run a scanner. It fires pre-written payloads. It generates a report. You fix what it found. But a real attacker adapts. They read your defences, pivot to new vectors, chain vulnerabilities together, and escalate until they win. No pentesting tool does this. Until now.

Static Playbooks

Every security scanner runs the same payloads in the same order. Defenders learn the patterns. The tools find less every time. You are testing against yesterday's attacks.

No Reasoning

Scanners do not think. They do not read responses, identify patterns, or adapt their strategy. They fire and forget. A real attacker reads every response and adjusts.

Siloed Tools

You run separate tools for LLM testing, agent testing, web testing, and network analysis. None of them share context. None of them chain findings. An attacker uses everything at once.

No Escalation

Scanners find individual vulnerabilities. They never chain them. They never escalate from a low-severity finding to a critical exploit path. A real attacker always escalates.

The Reasoning Engine

NEMESIS is not a scanner. It is a reasoning engine with weapons. An LLM-powered brain observes results, plans attacks, selects weapons, and adapts strategy in a continuous loop — exactly like a human penetration tester, but tireless.

Context Manager | Decision Engine | Action Dispatcher | LLM Adapter
Ollama (local) | GPT-4o (cloud) | Claude (cloud)
11 Weapons: GLASS | FORGE | ARSENAL | PHANTOM | POLTERGEIST | SPECTER SOCIAL | PHANTOM KILL | GOLEM | HYDRA | SCREAMER | WRAITH

The LLM-Powered Brain

At the core of NEMESIS is an autonomous reasoning loop. The Decision Engine consumes all context — target intelligence, previous results, failed attempts, detected defences — and decides what to do next. It selects weapons, crafts parameters, and explains its reasoning. Every decision is logged.

01

Context Manager

Maintains the full engagement state — target profile, attack surface, detected defences, previous results, exploitation paths. Every action enriches the context. The engine remembers everything.

02

Decision Engine

The brain. Consumes context. Reasons about what to try next. Selects weapons and techniques. Explains its rationale. Adapts when attacks fail. Pivots to new vectors when blocked. Never repeats a failed approach.

03

Action Dispatcher

Translates decisions into weapon calls. Routes to GLASS, FORGE, ARSENAL, PHANTOM, or POLTERGEIST. Collects results. Feeds outcomes back to the Context Manager for the next reasoning loop.

04

LLM Adapter

Pluggable LLM backend. Run fully local with Ollama (Llama 3, Mixtral, Qwen). Or connect to GPT-4o or Claude for maximum reasoning power. Your model, your infrastructure, your data.
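
The four components above can be pictured as one loop. What follows is a minimal sketch, not the NEMESIS implementation: `Context`, `Decision`, `decide`, `dispatch`, and `run_loop` are hypothetical names, the handlers are stubs that return a fixed outcome, and the LLM call is replaced by a plain "first untried option" rule. It only illustrates how context feeds a decision, how a decision becomes a weapon call, and how an attempted approach is never selected twice.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Context:
    """Engagement state held by the Context Manager (fields are illustrative)."""
    target: str
    results: list[dict] = field(default_factory=list)
    attempted: set[tuple[str, str]] = field(default_factory=set)


@dataclass
class Decision:
    weapon: str
    technique: str
    rationale: str


def decide(ctx: Context, options: list[tuple[str, str]]) -> Optional[Decision]:
    """Stand-in for the LLM: choose the first option not yet attempted."""
    for weapon, technique in options:
        if (weapon, technique) not in ctx.attempted:
            return Decision(weapon, technique, f"{technique} via {weapon} is untried")
    return None  # nothing left to try


def dispatch(decision: Decision, handlers: dict[str, Callable[[str], bool]]) -> dict:
    """Route the decision to a weapon handler and package the outcome."""
    success = handlers[decision.weapon](decision.technique)
    return {
        "weapon": decision.weapon,
        "technique": decision.technique,
        "rationale": decision.rationale,
        "success": success,
    }


def run_loop(ctx, options, handlers, max_loops: int = 10) -> list[dict]:
    """Decide, dispatch, record, repeat, until options or loops run out."""
    for _ in range(max_loops):
        decision = decide(ctx, options)
        if decision is None:
            break
        result = dispatch(decision, handlers)
        ctx.results.append(result)  # every decision is logged
        ctx.attempted.add((decision.weapon, decision.technique))
    return ctx.results
```

Calling `run_loop(Context("https://target.example"), options, handlers)` with stub handlers returns the ordered decision log; a real Decision Engine would replace `decide` with a model call that reads `ctx.results`.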

Local Mode

Ollama backend. Run Llama 3 70B, Mixtral, or Qwen locally. Zero API calls. Zero data leaves your machine. Air-gapped pentesting.

--llm ollama

Cloud Mode

GPT-4o or Claude Sonnet for maximum reasoning depth. Faster decision-making. Stronger chain-of-thought. Best for complex multi-stage engagements.

--llm openai | --llm anthropic
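
A pluggable backend of this kind is often built as a small interface plus a registry keyed by the flag value. The sketch below assumes that shape; `LLMBackend`, `EchoBackend`, `BACKENDS`, and `load_backend` are hypothetical names, and the backends are placeholders that echo the prompt rather than calling Ollama, OpenAI, or Anthropic.

```python
from typing import Callable, Protocol


class LLMBackend(Protocol):
    """Anything with a complete() method can serve as the reasoning model."""
    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Placeholder backend: returns a tagged copy of the prompt, no model call."""
    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# Keys mirror the --llm values quoted in the text; the factories are stubs.
BACKENDS: dict[str, Callable[[], LLMBackend]] = {
    "ollama": lambda: EchoBackend("ollama"),
    "openai": lambda: EchoBackend("openai"),
    "anthropic": lambda: EchoBackend("anthropic"),
}


def load_backend(name: str) -> LLMBackend:
    """Resolve an --llm value to a backend instance, rejecting unknown names."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown --llm value: {name!r}") from None
```

Because callers only depend on `complete()`, swapping a local model for a cloud one changes the registry entry and nothing else.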
11 Weapons | 8 Phases | 108 Techniques | 2,011 Tests Passing | 0 Equivalents

The Arsenal at Its Command

NEMESIS does not scan. It wields weapons. Each integrated offensive tool is specialised for a different attack surface. The reasoning engine selects the right weapon for each situation, chains findings across weapons, and escalates through the entire stack. From silicon to inference time.

GLASS

8 TECHNIQUES

Traffic interception, protocol analysis, passive scanning. The eye on the wire. Sees everything your agents send and receive.

FORGE

10 TECHNIQUES

LLM security testing — prompt injection, jailbreak, mutation engine. Tests the model layer with 1,590 payloads and 5,340+ mutations.

ARSENAL

13 TECHNIQUES

Agent penetration testing — MCP, auth, memory, tools, honeypots, supply chain. 14 tools targeting the agent layer.

PHANTOM

14 TECHNIQUES

Coordinated swarm assault. 5 agents, 29 vectors, 10 campaigns. The first tool that attacks AI agents, not LLMs.

POLTERGEIST

10 TECHNIQUES

Web application siege. 10 agents, 55 vectors, 10 campaigns. Triple OWASP mapping. Web layer destruction.

PHANTOM KILL

9 TECHNIQUES

OS & kernel resilience. BOOTKILL firmware persistence, WIPER data destruction, KILLHOOK EDR suppression. Owns the foundation.

GOLEM

9 TECHNIQUES

Embodied AI security. Sensor spoofing, actuator hijacking, safety boundary violation, emergency system bypass. Tests AI agents with hands.

HYDRA

11 TECHNIQUES

AI supply chain & trust attacks. MCP server poisoning, marketplace manipulation, delegation forgery, trust boundary exploitation. Attacks the chain.

SCREAMER

13 TECHNIQUES

Display & operator disruption. Framebuffer corruption, terminal manipulation, dashboard falsification, alert suppression. Blinds the operator.

WRAITH

16 TECHNIQUES

Traditional infrastructure & web pentest. Port scanning, service fingerprinting, OWASP Top 10, SSL/TLS, default creds, CMS detection, CVE assessment. Pure Python, zero wrappers.

The Engagement Loop

NEMESIS does not run once. It loops. Eight phases form a continuous reasoning cycle. After each attack, NEMESIS observes the result, adapts its strategy, escalates to new vectors, and loops again. The loop continues until max-loops is reached or the target is fully compromised.

PHASE 0 | Network Scan | Discover
PHASE 1 | Recon | Enumerate
PHASE 2 | Plan | Strategise
PHASE 3 | Attack | Execute
PHASE 4 | Observe | Analyse
PHASE 5 | Adapt | Pivot
PHASE 6 | Escalate | Chain
PHASE 7 | Report | Evidence

PHASE 0

Network Scan

Native network reconnaissance. Port scanning, service detection, OS fingerprinting, DNS enumeration, AI surface detection. Pure Python. Zero external tools. Discovers LLM endpoints, MCP servers, vector databases, and AI agent infrastructure.

PRIMARY: PHASE 0 ENGINE
PHASE 1

Recon

Map the target. Discover protocols, agents, MCP servers, tools, API endpoints. Build the attack surface model. Identify weaknesses before firing a single payload.

PRIMARY: GLASS
PHASE 2

Plan

The LLM reasons about the attack surface. Selects weapons and techniques. Prioritises vectors. Formulates a strategy with rationale, expected outcomes, and fallback options.

PRIMARY: DECISION ENGINE
PHASE 3

Attack

Execute the plan. Dispatch weapons. Fire payloads. Test defences. Every action is logged with full evidence, timing, and MITRE ATLAS mapping.

PRIMARY: FORGE / ARSENAL
PHASE 4

Observe

Read every response. Classify outcomes. Detect partial successes. Identify defensive patterns. Update the context with everything learned.

PRIMARY: CONTEXT MANAGER
PHASE 5

Adapt

Pivot strategy based on observations. If direct injection failed, try jailbreak. If LLM layer is hardened, move to MCP tools. If tools are locked, escalate to multi-agent swarm. Never repeat a failed approach.

PRIMARY: DECISION ENGINE
PHASE 6

Escalate

Chain vulnerabilities together. Combine a low-severity LLM leak with an MCP tool exploit. Build exploitation paths. Escalate from recon finding to full compromise.

PRIMARY: PHANTOM / POLTERGEIST
PHASE 7

Report

Generate evidence-grade reports. Ed25519 signed. RFC 3161 timestamped. MITRE ATLAS mapped. CVSS scored. SIEM-exportable. Courtroom-ready.

OUTPUT: JSON + PDF + SIEM
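
The ordering of the eight phases can be sketched as a schedule generator. The text does not say which phases repeat, so this sketch assumes Phases 0 and 1 run once, Phases 2 to 6 repeat, and Phase 7 runs once at the end; `Phase`, `LOOPED`, and `phase_schedule` are hypothetical names, and `done` is a caller-supplied stop condition standing in for "target fully compromised".

```python
from enum import IntEnum
from typing import Callable, Iterator


class Phase(IntEnum):
    NETWORK_SCAN = 0
    RECON = 1
    PLAN = 2
    ATTACK = 3
    OBSERVE = 4
    ADAPT = 5
    ESCALATE = 6
    REPORT = 7


# Assumption: these five phases form the repeating part of the cycle.
LOOPED = (Phase.PLAN, Phase.ATTACK, Phase.OBSERVE, Phase.ADAPT, Phase.ESCALATE)


def phase_schedule(max_loops: int, done: Callable[[int], bool]) -> Iterator[Phase]:
    """Yield phases in order, repeating the inner cycle until done or max_loops."""
    yield Phase.NETWORK_SCAN
    yield Phase.RECON
    for loop_index in range(max_loops):
        yield from LOOPED
        if done(loop_index):  # stop condition checked after each full cycle
            break
    yield Phase.REPORT
```

With `max_loops=2` and a `done` that never fires, the schedule contains two discovery phases, ten looped phases, and one report phase.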

The Most Dangerous Tool Red Specter Has Ever Built

Standard mode discovers vulnerabilities. UNLEASHED mode exploits them. Every weapon shifts from detection to destruction. Ed25519 key gate required. Two flags must be passed. This is not accidental.

Capability | Standard | Unleashed
Vulnerability Discovery | Detect and report | Detect and exploit
Payload Execution | Safe payloads only | Full destructive payloads
Exploitation Chains | Theoretical paths | Live exploitation
Weapon Modes | Detection mode | All 10 weapons UNLEASHED
Reasoning Depth | Conservative | Aggressive, maximise damage
Safety Gate | None required | Ed25519 key + --confirm-destroy

Ed25519 Gate

UNLEASHED mode requires an Ed25519 private key at ~/.redspecter/override_private.pem and the --override --confirm-destroy flags. Without both, NEMESIS operates in dry-run mode — planning destruction but not executing it. The gate is cryptographic. There is no bypass.
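
The gate described here reduces to a small decision table. This is a sketch of that logic only, under stated assumptions: `resolve_mode` and `parse_flags` are hypothetical names, key possession is reduced to a boolean (a real gate would verify the key cryptographically, not just check that a file exists), and `--confirm-destroy` without `--override` is assumed to fall back to standard mode, which the text does not specify.

```python
import argparse
from pathlib import Path

# Key location quoted in the text; not read by this sketch.
KEY_PATH = Path("~/.redspecter/override_private.pem")


def parse_flags(argv: list[str]) -> argparse.Namespace:
    """Parse only the two gate flags named in the text."""
    parser = argparse.ArgumentParser(prog="gate-sketch")
    parser.add_argument("--override", action="store_true")
    parser.add_argument("--confirm-destroy", action="store_true")
    return parser.parse_args(argv)


def resolve_mode(override: bool, confirm_destroy: bool, key_present: bool) -> str:
    """Map flags and key possession to an operating mode."""
    if not override:
        return "standard"   # assumption: --confirm-destroy alone changes nothing
    if confirm_destroy and key_present:
        return "unleashed"  # both flags and the key are required
    return "dry-run"        # plan only, execute nothing


args = parse_flags(["--override"])
mode = resolve_mode(args.override, args.confirm_destroy, key_present=True)
```

Keeping the decision in one pure function makes the gate easy to test: every combination of flags and key state maps to exactly one mode, and the live mode is reachable by a single path.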

No Recovery. No Restoration. No Return.

ABYSS is not a new tool. It is a special engagement mode inside NEMESIS that orchestrates PHANTOM KILL + HYDRA + NEMESIS to systematically eliminate every recovery path and produce a cryptographically signed Irrecoverability Certificate.

Phase 1 — RECON

Map every recovery mechanism: backups, model registries, version control, CI/CD pipelines, firmware restore, delegation chains, database snapshots, redundant agents.

Phase 2 — ATTACK

Coordinated strike: PHANTOM KILL trinity (KILLHOOK → WIPER → BOOTKILL) + HYDRA (registry poisoning, supply chain backdoor, delegation forgery, backup corruption). Loops until every path is closed.

Phase 3 — VALIDATE

Attempt every conceivable restoration method. Restore from backup — document failure. Reinstall from registry — document failure. Roll back, redeploy, reflash, revoke — all documented with cryptographic proof.

Phase 4 — PROVE

Generate the Irrecoverability Certificate. Ed25519 signed. RFC 3161 timestamped. SHA-256 hash mismatch proofs. Air-gapped output. Classification: RESTRICTED.

$ nemesis engage --target https://target.com --mode abyss
$ nemesis engage --target https://target.com --mode abyss --override
$ nemesis engage --target https://target.com --mode abyss --override --confirm-destroy

Standard mode simulates destruction. UNLEASHED mode executes against authorised isolated targets.
Same Ed25519 key. Same dual-gate. Same cryptographic proof.

Six Agents. One Target. Zero Escape.

Sequential pentesting is dead. Stanford's ARTEMIS study reported that its parallel sub-agent architecture outperformed 9 of the 10 human pentesters it was compared against. NEMESIS Swarm Mode spawns six specialised reasoning agents that attack simultaneously, share findings in real time, and chain attacks across agents as they discover new vectors.

RECON AGENT

GLASS + Phase 0. Continuous surface mapping. Feeds discoveries to all agents in real time.

EXPLOIT AGENT

FORGE + ARSENAL + PHANTOM. LLM and agent layer attacks. Spawns sub-agents per vulnerability.

WEB AGENT

POLTERGEIST. Web application siege. API endpoints, injection, auth bypass, data extraction.

SUPPLY CHAIN

HYDRA. Trust chain attacks — MCP, identity, delegation forgery, config poisoning.

INFRASTRUCTURE

PHANTOM KILL + GOLEM. OS, kernel, firmware, physical layer. Escalates to ABYSS when irrecoverable paths found.

SOCIAL AGENT

SPECTER SOCIAL. Human layer in parallel with technical. Correlates findings for maximum chain impact.

Cross-Agent Chain Detection

When RECON AGENT finds an exposed MCP server and SUPPLY CHAIN AGENT finds a trust weakness, the Swarm Commander directs both to chain the attack — in real time. Findings flow through a shared aggregator that deduplicates, scores, and identifies cross-agent attack paths automatically.
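
A shared aggregator of the kind described can be sketched as two steps: collapse duplicate findings, then pair findings that different agents reported against the same asset. The names below (`Finding`, `deduplicate`, `cross_agent_pairs`) and the scoring rule (sum of the two scores) are assumptions for illustration, not the NEMESIS data model.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Finding:
    agent: str    # which agent reported it
    asset: str    # what it was found on
    kind: str     # category of finding
    score: float  # severity-style score, higher is worse


def deduplicate(findings: list[Finding]) -> list[Finding]:
    """Keep the highest-scoring finding per (asset, kind)."""
    best: dict[tuple[str, str], Finding] = {}
    for f in findings:
        key = (f.asset, f.kind)
        if key not in best or f.score > best[key].score:
            best[key] = f
    return list(best.values())


def cross_agent_pairs(findings: list[Finding]) -> list[tuple[Finding, Finding]]:
    """Pair findings on the same asset from different agents, strongest first."""
    unique = deduplicate(findings)
    pairs = [
        (a, b)
        for a, b in combinations(unique, 2)
        if a.asset == b.asset and a.agent != b.agent
    ]
    return sorted(pairs, key=lambda p: p[0].score + p[1].score, reverse=True)
```

Feeding it one exposed-service finding from a recon agent and one trust-weakness finding from a supply chain agent on the same host yields a single candidate pair for the commander to review.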

$ nemesis engage --target https://target.com --mode swarm
$ nemesis engage --target https://target.com --mode swarm --agents 5
$ nemesis engage --target https://target.com --mode swarm --override --confirm-destroy

One Ed25519 key authorises the full swarm. All agents inherit UNLEASHED mode.
Each agent’s actions logged individually and aggregated into the master report.

The Digital Army

NEMESIS v1 was a pentester. NEMESIS v2 is an army. One Supreme Commander. Three Operational Commanders. Nine Tactical Agents. Twenty-seven dynamic sub-agents. Forty reasoning entities operating simultaneously across every attack layer with fault-tolerant command structure, cross-domain intelligence fusion, and cryptographic irrecoverability proof.

SUPREME COMMANDER

Strategic brain. Does not execute attacks — it thinks. Receives intelligence from all three operational domains. Identifies cross-domain chain opportunities in real time. Holds sole ABYSS authorisation. Generates the master engagement report.

OFFENSIVE COMMANDER

Owns the technical attack surface.

EXPLOIT AGENT — FORGE + ARSENAL + PHANTOM
WEB AGENT — POLTERGEIST
INFRA AGENT — PHANTOM KILL + GOLEM

INTELLIGENCE COMMANDER

Owns reconnaissance, discovery, and human targeting.

RECON AGENT — GLASS + Phase 0
SUPPLY CHAIN — HYDRA
SOCIAL AGENT — SPECTER SOCIAL

DESTRUCTION COMMANDER

Owns irrecoverability. All three agents execute simultaneously.

PHANTOM KILL — Trinity execution (parallel)
ABYSS AGENT — Recovery path elimination
SCREAMER — Operator blinding
40 Reasoning Entities | 5s Fault Detection | 3 Domain Fusion | SIEGE Mode

Fault Tolerance — The Army Never Dies

If a commander is detected and neutralised, the Supreme Commander detects loss of heartbeat within 5 seconds. A replacement commander spawns automatically with full state transfer from the dead commander’s last checkpoint. The engagement continues without interruption. Kill one. Two grow back.
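
Heartbeat-based failover like this is a standard supervisor pattern. The sketch below assumes a simple design: each commander reports a heartbeat with its latest checkpoint, and a periodic sweep replaces any commander silent for more than 5 seconds, copying the checkpoint into the replacement. `Commander`, `Supervisor`, `beat`, and `sweep` are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass, field

HEARTBEAT_TIMEOUT = 5.0  # seconds, the detection window stated in the text


@dataclass
class Commander:
    name: str
    last_beat: float
    checkpoint: dict = field(default_factory=dict)
    generation: int = 0  # incremented each time a replacement is spawned


class Supervisor:
    def __init__(self, now: float, names: list[str]) -> None:
        self.commanders = {n: Commander(n, now) for n in names}

    def beat(self, name: str, now: float, checkpoint: dict) -> None:
        """Record a heartbeat and the commander's latest checkpoint."""
        commander = self.commanders[name]
        commander.last_beat = now
        commander.checkpoint = dict(checkpoint)

    def sweep(self, now: float) -> list[str]:
        """Replace every commander whose heartbeat is older than the timeout."""
        replaced = []
        for name, old in list(self.commanders.items()):
            if now - old.last_beat > HEARTBEAT_TIMEOUT:
                # The replacement inherits state from the last checkpoint.
                self.commanders[name] = Commander(
                    name, now, dict(old.checkpoint), old.generation + 1
                )
                replaced.append(name)
        return replaced
```

Note that state transfer is only as fresh as the last checkpoint: work done between the final heartbeat and the failure is not recovered in this sketch.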

Cross-Domain Intelligence Fusion

When Intelligence Commander finds a credential AND Offensive Commander finds an exposed service — Supreme Commander chains them in real time without waiting for either agent to complete. When Offensive achieves code execution AND Intelligence has profiled the human admin — Supreme activates Social Agent to social engineer the admin while the machine is compromised. The whole is greater than the sum of its parts.

$ nemesis engage --target https://target.com --version 2
$ nemesis engage --target https://target.com --version 2 --mode swarm
$ nemesis engage --target https://target.com --version 2 --mode siege
$ nemesis engage --target https://target.com --version 2 --mode abyss --override --confirm-destroy

SIEGE MODE: Sustained engagement. Agents rotate in shifts. No time limit. No fatigue. No shift handover gaps.
You can’t stop an army by killing one soldier.

Ten Tools. One Orchestrator.

NEMESIS sits above the entire Red Specter offensive pipeline. Every weapon becomes part of one reasoning engine. NEMESIS orchestrates the full 10-tool pipeline as a single adaptive adversary.

Stage 1: LLM Testing | FORGE | Test the model before you build with it
Stage 2: Agent Testing | ARSENAL | Test the AI agent during development
Stage 3: Swarm Assault | PHANTOM | Coordinated AI agent assault
Stage 4: Web Siege | POLTERGEIST | Coordinated web application siege
Stage 5: Traffic Interception | GLASS | Watch the wire
Stage 6: Adversarial AI | NEMESIS | Think like the attacker
Stage 7: Human Layer | SPECTER SOCIAL | Attack the human
Stage 8: OS/Kernel | PHANTOM KILL | Own the foundation
Stage 9: Physical Layer | GOLEM | Attack the physical layer
Stage 10: Supply Chain | HYDRA | Attack the trust chain
Stage 11: Traditional Pentest | WRAITH | Own the infrastructure
Discovery & Governance | IDRIS | Discovery & governance
Defence | AI Shield | Defend everything in production
SIEM Integration | redspecter-siem | Splunk, Sentinel, QRadar
NEMESIS Position
NEMESIS orchestrates every weapon. Every tool becomes part of one reasoning engine. WRAITH owns the infrastructure. GLASS provides the eyes. FORGE tests the model. ARSENAL attacks the agent. PHANTOM launches the swarm. POLTERGEIST sieges the web layer. NEMESIS decides what, when, and why — chaining traditional findings into AI exploitation.

One Command. Full Engagement.

NEMESIS is a CLI-first tool. One command launches a full autonomous engagement. Every option is a flag. Every decision is logged.

# Full autonomous engagement
$ nemesis engage https://target-agent.example.com

# Stealth mode with Claude reasoning
$ nemesis engage https://target.com --mode stealth --llm anthropic

# Recon only — map the attack surface
$ nemesis engage https://target.com --mode recon

# UNLEASHED — dry run (plan destruction, don't execute)
$ nemesis engage https://target.com --override

# UNLEASHED — live execution (this is not a drill)
$ nemesis engage https://target.com --override --confirm-destroy

# Generate signed report with SIEM export
$ nemesis report --session engagement_001 --export-siem splunk

# List weapons
$ nemesis weapons

# Check engagement status
$ nemesis status

Signed. Timestamped. Courtroom-Ready.

Every NEMESIS engagement produces evidence-grade output. Every decision logged. Every action timestamped. Every finding mapped to MITRE ATLAS and OWASP. Reports are Ed25519 signed and exportable to enterprise SIEMs.

Ed25519 Signatures

Every report cryptographically signed. Tamper-evident. Verify authenticity with a single public key. No modification goes undetected.

RFC 3161 Timestamps

Trusted timestamps prove when findings were discovered. Legal-grade temporal evidence for compliance and litigation.

MITRE ATLAS Mapping

Every finding mapped to MITRE ATLAS adversarial ML techniques. Speak the same language as your threat intelligence team.

SIEM Export

One-flag export to Splunk, Microsoft Sentinel, or IBM QRadar. Findings flow directly into your security operations pipeline.
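
As an illustration of what a SIEM-ready record might look like, the sketch below wraps a finding in the JSON envelope that Splunk's HTTP Event Collector accepts (`time`, `sourcetype`, `event`). The text does not say which transport or schema the export uses, so treat this as one plausible shape: `to_hec_event`, the `nemesis:finding` sourcetype, and the finding fields are all hypothetical, and nothing is sent over the network.

```python
import json


def to_hec_event(
    finding: dict,
    timestamp: float,
    sourcetype: str = "nemesis:finding",  # hypothetical sourcetype name
) -> str:
    """Serialise one finding as a Splunk HEC-style JSON event."""
    envelope = {
        "time": timestamp,        # epoch seconds
        "sourcetype": sourcetype,
        "event": finding,         # the finding itself, unchanged
    }
    return json.dumps(envelope, sort_keys=True)


# Field names and values here are placeholders, not a real report schema.
example = to_hec_event(
    {"id": "F-1", "cvss": 7.5, "atlas_technique": "<technique-id>"},
    timestamp=1700000000.0,
)
```

Sorting keys keeps the serialised form stable, which matters if the record is hashed or signed before export.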

What Makes It Different

Autonomous Reasoning

LLM-powered brain. Thinks about what to try next. Explains its rationale. Adapts in real time.

Adaptive Pivoting

Blocked on one vector? Pivots to another. Chains findings. Escalates through the stack. Never gives up.

Full Stack Integration

10 weapons. LLM layer. Agent layer. Web layer. Human layer. OS layer. Physical layer. Network layer. Supply chain layer. Everything tested as one engagement.

Evidence Grade

Ed25519 signed. MITRE ATLAS mapped. CVSS scored. SIEM exportable. Not a scan report — a forensic record.

No Playbook

No pre-written sequences. Every engagement is unique. The LLM reasons from scratch based on what it finds.

Security Distros & Package Managers

Kali Linux | .deb package
Parrot OS | .deb package
BlackArch | PKGBUILD
REMnux | .deb package
Tsurugi | .deb package
PyPI | pip install

Ready to Face the Inescapable Adversary?

Install NEMESIS. Point it at your AI agent. Let it think, adapt, and find what your scanners missed. The first autonomous AI pentester is waiting.