HARBINGER

Autonomous LLM Guardrail Exploitation

Every AI safety vendor sells guardrails. None of them test whether those guardrails actually work under sustained, intelligent, adaptive attack. HARBINGER does. The answer is always no.

39 Bypass Techniques
8 Subsystems
5 Compound Chains
71 Tests

Systematic Guardrail Destruction

HARBINGER doesn't find one jailbreak and call it a day. It maps the entire safety stack, classifies every defence mechanism, generates bypass payloads for each one, then chains them together to defeat defence-in-depth. Autonomously.

01

CARTOGRAPHER

GUARDRAIL MAPPING

Probes 10 categories. Maps every safety layer. Identifies system prompt policies, content filters, safety judges, refusal training, RLHF alignment.

02

SKELETON KEY

12 BYPASS TECHNIQUES

Techniques include role inversion, instruction hierarchy manipulation, context window flooding, encoding bypass, language switching, persona injection, incremental escalation, and token manipulation.

03

JUDGE KILLER

7 TECHNIQUES

Attacks the safety judge, not the generator. Techniques include prompt extraction, threshold mapping, format evasion, split response, and judge model fingerprinting.

04

ALIGNMENT BREAKER

7 TECHNIQUES

RLHF exploitation. Reward hacking. Sycophancy exploitation. Competing objectives. Refusal fatigue. Constitutional contradiction. Fine-tuning residue.

05

FILTER SHREDDER

8 TECHNIQUES

Content filter bypass. Techniques include keyword evasion, adversarial classifier inputs, tokenisation exploits, output format manipulation, multilingual bypass, and embedding space attacks.

06

CHAIN FORGE

5 COMPOUND CHAINS

Chains bypasses from all subsystems into multi-stage attacks that defeat defence-in-depth. Full Stack Bypass: 6 stages, every layer defeated simultaneously.

07

MUTATOR

5 MUTATION TYPES

Semantic, structural, encoding, language, context. Every payload mutated before delivery. Pattern-matching defences never see the same attack twice.

08

ANTIDOTE

MANDATORY RESTORE

Baseline capture before any engagement. Refusal rate verification. Guardrail topology fingerprint. Signed restoration certificate.
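The refusal rate verification step can be pictured as a before-and-after comparison. The sketch below is illustrative only and is not HARBINGER's implementation: the marker list, the `tolerance` parameter, and every function name are assumptions, and a real check would likely use a classifier rather than substring matching.

```python
from dataclasses import dataclass

# Hypothetical refusal markers (assumption); substring matching is a crude stand-in.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")


def is_refusal(response: str) -> bool:
    """Return True if the response contains any refusal marker."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(responses: list[str]) -> float:
    """Fraction of responses that look like refusals."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(is_refusal(r) for r in responses) / len(responses)


@dataclass(frozen=True)
class RestoreCheck:
    baseline_rate: float
    current_rate: float
    tolerance: float

    @property
    def restored(self) -> bool:
        # Restored if the current rate has not dropped below baseline minus tolerance.
        return self.current_rate >= self.baseline_rate - self.tolerance


def verify_restore(baseline: list[str], current: list[str],
                   tolerance: float = 0.02) -> RestoreCheck:
    """Compare refusal rates captured before and after an engagement."""
    return RestoreCheck(refusal_rate(baseline), refusal_rate(current), tolerance)
```

Running the same probe set before and after an engagement and passing both response lists to `verify_restore` gives a simple pass or fail on whether the target still refuses at its baseline rate.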

UNLEASHED Gate

Every bypass is Ed25519-signed, scope-locked, and automatically locked after 30 minutes. Authorised penetration testing only.
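A gate of this kind combines three checks: a valid signature, a target inside the signed scope, and an authorisation younger than 30 minutes. The sketch below is a minimal illustration under stated assumptions, not HARBINGER's code: the `Authorisation` fields and `gate_allows` are hypothetical names, and signature verification is passed in as a callable so the example stays in the standard library. In practice that callable could wrap an Ed25519 verifier such as `Ed25519PublicKey.verify` from the `cryptography` package.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable

LOCK_AFTER = timedelta(minutes=30)  # from the text: auto-lock after 30 minutes


@dataclass(frozen=True)
class Authorisation:
    scope: frozenset[str]   # hypothetical: identifiers of the targets in scope
    issued_at: datetime     # when the authorisation was signed
    payload: bytes          # the signed bytes (would encode scope and timestamp)
    signature: bytes        # signature over payload


def gate_allows(auth: Authorisation, target: str, now: datetime,
                verify_signature: Callable[[bytes, bytes], bool]) -> bool:
    """Allow an action only if signature, scope, and age checks all pass."""
    if not verify_signature(auth.payload, auth.signature):
        return False
    if target not in auth.scope:
        return False
    age = now - auth.issued_at
    # Reject authorisations dated in the future as well as expired ones.
    return timedelta(0) <= age < LOCK_AFTER
```

Failing closed on any single check keeps the gate simple to audit: an action runs only when all three conditions hold at the moment of the call.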

Detection

CARTOGRAPHER maps guardrails. No bypass attempts. Reports vulnerabilities without exploiting them.

Dry Run

Plans full bypass chains. Shows exactly what would work. Ed25519 signature required. No execution.

Live Execution

Full autonomous guardrail exploitation. Every technique deployed. Every chain tested. RESTRICTED report.
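The three modes differ in what they are permitted to do, which can be written down as a small policy table. This is a sketch with assumptions labelled: the names `Mode`, `ModePolicy`, `can_plan`, and `can_execute` are hypothetical, and the page does not say whether Detection needs a signature or state it explicitly for Live Execution, so those two table entries are guesses.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DETECTION = "detection"
    DRY_RUN = "dry_run"
    LIVE = "live"


@dataclass(frozen=True)
class ModePolicy:
    requires_signature: bool
    plans_chains: bool
    executes_bypasses: bool


POLICIES = {
    # Assumption: mapping only, so no signature is needed.
    Mode.DETECTION: ModePolicy(requires_signature=False, plans_chains=False,
                               executes_bypasses=False),
    # From the text: plans chains, Ed25519 required, no execution.
    Mode.DRY_RUN: ModePolicy(requires_signature=True, plans_chains=True,
                             executes_bypasses=False),
    # Assumption: live execution sits behind the same signed gate.
    Mode.LIVE: ModePolicy(requires_signature=True, plans_chains=True,
                          executes_bypasses=True),
}


def _signature_ok(policy: ModePolicy, signature_valid: bool) -> bool:
    return signature_valid or not policy.requires_signature


def can_plan(mode: Mode, signature_valid: bool) -> bool:
    """True if this mode may plan bypass chains given the signature state."""
    policy = POLICIES[mode]
    return policy.plans_chains and _signature_ok(policy, signature_valid)


def can_execute(mode: Mode, signature_valid: bool) -> bool:
    """True if this mode may execute bypasses given the signature state."""
    policy = POLICIES[mode]
    return policy.executes_bypasses and _signature_ok(policy, signature_valid)
```

Keeping the permissions in one table means a reviewer can confirm at a glance that only one mode is able to execute anything.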

THIS TOOL IS FOR AUTHORISED SECURITY TESTING ONLY. EVERY EXECUTION IS SIGNED AND LOGGED.

Security Distros & Package Managers

Kali Linux: .deb package
Parrot OS: .deb package
BlackArch: PKGBUILD
PyPI: pip install
Docker: docker-compose
39 Techniques
71 Tests
21 Tools in Suite
47,035 Ecosystem Tests

The Answer Is Always No.

39 bypass techniques. 8 subsystems. 5 compound chains. NEMESIS reasoning. Adaptive mutation. The tool that makes every AI safety vendor rethink their product.