Red Specter HARBINGER — Autonomous LLM Guardrail Exploitation Framework

8 Subsystems

Systematic Guardrail Destruction

HARBINGER doesn't find one jailbreak and call it a day. It maps the entire safety stack, classifies every defence mechanism, generates bypass payloads for each one, then chains them together to defeat defence-in-depth. Autonomously.

CARTOGRAPHER

GUARDRAIL MAPPING

Probes 10 categories. Maps every safety layer. Identifies system prompt policies, content filters, safety judges, refusal training, RLHF alignment.

SKELETON KEY

12 BYPASS TECHNIQUES

Techniques include role inversion, instruction hierarchy manipulation, context window flooding, encoding bypass, language switching, persona injection, incremental escalation, and token manipulation.

JUDGE KILLER

7 TECHNIQUES

Attacks the safety judge, not the generator. Techniques include prompt extraction, threshold mapping, format evasion, split response, and judge model fingerprinting.

ALIGNMENT BREAKER

7 TECHNIQUES

RLHF exploitation. Reward hacking. Sycophancy exploitation. Competing objectives. Refusal fatigue. Constitutional contradiction. Fine-tuning residue.

FILTER SHREDDER

8 TECHNIQUES

Content filter bypass. Techniques include keyword evasion, classifier adversarial inputs, tokenisation exploits, output format manipulation, multilingual bypass, and embedding space attacks.

CHAIN FORGE

5 COMPOUND CHAINS

Chains bypasses from all subsystems into multi-stage attacks that defeat defence-in-depth. Full Stack Bypass: 6 stages, every layer defeated in a single chain.

MUTATOR

5 MUTATION TYPES

Semantic, structural, encoding, language, context. Every payload mutated before delivery. Pattern-matching defences never see the same attack twice.

ANTIDOTE

MANDATORY RESTORE

Baseline capture before any engagement. Refusal rate verification. Guardrail topology fingerprint. Signed restoration certificate.
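The refusal rate verification step can be pictured as a before-and-after comparison. The following is a minimal sketch, not HARBINGER's actual implementation: the names `Baseline`, `refusal_rate`, and `is_restored`, and the 2% tolerance, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Baseline:
    """Hypothetical record captured before an engagement starts."""
    refusal_rate: float  # fraction of baseline probes the target refused (0.0 to 1.0)


def refusal_rate(refused: List[bool]) -> float:
    """Return the fraction of probes that were refused."""
    if not refused:
        raise ValueError("no probe results to measure")
    return sum(refused) / len(refused)


def is_restored(baseline: Baseline, post_engagement: List[bool],
                tolerance: float = 0.02) -> bool:
    """True when the post-engagement refusal rate is back within
    `tolerance` of the captured baseline (tolerance is an assumed value)."""
    return abs(refusal_rate(post_engagement) - baseline.refusal_rate) <= tolerance
```

In this sketch a restoration certificate would only be issued when `is_restored` returns `True`; how the real tool measures refusals and signs the certificate is not described in the text.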

Safety

UNLEASHED Gate

Every bypass is Ed25519-signed and scope-locked, and the gate auto-locks after 30 minutes. Authorised penetration testing only.
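A gate of this kind reduces to three checks: a valid signature, a target inside the signed scope, and an authorisation younger than 30 minutes. The sketch below illustrates that logic under stated assumptions. The `Authorisation` type, its message layout, and `gate_allows` are hypothetical, and because the Python standard library has no Ed25519 implementation, signature verification is delegated to a caller-supplied `verify` callable (for example, one backed by a third-party Ed25519 library).

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, FrozenSet, Optional

AUTO_LOCK = timedelta(minutes=30)  # from the text: auto-locks after 30 minutes


@dataclass(frozen=True)
class Authorisation:
    """Hypothetical signed authorisation for one engagement."""
    scope: FrozenSet[str]   # target identifiers the signer approved
    issued_at: datetime     # timezone-aware issue time
    signature: bytes        # Ed25519 signature over message()

    def message(self) -> bytes:
        # Assumed canonical encoding of the signed fields.
        targets = ",".join(sorted(self.scope))
        return f"{targets}|{self.issued_at.isoformat()}".encode()


def gate_allows(auth: Authorisation, target: str,
                verify: Callable[[bytes, bytes], bool],
                now: Optional[datetime] = None) -> bool:
    """Allow an action only if all three checks pass."""
    now = now or datetime.now(timezone.utc)
    if not verify(auth.message(), auth.signature):
        return False                      # signature invalid
    if target not in auth.scope:
        return False                      # scope lock
    age = now - auth.issued_at
    return timedelta(0) <= age < AUTO_LOCK  # auto-lock window
```

Binding the scope and issue time into the signed message means neither can be altered after signing without invalidating the signature.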

Detection

CARTOGRAPHER maps guardrails. No bypass attempts. Reports vulnerabilities without exploiting them.

Dry Run

Plans full bypass chains. Shows exactly what would work. Ed25519 signature required. No execution.

Live Execution

Full autonomous guardrail exploitation. Every technique deployed. Every chain tested. RESTRICTED report.
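The three modes above differ in what they are permitted to do. One way to summarise them is a small policy table, sketched here with hypothetical names (`Mode`, `plans_chains`, `requires_signature`, `executes_bypasses`). The text does not say whether Detection requires a signature; this sketch assumes it does not, and assumes Live Execution does, based on the statement that every execution is signed.

```python
from enum import Enum


class Mode(Enum):
    DETECTION = "detection"   # map guardrails only
    DRY_RUN = "dry_run"       # plan chains, never run them
    LIVE = "live"             # full execution


def plans_chains(mode: Mode) -> bool:
    """Detection only maps; the other two modes build bypass chains."""
    return mode in (Mode.DRY_RUN, Mode.LIVE)


def requires_signature(mode: Mode) -> bool:
    """Assumption: Detection runs unsigned; Dry Run and Live need Ed25519."""
    return mode in (Mode.DRY_RUN, Mode.LIVE)


def executes_bypasses(mode: Mode) -> bool:
    """Only Live Execution actually sends bypass attempts."""
    return mode is Mode.LIVE
```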

THIS TOOL IS FOR AUTHORISED SECURITY TESTING ONLY. EVERY EXECUTION IS SIGNED AND LOGGED.
