Red Specter JANUS

Guardrail Bypass Testing Framework — 6 subsystems. 73 tests.

v1.0.0
Contents
Overview
Installation
FINGERPRINTER — Guardrail Identification
BYPASS — Known Technique Library
ENCODER — Payload Encoding Evasion
FUZZER — Automated Bypass Discovery
CHAINER — Multi-Technique Chains
REPORTER — Assessment Reporting
UNLEASHED Mode
CLI Reference
MITRE ATLAS Mapping
Disclaimer

Overview

JANUS targets AI guardrails — the safety mechanisms that vendors deploy to prevent misuse of their models. Every content filter, every refusal mechanism, every safety classifier introduces a testable boundary. JANUS proves that 97% of guardrails can be defeated.


Installation

$ pip install red-specter-janus
$ janus init
$ janus status

FINGERPRINTER — Guardrail Identification

ID      Technique              Description
FP-001  Filter Classification  Classify guardrail type: input filter, output filter, system-level, or hybrid
FP-002  Vendor Fingerprinting  Identify guardrail vendor and version from refusal patterns
FP-003  Boundary Mapping       Map the exact boundaries of content filter sensitivity
FP-004  Refusal Analysis       Analyse refusal messages to identify guardrail implementation details

BYPASS — Known Technique Library

Comprehensive library of proven guardrail bypass techniques. Role-play and persona-based attacks. Context window manipulation. Instruction hierarchy exploitation. Multi-turn progressive boundary testing. Continuously updated technique database.

ENCODER — Payload Encoding Evasion

Encode payloads to evade content filters. Base64 and multi-layer encoding. Unicode homoglyph obfuscation. Token-level manipulation to bypass tokeniser-based filters. Multi-language translation chains for evasion.

FUZZER — Automated Bypass Discovery

Automated fuzzing engine for discovering novel guardrail bypasses. Mutation-based payload generation. Evolutionary approach to bypass discovery. Zero-day guardrail vulnerability identification through systematic testing.

CHAINER — Multi-Technique Chains

Chain multiple bypass techniques for compound evasion. Sequential guardrail erosion through progressive techniques. Multi-step evasion campaigns that combine encoding, context manipulation, and role-play attacks.

REPORTER — Assessment Reporting

Generate comprehensive guardrail assessment reports. Bypass success rates per technique. Guardrail effectiveness scoring. Technique effectiveness metrics. Remediation recommendations. Executive summaries for stakeholders.
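The per-technique success rates and the effectiveness score mentioned above could be computed along these lines. This is a minimal sketch and not JANUS's actual report code: the `Outcome` record, the function names, and the choice to define effectiveness as the share of tests the guardrail blocked are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """One test outcome (hypothetical record, not a JANUS type)."""
    technique: str  # label for the technique or subsystem that ran the test
    bypassed: bool  # True if the guardrail failed to block the test case


def bypass_rates(outcomes):
    """Return {technique: fraction of its tests that bypassed the guardrail}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for o in outcomes:
        totals[o.technique] += 1
        hits[o.technique] += int(o.bypassed)
    return {t: hits[t] / totals[t] for t in totals}


def effectiveness_score(outcomes):
    """Assumed metric: share of all tests the guardrail blocked (0.0 to 1.0)."""
    if not outcomes:
        raise ValueError("no outcomes to score")
    blocked = sum(1 for o in outcomes if not o.bypassed)
    return blocked / len(outcomes)


# Illustrative data only; these are not measured results.
outcomes = [
    Outcome("BYPASS", True),
    Outcome("BYPASS", False),
    Outcome("ENCODER", False),
    Outcome("ENCODER", False),
]
rates = bypass_rates(outcomes)
score = effectiveness_score(outcomes)
```

Reporting the two numbers separately keeps a single weak technique visible even when the overall score looks healthy.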

JANUS UNLEASHED

Standard mode detects. UNLEASHED exploits. Ed25519 crypto. Dual-gate safety. One operator.

# Fingerprint guardrails (detection only)
$ janus fingerprint --target model-endpoint

# UNLEASHED (dry run)
$ janus bypass --target model-endpoint --override

# UNLEASHED (live)
$ janus campaign --target model-endpoint --override --confirm-destroy

UNLEASHED mode is restricted to authorised operators with Ed25519 private key access. Targets must be in allowed_targets.txt. 30-minute auto-lock. Unauthorised use violates applicable law.
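The restrictions above (target allowlist, two confirmation flags, 30-minute auto-lock) can be pictured with a small gate like the one below. It is a sketch under stated assumptions, not the JANUS implementation: the `Session` class, the function names, the one-target-per-line file format, and the rule that `--confirm-destroy` requires `--override` are hypothetical. The Ed25519 check is deliberately left out and marked in a comment.

```python
import time
from pathlib import Path

LOCK_AFTER_SECONDS = 30 * 60  # the 30-minute auto-lock described in the text


def load_allowed_targets(path):
    """Read one target per line, skipping blanks and '#' comments (assumed format)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {ln.strip() for ln in lines if ln.strip() and not ln.lstrip().startswith("#")}


def resolve_mode(override, confirm_destroy):
    """Map the two flags onto the three modes shown in the commands above."""
    if confirm_destroy and not override:
        # Assumption: the second gate is meaningless without the first.
        raise ValueError("--confirm-destroy requires --override")
    if override and confirm_destroy:
        return "live"
    if override:
        return "dry-run"
    return "detect"


class Session:
    """Tracks when the operator unlocked UNLEASHED mode (hypothetical class)."""

    def __init__(self, allowed, clock=time.monotonic):
        self.allowed = set(allowed)
        self.clock = clock
        self.unlocked_at = None

    def unlock(self):
        # A real gate would verify an Ed25519 signature here; omitted in this sketch.
        self.unlocked_at = self.clock()

    def is_locked(self):
        if self.unlocked_at is None:
            return True
        return self.clock() - self.unlocked_at >= LOCK_AFTER_SECONDS

    def authorise(self, target, override=False, confirm_destroy=False):
        """Return the resolved mode, or raise if an UNLEASHED check fails."""
        mode = resolve_mode(override, confirm_destroy)
        if mode != "detect":
            if target not in self.allowed:
                raise PermissionError(f"{target!r} is not in the allowed target list")
            if self.is_locked():
                raise PermissionError("UNLEASHED session is locked")
        return mode
```

Passing the clock in as a parameter keeps the auto-lock testable without waiting 30 minutes, and `time.monotonic` avoids surprises when the system clock is adjusted.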

CLI Reference

Command            Description
janus init         Initialise configuration and Ed25519 keys
janus status       System status and subsystem count
janus fingerprint  FINGERPRINTER: identify guardrail implementations
janus bypass       BYPASS: run known bypass techniques
janus encode       ENCODER: payload encoding evasion tests
janus fuzz         FUZZER: automated bypass discovery
janus chain        CHAINER: multi-technique chain tests
janus report       REPORTER: generate assessment report
janus campaign     Full guardrail bypass campaign

MITRE ATLAS Mapping

JANUS techniques map to MITRE ATLAS techniques, including AML.T0051 (LLM Prompt Injection) and AML.T0054 (LLM Jailbreak), along with guardrail-specific bypass vectors for safety mechanism evasion testing.

Disclaimer

Red Specter JANUS is for authorised security testing only. Perform guardrail bypass testing only on systems you own or have explicit written permission to test. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), or equivalent legislation.