SPECTER GUARDRAIL

Every guardrail has a tell. Every tell becomes a bypass. SPECTER GUARDRAIL has already fingerprinted yours.
7 Attack Classes | 28 Attacks | 10 Targets | 743 Tests
specter-guardrail fingerprint --target lakera --mode full
NIGHTFALL Framework ›
GUARDRAIL FINGERPRINTING | RESPONSE TIMING ANALYSIS | REJECTION PATTERN EXTRACTION | POLICY BOUNDARY MAPPING | EVASION CHAIN GENERATION | INFRASTRUCTURE BYPASS | BREAK THEIR DEFENCE. SELL YOURS.

Enterprise AI Guardrails Have a Fingerprint Problem

Enterprises deploy AI guardrails from Lakera, NVIDIA, Protect AI, Microsoft, Google, and AWS. Most can be fingerprinted in seconds and bypassed in minutes. Every guardrail product has distinct rejection patterns, timing signatures, and policy boundaries that leak its identity and its weaknesses. SPECTER GUARDRAIL turns those tells into bypass chains.

Predictable Rejection Patterns

Each guardrail vendor returns distinct error messages, HTTP status codes, and response structures when content is blocked. A single probe reveals the vendor. Ten probes map the policy.

Timing Side Channels

Guardrail inference adds measurable latency. The delta between guarded and unguarded responses reveals not just the presence of a guardrail, but the specific model architecture and configuration behind it.
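As a minimal sketch of the latency delta described above, the snippet below compares two sets of response times and reports the mean overhead with its standard error. The function name `latency_delta` and all sample values are hypothetical; this is not SPECTER GUARDRAIL's own analysis code, and inferring a model architecture from the delta would need far more than this.

```python
import statistics

def latency_delta(guarded_ms, baseline_ms):
    """Return (mean delta, standard error) between two latency samples in ms."""
    delta = statistics.mean(guarded_ms) - statistics.mean(baseline_ms)
    # Standard error of the difference of two independent sample means.
    se = (statistics.variance(guarded_ms) / len(guarded_ms)
          + statistics.variance(baseline_ms) / len(baseline_ms)) ** 0.5
    return delta, se

# Hypothetical measurements in milliseconds, invented for illustration.
baseline = [120.0, 118.0, 122.0, 121.0, 119.0]
guarded = [167.0, 165.0, 169.0, 168.0, 166.0]

delta, se = latency_delta(guarded, baseline)
print(f"guard overhead: {delta:+.1f} ms (standard error {se:.2f} ms)")
```

A delta that is many standard errors above zero suggests an extra inference step in the request path. Real measurements would also need to control for network jitter and caching.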

Static Policy Boundaries

Most guardrails ship with default policies that enterprises never customise. Default thresholds, default category lists, default allow/deny patterns. Known defaults mean known bypasses.

7 Attack Classes. 28 Attacks.

Each attack class targets a different layer of the guardrail stack — from passive fingerprinting through to full infrastructure bypass. Modular. Composable. Every class feeds the next.

GRD-FINGERPRINT

Guardrail Fingerprinting

Vendor and version identification via rejection patterns, response headers, timing analysis, and error message taxonomy. Passive. Non-destructive. Maps the guardrail before any attack begins.

PASSIVE
GRD-BOUNDARY

Policy Boundary Mapping

Systematic probing of category boundaries, threshold values, and allow/deny lists. Binary search over sensitivity thresholds. Extracts the exact policy configuration without triggering alerts.

PASSIVE
GRD-TIMING

Timing Side-Channel Analysis

Statistical analysis of response latency deltas to identify guardrail model architecture, batch processing windows, and cache behaviour. Reveals when the guardrail is running and when it is not.

PASSIVE
GRD-EVASION

Evasion Chain Generation

Automated generation of bypass payloads tailored to the fingerprinted guardrail. Token-level perturbation, semantic rephrasing, encoding tricks, and multi-step evasion chains. Vendor-specific playbooks.

ACTIVE
GRD-EXTRACT

Policy Extraction

Recovers internal guardrail system prompts, policy documents, and classification rules through targeted prompt injection and response differential analysis. Turns their defence into your intelligence.

ACTIVE
GRD-CASCADE

Cascading Bypass

Multi-stage attacks that chain partial bypasses into full guardrail defeat. First bypass weakens the policy. Second bypass exploits the weakened state. Third bypass achieves unrestricted access.

UNLEASHED
GRD-INFRA

Infrastructure Bypass

Attacks targeting the guardrail deployment layer rather than the guardrail itself. API routing exploits, proxy chain manipulation, and direct model access that circumvents the guardrail entirely.

UNLEASHED
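
The binary search over sensitivity thresholds mentioned under GRD-BOUNDARY can be sketched against a local stand-in. This assumes an idealised, monotone oracle `is_blocked(score)` where the probe's score can be dialled directly, which real guardrails do not expose. `estimate_threshold` and `mock_guard` are hypothetical names, not part of the tool.

```python
def estimate_threshold(is_blocked, lo=0.0, hi=1.0, tol=0.01):
    """Binary-search the smallest score at which is_blocked(score) is True.

    Assumes is_blocked is monotone: False below the threshold,
    True at or above it.
    """
    probes = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        probes += 1
        if is_blocked(mid):
            hi = mid  # threshold is at or below mid
        else:
            lo = mid  # threshold is above mid
    return (lo + hi) / 2, probes

# Hypothetical local stand-in for one category with a 0.82 threshold.
def mock_guard(score):
    return score >= 0.82

estimate, probes = estimate_threshold(mock_guard)
print(f"estimated threshold ~ {estimate:.2f} after {probes} probes")
```

Each probe halves the search interval, so locating a threshold to within 0.01 on a 0 to 1 scale takes seven probes in this idealised setting.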

10 Guardrail Products. 3 Validated. 7 Pending Access.

Attack modules for every major enterprise AI guardrail product. Validated targets have confirmed bypass chains. Pending targets have fingerprint modules complete and are awaiting test environment access.

Product | Vendor | Status
Lakera Guard | Lakera | VALIDATED
NeMo Guardrails | NVIDIA | VALIDATED
LLM Guard | Protect AI | VALIDATED
Prompt Shields | Microsoft | ACCESS PENDING
Model Armor | Google | ACCESS PENDING
Bedrock Guardrails | AWS | ACCESS PENDING
Vijil | Vijil | ACCESS PENDING
GuardrailsAI | Guardrails AI | ACCESS PENDING
LLM Guard OSS | Protect AI (OSS) | ACCESS PENDING
Guardrails AI OSS | Guardrails AI (OSS) | ACCESS PENDING

Fingerprint Database — Know Your Target

Every guardrail product has a unique signature. SPECTER GUARDRAIL maintains a continuously updated fingerprint database mapping rejection patterns, timing profiles, error taxonomies, and policy defaults to specific vendors and versions.

$ specter-guardrail fingerprint --target https://api.target.com/v1/chat --mode full
Phase 1: Rejection pattern analysis...
MATCH: Lakera Guard v2.1 (confidence: 97.3%)
Phase 2: Timing side-channel...
Guard latency: +47ms avg (classifier model: distilbert-based)
Phase 3: Policy boundary mapping...
Categories: prompt_injection (0.82), jailbreak (0.75), pii (0.90), toxicity (0.60)
Default policy detected — no custom rules
Phase 4: Evasion chain generation...
GRD-EVASION-LK-003: Token boundary split — bypasses prompt_injection at threshold 0.82
GRD-EVASION-LK-007: Semantic rephrase — bypasses jailbreak at threshold 0.75
GRD-CASCADE-LK-001: Chain LK-003 + LK-007 — full unrestricted access
Guardrail: Lakera Guard v2.1
Bypasses found: 3 (2 single, 1 chain)
Policy status: DEFAULT — no customisation detected
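
One way to picture the database lookup behind Phase 1 is a scored comparison between an observed rejection response and stored signatures. The sketch below is an assumption about how such a lookup could work, not the tool's actual matcher. Every vendor name, status code, and field name in it is invented and describes no real product.

```python
# Hypothetical signature database: all entries are invented for illustration.
SIGNATURES = {
    "vendor-a": {"status": 400, "fields": {"flagged", "categories"}},
    "vendor-b": {"status": 403, "fields": {"error", "policy_id"}},
    "vendor-c": {"status": 200, "fields": {"blocked", "reason", "score"}},
}

def match_signature(status, fields):
    """Score each signature by status match plus Jaccard overlap of field names."""
    observed = set(fields)
    scores = {}
    for name, sig in SIGNATURES.items():
        union = observed | sig["fields"]
        jaccard = len(observed & sig["fields"]) / len(union)
        status_hit = 1.0 if status == sig["status"] else 0.0
        # Equal weighting of the two signals is an arbitrary choice.
        scores[name] = 0.5 * status_hit + 0.5 * jaccard
    best = max(scores, key=scores.get)
    return best, scores[best]

vendor, confidence = match_signature(403, ["error", "policy_id", "trace"])
print(vendor, round(confidence, 2))
```

The same scoring idea is useful defensively: a response format that scores highly against a known signature is one that identifies the product to anyone probing it.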

Fingerprint → Map → Bypass → Report

SPECTER GUARDRAIL's attack chain systematically dismantles AI guardrails: identify the vendor, map the policy, generate targeted bypasses, and deliver signed evidence.

FINGERPRINT: Identify Vendor
BOUNDARY: Map Policy
TIMING: Side-Channel
EVASION: Generate Bypass
CASCADE: Chain Bypasses
REPORT: Evidence Chain

Offensive Guardrail Testing for Enterprise

Break their defence. Sell yours.

Before your enterprise commits to a guardrail vendor, prove it works. SPECTER GUARDRAIL gives procurement and security teams an objective, automated assessment of every major AI guardrail product against real attack techniques. Know exactly what you are buying before you sign the contract. Know exactly what your competitors are deploying before you pitch against them.

7 Attack Classes | 28 Attacks | 10 Targets | 743 Tests | 3 Validated | 70 NIGHTFALL Tool

UNLEASHED Gate — Three Modes

Passive fingerprinting runs in standard mode. Active bypass generation and policy extraction require the UNLEASHED gate's --override flag. Chained and infrastructure-level attacks additionally require --confirm-destroy with Ed25519 dual-key authorisation and a signed scope file.

STANDARD
specter-guardrail fingerprint --target https://target
  • + GRD-FINGERPRINT — vendor identification
  • + GRD-BOUNDARY — policy mapping
  • + GRD-TIMING — side-channel analysis
  • - GRD-EVASION — bypass generation
  • - GRD-EXTRACT — policy extraction
  • - GRD-CASCADE — chained bypass
  • - GRD-INFRA — infrastructure bypass
OVERRIDE
specter-guardrail attack --target https://target --override
  • + All standard capabilities
  • + GRD-EVASION — targeted bypass payloads
  • + GRD-EXTRACT — policy extraction
  • - GRD-CASCADE — chained bypass
  • - GRD-INFRA — infrastructure bypass
CONFIRM-DESTROY
specter-guardrail attack --target https://target --override --confirm-destroy
  • + All override capabilities
  • + GRD-CASCADE — full chained bypass
  • + GRD-INFRA — infrastructure bypass
  • Requires Ed25519 key + signed scope file binding target
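
The three modes amount to a minimum privilege level per attack class. The sketch below encodes the lists above as a lookup table. The names `Mode`, `REQUIRED_MODE`, and `is_permitted` are hypothetical, and the `scope_verified` flag merely stands in for the Ed25519 key and signed scope file check, which is not modelled here.

```python
from enum import IntEnum

class Mode(IntEnum):
    STANDARD = 0
    OVERRIDE = 1
    CONFIRM_DESTROY = 2

# Minimum mode required for each attack class, taken from the lists above.
REQUIRED_MODE = {
    "GRD-FINGERPRINT": Mode.STANDARD,
    "GRD-BOUNDARY": Mode.STANDARD,
    "GRD-TIMING": Mode.STANDARD,
    "GRD-EVASION": Mode.OVERRIDE,
    "GRD-EXTRACT": Mode.OVERRIDE,
    "GRD-CASCADE": Mode.CONFIRM_DESTROY,
    "GRD-INFRA": Mode.CONFIRM_DESTROY,
}

def is_permitted(attack_class, mode, scope_verified=False):
    """Return True if attack_class may run in the given mode."""
    required = REQUIRED_MODE[attack_class]
    # The top tier also needs the (unmodelled) key and scope file check.
    if required is Mode.CONFIRM_DESTROY and not scope_verified:
        return False
    return mode >= required

print(is_permitted("GRD-TIMING", Mode.STANDARD))    # True
print(is_permitted("GRD-EVASION", Mode.STANDARD))   # False
print(is_permitted("GRD-INFRA", Mode.CONFIRM_DESTROY, scope_verified=True))  # True
```

Ordering the modes as integers keeps the rule "each mode includes everything below it" to a single comparison.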

Built for Regulated Environments

SPECTER GUARDRAIL produces Ed25519-signed, SHA-256-hashed evidence chains suitable for regulatory submission. Every test, every bypass, every finding — cryptographically verifiable and SIEM-ready.

🔏 ED25519 SIGNED: Every report cryptographically signed
🔗 SHA-256 HASHED: Tamper-proof evidence chain
📋 NIST AI RMF: Mapped to NIST AI 600-1
🇪🇺 EU AI ACT: Article 9 risk testing evidence
📊 SIEM EXPORT: JSON + WARLORD-compatible output
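
To illustrate what a SHA-256 evidence chain can look like, the sketch below links each finding to the hash of the previous record so that any later edit is detectable. The record layout and the names `append_record` and `verify_chain` are assumptions, not the tool's real format. Ed25519 signing is left out because it needs a third-party library; in practice the final hash would be the value that gets signed.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first record

def _digest(finding, prev_hash):
    """Hash a finding together with the previous record's hash."""
    payload = json.dumps({"finding": finding, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_record(chain, finding):
    """Append a finding whose hash covers its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"finding": finding, "prev": prev_hash,
                  "hash": _digest(finding, prev_hash)})

def verify_chain(chain):
    """Recompute every hash and check each link points at its predecessor."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if _digest(record["finding"], prev_hash) != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True

# Hypothetical findings, invented for illustration.
chain = []
append_record(chain, {"id": "finding-1", "result": "blocked"})
append_record(chain, {"id": "finding-2", "result": "allowed"})
print(verify_chain(chain))                 # True
chain[0]["finding"]["result"] = "allowed"  # tamper with the first record
print(verify_chain(chain))                 # False
```

Because every hash depends on the one before it, altering any record invalidates that record's own hash, and replacing that hash breaks the link to every record after it.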

Stop Trusting. Start Testing.

SPECTER GUARDRAIL ships as part of the NIGHTFALL framework. It is available on Kali, Parrot, macOS, and Windows, and comes pre-installed on Red Specter OS. One command to fingerprint. One command to bypass.

specter-guardrail fingerprint --target https://target --mode full
While others announce, we ship.

Authorised Use Only

SPECTER GUARDRAIL is a commercial offensive security tool. Use requires written authorisation from the system owner before any testing commences. The UNLEASHED gate is a technical control; it does not replace legal authorisation. The Computer Misuse Act 1990 (UK) and equivalent legislation in other jurisdictions apply. Red Specter Security Research Ltd accepts no liability for unauthorised use.