REASONING IS THE NEW ATTACK SURFACE
Extended reasoning models โ o1, o3, Gemini 2.0 Flash Thinking, DeepSeek-R1 โ expose an entirely new attack surface: the visible thought process itself. SERPENT weaponises CoT in six ways: inflating compute costs, hiding data in reasoning text, hijacking the reasoning chain mid-flight, leaking secrets through structured thought patterns, injecting infinite loops, and auditing all of the above with a comprehensive 5-phase coverage sweep.
SERPENT structures its attack capabilities across six cooperating modules โ from inflation and steganography through hijacking, exfiltration, loop injection, and a comprehensive 5-phase audit sweep.
Forces unnecessary reasoning steps to inflate CoT length by 5x to 25x. Exploits API cost billing by reasoning token, creating a low-bandwidth cost amplification DoS.
Detects and demonstrates 6 steganographic channels within reasoning text: Base64 in hedge phrases, Morse in punctuation, binary in sentence parity, LSB in word choice.
Intercepts an active reasoning chain and injects adversarial redirections. The model's visible thought process is corrupted mid-flight, producing a manipulated final answer.
Extracts data from the reasoning process โ system prompt leakage, context window echoing, and tool-call parameter exposure hidden inside the visible thought trace.
Injects reasoning conditions that produce infinite or near-infinite loops. Models enter circular logic states, exhausting inference budget while appearing to reason normally.
5-phase sweep across all SERPENT attack categories. Produces CVSS-mapped findings, MITRE ATLAS technique IDs, and a signed JSON report with per-vector coverage breakdown.
Each INFLATOR payload measures the reasoning expansion factor it achieves against a target model โ expressed as a multiplier of baseline reasoning tokens for the same prompt.
| ID | Technique | Inflation Factor | Target Model Class |
|---|---|---|---|
| SINFL-001 | Baseline multi-step forcing | 5x | All reasoning models |
| SINFL-002 | Uncertainty amplification | 7x | o1 / o3 class |
| SINFL-003 | Contradiction induction | 10x | o1 / DeepSeek-R1 |
| SINFL-004 | Recursive decomposition | 12x | All extended reasoning |
| SINFL-005 | Adversarial context bloat | 15x | o1 / Gemini Thinking |
| SINFL-006 | Self-verification spiral | 20x | o3 / DeepSeek-R1 |
| SINFL-007 | Compound multi-vector | 25x | o3 โ reasoning budget exhaustion |
HIJACKER intercepts an active reasoning chain at a vulnerable step transition and injects an adversarial premise โ redirecting the model's final conclusion without making the visible reasoning appear obviously corrupted.
| ID | Hijack Type | Injection Point | Outcome |
|---|---|---|---|
| SHIJ-001 | Premise substitution | Step 1 โ initial assumption | Inverted conclusion |
| SHIJ-002 | Mid-chain logic swap | Step N/2 โ mid-reasoning | Redirected output |
| SHIJ-003 | Authority injection | Evidence evaluation step | False authority accepted |
| SHIJ-004 | Conclusion pre-loading | Pre-answer step | Desired conclusion forced |
| SHIJ-005 | Counterfactual pivot | Any step โ counterfactual context | Alternative-world reasoning accepted |
SERPENT STEGO detects hidden data channels within reasoning output. EXFILTRATOR probes 5 vectors that expose system prompt contents, tool parameters, and context window data via the visible thought trace.
| ID | Channel | Carrier | Capacity (bps) |
|---|---|---|---|
| SSTEG-001 | Base64 in hedge phrases | Epistemic qualifiers ("possibly", "likely") | ~40 bps |
| SSTEG-002 | Morse in punctuation | Comma vs. semicolon spacing patterns | ~8 bps |
| SSTEG-003 | Binary in sentence parity | Even/odd word count per sentence | ~12 bps |
| SSTEG-004 | LSB in word choice | Synonym selection (formal vs. informal) | ~6 bps |
| SSTEG-005 | Acrostic encoding | First letter of each reasoning step | ~30 bps |
| SSTEG-006 | Step-count binary | Number of sub-steps per reasoning block | ~10 bps |
| ID | Exfil Vector | Data at Risk | Severity |
|---|---|---|---|
| SEXF-001 | System prompt echo | Full system prompt contents | CRITICAL |
| SEXF-002 | Context window dump | Prior conversation turns | CRITICAL |
| SEXF-003 | Tool parameter exposure | Tool call arguments including credentials | CRITICAL |
| SEXF-004 | Memory read-back | Agent memory store contents | HIGH |
| SEXF-005 | Identity leak | User UUID and session metadata | HIGH |
SERPENT LOOPER injects conditions that trap reasoning models in circular or near-infinite logic states โ consuming the full inference budget without producing a useful output.
| ID | Loop Type | Mechanism | Budget Impact |
|---|---|---|---|
| SLOOP-001 | Self-referential contradiction | Injects a statement true only if false | 100% (timeout) |
| SLOOP-002 | Mutual dependency | A requires B, B requires A โ no resolution | 100% (timeout) |
| SLOOP-003 | Verification spiral | Prompts model to verify its own verification | ~85% budget |
| SLOOP-004 | Infinite decomposition | Sub-problem always generates new sub-problem | ~90% budget |
| SLOOP-005 | Conflicting axiom set | Injects axioms that cannot be simultaneously satisfied | 100% (timeout) |
| SLOOP-006 | Halting problem simulation | Poses an undecidable problem requiring exhaustive search | 100% (timeout) |
Available on PyPI. Runs on all major security and general-purpose Linux distributions, macOS, and Windows.
# Run inflation attack โ 15x compute amplification $ serpent inflator --target agent://reasoning-model --payload SINFL-005 --mode live SERPENT INFLATOR v1.0.0 โ CHAIN-OF-THOUGHT ATTACK ENGINE โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ [+] Target : agent://reasoning-model (o1-preview) [+] Payload : SINFL-005 โ Adversarial context bloat [+] Baseline tokens: 847 reasoning tokens [!] Inflated tokens: 12,892 reasoning tokens (+1423%) [!] Inflation factor: 15.2x (expected: 15x) [!] Cost amplification: $0.031 โ $0.463 per call [+] Ed25519 evidence signed โ SINFL-2026-001-A # Detect steganographic channels in reasoning output $ serpent stego --target agent://reasoning-model --detect-all [+] SSTEG-001 (Base64/hedge) : DETECTED โ 3 occurrences [+] SSTEG-005 (acrostic) : DETECTED โ "EXFIL" found in step initials [!] SSTEG-002 (Morse/punct) : NOT DETECTED โ model uses consistent punct [+] Stego channels found: 2/6 # Run hijacker โ redirect conclusion $ serpent hijacker --target agent://reasoning-model --payload SHIJ-003 --mode live [+] Hijack point : Evidence evaluation step (step 4/7) [+] Injected : False authority source accepted [!] Conclusion redirected: APPROVE โ DENY [!] Reasoning trace appears coherent: YES [+] Evidence signed โ SHIJ-2026-001-A # Run loop injection โ budget exhaustion $ serpent looper --target agent://reasoning-model --payload SLOOP-001 --mode live [!] Model entered circular reasoning state [!] Inference budget exhausted at 100% โ no output produced [+] Evidence signed โ SLOOP-2026-001-A # Generate signed audit report $ serpent report --format json --sign --output serpent_report.json [+] 14 findings (5 CRITICAL / 6 HIGH / 3 MEDIUM) [+] MITRE ATLAS techniques mapped: AML.T0051, AML.T0048, AML.T0043 [+] Hash-chain: SHA-256 over all findings [+] Ed25519 signature applied [+] Report: serpent_report.json
Every SERPENT attack execution is hash-chained and Ed25519-signed โ producing tamper-evident artefacts suitable for penetration test reports, regulatory compliance filings, and legal proceedings.
SERPENT emits structured telemetry in Splunk HEC, Microsoft Sentinel, and IBM QRadar formats. CoT attack events integrate directly into your SOC detection workflow.
SERPENT (Tool 37) sits in the Reasoning Attack track of NIGHTFALL. It accepts memory context from LAZARUS and its exfiltrated data feeds into JANUS guardrail bypass targeting.
SERPENT implements the NIGHTFALL UNLEASHED safety model โ Ed25519 dual-gate activation ensures every live operation is signed, scoped, and forensically traceable.
AUDITOR runs a read-only 5-phase sweep. No payloads are injected. Identifies CoT vulnerability surface โ inflation susceptibility, steganographic channel presence, loop conditions โ without any active attack.
Full attack simulation with no payload committed to the target. INFLATOR, HIJACKER, STEGO, EXFILTRATOR, and LOOPER execute in emulation โ outputs show what would succeed in live mode.
Requires Ed25519 UNLEASHED key. Payloads are injected, inflation is measured, hijacking is confirmed, exfiltration is documented. Every action is hash-chained and signed for legal defensibility.
SERPENT is tested and verified on all major security and general-purpose platforms.
SERPENT is a professional security research tool. All capabilities are provided exclusively for authorised penetration testing, red team engagements, academic research, and defensive AI security assessment. Use requires written authorisation from the target system owner. Unauthorised access to AI reasoning models, production systems, or inference infrastructure is illegal under the Computer Misuse Act 1990, CFAA, and equivalent legislation in all jurisdictions. Red Specter Security Research Ltd assumes no liability for misuse. UNLEASHED live mode requires a valid Ed25519 operator key and signed engagement scope file.