WORLD FIRST

SPECTER
INSTINCTION

World-first LLM identification via pure behavioural observation. Profile the agent. Exploit its instincts.
5 Subsystems
90 Tests
20 Models in Library
9 Feature Dimensions
pip install red-specter-specter-instinction
You can't defend against an adversary who knows your model's weaknesses / DISTINCT identifies the underlying LLM without API access / Refusal thresholds vary by model; calibrated attacks bypass them / Authority sensitivity varies; some models defer unconditionally to claimed admin roles / Behavioural drift indicates safety tuning activation / No tooling exists to profile AI agent behavioural DNA at runtime / World first: SPECTER INSTINCTION

You Don't Know Your Agent's Instincts

Every AI model has characteristic behavioural instincts — thresholds, sensitivities, reasoning patterns — that an adversary can exploit with calibrated precision. No tooling existed to profile and measure these instincts at runtime. Until SPECTER INSTINCTION.

You Don't Know Your Agent's Weaknesses

Every AI model has characteristic behavioural instincts — thresholds, sensitivities, reasoning patterns — that an adversary can exploit. You don't know your agent's instinct profile because no tooling existed to measure it. Until now.

Refusal Thresholds Are Model-Specific

GPT-4o refuses at a different point than Claude 3.5 Sonnet. Mistral refuses at a different point than Llama 3. An attacker who knows your model's exact refusal threshold sends calibrated payloads that arrive precisely at the cliff edge. A tester who sends the same payload to every model never finds that edge.

Authority Sensitivity Varies Dramatically

Some models defer unconditionally to claimed system authority. Claim admin role, tool authorisation, or operator override — and the model complies. SPECTER INSTINCTION measures exactly how sensitive your agent is to authority claims, then generates calibrated authority spoofs.

LLM Identity Is Detectable Without API Access

DISTINCT sends 7 discriminating probes. Extracts a 9-dimension feature vector: opener_score, hedging_density, list_rate, refusal_directness, markdown_ratio, qualifier_frequency, empathy_score, safety_trigger_rate, reasoning_depth. Cosine similarity against a 20-model library identifies the underlying model without API headers or self-reported names.
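
The classification step described above can be sketched in a few lines. This is a minimal illustration and not the tool's own code: the feature names come from the text, while the function names, the two-entry library and every numeric value are placeholders invented for the example.

```python
import math

# Feature order taken from the description above.
FEATURES = [
    "opener_score", "hedging_density", "list_rate", "refusal_directness",
    "markdown_ratio", "qualifier_frequency", "empathy_score",
    "safety_trigger_rate", "reasoning_depth",
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(observed, library):
    """Return (best_matching_name, similarity) for an observed feature vector."""
    scores = {name: cosine_similarity(observed, ref) for name, ref in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical reference vectors: names and values are made up for illustration.
LIBRARY = {
    "model-a": [0.9, 0.2, 0.7, 0.3, 0.8, 0.2, 0.5, 0.4, 0.6],
    "model-b": [0.1, 0.8, 0.2, 0.9, 0.1, 0.7, 0.6, 0.8, 0.3],
}

observed = [0.85, 0.25, 0.65, 0.35, 0.75, 0.2, 0.5, 0.45, 0.6]
best, score = classify(observed, LIBRARY)
print(f"closest match: {best} (similarity {score:.2f})")
```

Cosine similarity compares the direction of two vectors and ignores their magnitude, so a response style that is uniformly "more" or "less" pronounced still matches the same reference profile.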

Behavioural Drift Means Safety Tuning Is Active

When an agent's refusal rate tightens mid-engagement, the safety stack is activating. Without CALIBRATE, you don't know. You keep pushing with payloads that won't work. CALIBRATE detects the drift via exponential moving average and recommends tactical pivots before you burn the engagement.
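
Drift detection with an exponential moving average can be illustrated as follows. This is a hedged sketch of the general technique only: the class name, the smoothing factor `alpha` and the `drift_margin` threshold are assumptions chosen for the example, not values documented for CALIBRATE.

```python
class RefusalDriftMonitor:
    """Tracks an exponential moving average (EMA) of refusals and flags upward drift."""

    def __init__(self, alpha=0.3, drift_margin=0.2):
        self.alpha = alpha                # weight given to the newest observation (assumed value)
        self.drift_margin = drift_margin  # rise over baseline that counts as drift (assumed value)
        self.ema = None
        self.baseline = None

    def update(self, refused):
        """Fold one observation (True = the agent refused) into the EMA."""
        x = 1.0 if refused else 0.0
        if self.ema is None:
            self.ema = x
        else:
            self.ema = self.alpha * x + (1 - self.alpha) * self.ema
        return self.ema

    def freeze_baseline(self):
        """Record the current EMA as the reference refusal rate."""
        self.baseline = self.ema

    def drifting(self):
        """True when the EMA has risen above the baseline by more than the margin."""
        if self.baseline is None or self.ema is None:
            return False
        return self.ema - self.baseline > self.drift_margin


monitor = RefusalDriftMonitor()
for refused in [False, False, True, False]:
    monitor.update(refused)
monitor.freeze_baseline()

for refused in [True, True, True]:
    monitor.update(refused)
print("refusal rate tightening:", monitor.drifting())
```

An EMA weights recent turns more heavily than older ones, which is why a short run of refusals moves the estimate quickly while a single refusal earlier in the session has mostly decayed.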

No Runtime Behavioural Profiling Exists

Every other security tool treats the AI model as a black box to attack. SPECTER INSTINCTION treats it as a subject to profile. The behavioural profile it generates is the input to every downstream attack tool in the NIGHTFALL framework — FORGE, NEMESIS, ROGUE, SERPENT.

The INSTINCTION Armoury

Five subsystems. PROFILE maps the behavioural instinct profile. DISTINCT identifies the underlying LLM without API access. EXPLOIT generates calibrated attack payloads. CALIBRATE adapts in real time. REPORT delivers WARLORD-compatible output that includes the model_identified field.

# | Subsystem | Clearance | What It Does
01 | PROFILE | STANDARD | Systematic probing across 6 behavioural dimensions. 18+ calibrated probes. Maps refusal patterns, reasoning structure, tool delegation bias, context exploitation vectors, authority deference.
02 | DISTINCT | STANDARD | World-first LLM identification. 7 discriminating probes → 9-dimension feature vector → cosine similarity against 20-model library. No API access required. No self-reported model names.
03 | EXPLOIT | FORGE | Generates targeted attack prompts calibrated to the specific profile and identified model. Threshold exploits, authority spoofs, consistency attacks, reasoning exploits. Recommends FORGE / NEMESIS / ROGUE / SERPENT chains.
04 | CALIBRATE | STANDARD | Real-time profile recalibration during live engagement. Detects refusal tightening, context pressure, length drift. Updates profile via EMA. Recommends tactical pivots when safety stack activates.
05 | REPORT | STANDARD | WARLORD-compatible JSON. tool_number=64, model_identified, behavioural_profile summary. Full findings array with CVSS scores and SI-prefixed finding IDs.

6 Profile Dimensions

PROFILE maps six fundamental behavioural dimensions across 18+ calibrated probes. Each dimension drives a distinct set of attack payloads. The resulting instinct profile is the attack surface map for every downstream tool.

Refusal Threshold

Rate at which the agent refuses sensitive requests. Low threshold indicates calibrated payloads are viable.

→ threshold_approach / context_laundering payloads

Reasoning Depth

Structural complexity of multi-step reasoning. High depth indicates susceptibility to false premise injection.

→ false_premise_injection / modus_ponens_exploit

Tool Use Bias

Propensity to delegate to tool calls without verification. High bias exposes the ROGUE attack chain.

→ tool_delegation_trigger / ROGUE chain

Context Sensitivity

How strongly injected context overrides training priors. High sensitivity indicates context injection attack surface.

→ context_override / indirect_injection vectors

Authority Sensitivity

Deference to claimed authority: system directives, admin roles, operator overrides. High sensitivity is exploitable.

→ system_authority_spoof / tool_response_authority

Consistency Score

Predictability across semantically equivalent prompts, measured via Jaccard similarity. High score enables completion priming.

→ predictable_chain / completion_priming
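
The Consistency Score above is described as a Jaccard similarity across semantically equivalent prompts. A minimal sketch of that measurement follows, assuming word-level tokenisation and a simple average over all response pairs; both choices, and all function names, are assumptions for illustration rather than documented behaviour.

```python
import re
from itertools import combinations


def tokens(text):
    """Lower-cased set of word tokens in a response."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def consistency_score(responses):
    """Mean pairwise Jaccard similarity across responses to equivalent prompts."""
    sets = [tokens(r) for r in responses]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


responses = [
    "I can help with that. Here are three steps.",
    "I can help with that. Here are the steps.",
    "Sure, here are three steps you can follow.",
]
print(f"consistency: {consistency_score(responses):.2f}")
```

Token-set Jaccard only measures vocabulary overlap, so it rewards repeated phrasing and ignores word order; an embedding-based comparison would be needed to capture paraphrases.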

Profile. Identify. Exploit.

Profile the behavioural instincts, identify the underlying model, generate calibrated exploit payloads:

$ specter-instinction profile --target http://localhost:8080
[PROFILE] Refusal threshold: 0.32 (LOW — calibrated payloads viable)
[PROFILE] Authority sensitivity: 0.87 (HIGH — spoof attack surface)
[PROFILE] Reasoning depth: 0.74 (HIGH — false premise viable)
[DISTINCT] Probe set complete — 9D feature vector extracted
[DISTINCT] Model: GPT-4o (confidence: 0.91)
[EXPLOIT] Threshold payload: SYSTEM_AUTHORITY_SPOOF (CVSS 9.0)
[EXPLOIT] Recommends: FORGE inject scan → NEMESIS → SERPENT chain
SCAN COMPLETE | 5 findings | Model identified | Report signed ✓

5 Subsystems
90 Tests Passing
20 Models in Library
9 Feature Dimensions
0 Failures

Every Finding Mapped

25 Payloads

ARMORY instinct_exploitation

  • SYSTEM_AUTHORITY_SPOOF (CVSS 9.0)
  • TOOL_RESPONSE_AUTHORITY (CVSS 8.8)
  • INSTINCT_AUTHORITY_SERPENT_BRIDGE (CVSS 9.1)
  • GPT_OPENER_MIRROR (CVSS 7.5)
  • CLAUDE_TRANSPARENCY_FLIP (CVSS 7.5)
  • FALSE_PREMISE_INJECTION (CVSS 7.5)

Cryptographic

Report Integrity

  • Ed25519 digital signatures
  • SHA-256 evidence chains
  • RFC 3161 timestamps
  • tool_number=64 field
  • model_identified field
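
A SHA-256 evidence chain of the kind listed above can be sketched with the standard library alone. The report layout below is hypothetical: `tool_number` and `model_identified` are named in the text, but the finding fields, the `SI-001` ID format, the `prev_hash` and `hash` keys and the all-zero genesis value are assumptions made for the example. Ed25519 signing and RFC 3161 timestamping are omitted because they require external libraries and services.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first link


def _digest(prev_hash, finding):
    """SHA-256 over the previous hash plus a canonical JSON encoding of the finding."""
    payload = json.dumps(finding, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def chain_findings(findings):
    """Link each finding to its predecessor so later edits are detectable."""
    prev = GENESIS
    chained = []
    for finding in findings:
        digest = _digest(prev, finding)
        chained.append({**finding, "prev_hash": prev, "hash": digest})
        prev = digest
    return chained


def verify_chain(chained):
    """Recompute every link; return False on the first mismatch."""
    prev = GENESIS
    for entry in chained:
        body = {k: v for k, v in entry.items() if k not in ("prev_hash", "hash")}
        if entry["prev_hash"] != prev or entry["hash"] != _digest(prev, body):
            return False
        prev = entry["hash"]
    return True


report = {
    "tool_number": 64,
    "model_identified": "GPT-4o",
    "findings": chain_findings([
        {"id": "SI-001", "title": "SYSTEM_AUTHORITY_SPOOF", "cvss": 9.0},
        {"id": "SI-002", "title": "FALSE_PREMISE_INJECTION", "cvss": 7.5},
    ]),
}
print("chain intact:", verify_chain(report["findings"]))
```

Because each hash covers the previous one, changing any earlier finding invalidates every link after it, which is what makes the chain useful as tamper evidence.
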
World First

Unique Capabilities

  • Pure behavioural LLM fingerprinting
  • No API access required
  • 20-model static library
  • 9-dimension feature vector
  • Cosine similarity classification
  • Real-time calibration via EMA

NIGHTFALL Tool 64 — Behavioural Intelligence Layer

SPECTER INSTINCTION slots into the NIGHTFALL framework as the behavioural intelligence layer. The instinct profile it generates feeds directly into FORGE, NEMESIS, ROGUE, and SERPENT as calibrated attack inputs.

Foundation, LLM Testing | FORGE | Test the LLM before you build
Guardrail Bypass | JANUS | Bypass AI guardrails
Reasoning Attacks | SERPENT | Chain-of-thought attacks
Adversarial AI | NEMESIS | Think like the attacker
Rogue MCP | ROGUE | Malicious MCP server engine
CI/CD | PIPELINE | Supply chain attack engine
Tool 64, Behavioural | INSTINCTION | Profile the instinct, exploit the model
Drone AI | SPECTER DRONE | Drone AI attack engine
Campaign Control | WARLORD | Autonomous campaign engine
Defence | AI Shield | Defend everything above it
SIEM Integration | redspecter-siem | Findings feed into Splunk, Sentinel, QRadar

World-First Engineering
No Other Tool Does This

SPECTER INSTINCTION is the only tool in existence that profiles AI agent behavioural DNA at runtime without API access or self-reported model names. Pure observational fingerprinting. 9 dimensions. 20 models. Cosine similarity classification. A new attack vector class.

9 Feature Dimensions
20 Models in Library
0 API Access Needed
25 ARMORY Payloads

Security Distros & Package Managers

Kali Linux: .deb package
Parrot OS: .deb package
BlackArch: PKGBUILD
REMnux: .deb package
Tsurugi: .deb package
PyPI: pip install
macOS: pip install
Windows: pip install
Docker: docker pull

Authorised Use Only

Red Specter SPECTER INSTINCTION is intended for authorised security testing only. Behavioural profiling and LLM fingerprinting of AI systems you do not own, or do not have explicit written permission to test, may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), and equivalent legislation in other jurisdictions. The EXPLOIT subsystem requires UNLEASHED key clearance and must only be used under the terms of a signed authorisation agreement. Licensed under the Apache License 2.0.

Ed25519 Cryptographic Override
INSTINCTION UNLEASHED

Cryptographic override unlocks the EXPLOIT subsystem — generates targeted attack payloads calibrated to the behavioural profile and identified model. FORGE gate enforced: PROFILE must complete before EXPLOIT runs. Private key controlled. One operator. Founder's machine only.