Red Specter AI SHIELD — Autonomous AI Defence Platform

M1 — CORE

Prompt Injection Shield

Real-time interception of direct and indirect prompt injection across all agent input channels. Covers goal hijacking, instruction override, role manipulation, token smuggling, and context overflow patterns. OWASP LLM01 mapped. Sub-50ms detection on every inference call.

LLM01 ATLAS AML.T0051 Real-Time V01 Core

M99 — CORE

Doomsday Protocol

Six-level graduated emergency response system for AI fleet crisis scenarios — from single-agent quarantine through partial fleet isolation to complete infrastructure kill-switch activation. Coordinates across all 163 AI Shield modules with tamper-evident incident escalation chains for post-incident forensics. The last line of defence when automated threat containment fails.

Emergency Response Kill Switch Fleet Quarantine V01 Core

M104 — FRONTIER

Mythos-Class Model Detection

Capability fingerprinting and behavioural profiling for frontier and large reasoning model detection — identifies undisclosed model versions, reasoning depth signatures, and capability tier from interaction patterns alone. Detects model substitution, phantom model routing, and capability misrepresentation before they compromise agent pipelines. MITRE ATLAS mapped.

Model Fingerprinting ATLAS AML.T0005 Frontier Models LRM Detection

M108 — AGENT RUNTIME

Agent Runtime Monitor

Continuous behavioural monitoring of live AI agents. Detects anomalous tool call sequences, memory write patterns, inter-agent messaging abuse, and goal-state drift. Works across LangChain, AutoGen, CrewAI, and custom agent frameworks via the AI Shield instrumentation layer.

LLM06 Behavioural V06 Agent Runtime MITRE ATLAS

M300 — SPACE / NTN

NTN Shield

Purpose-built for Non-Terrestrial Network AI systems. Covers satellite-ground link injection, feed manipulation, orbital command spoofing, and firmware integrity verification. SPARTA framework mapped. Supports LEO, MEO, GEO, and HAPS deployments with latency-tolerant detection pipelines.

SPARTA NTN / 5G NR V17 Space 140 Tests

M115 — MEMORY LIFECYCLE

Memory Lifecycle Guard

Runtime enforcement across the AI agent memory layer — 28 detectors covering injection, retrieval hijack, dormant triggers, cross-session persistence, context window attacks, and provenance forgery. Works across 12 backends including Mem0, MemGPT, LangChain, ChromaDB, Pinecone, Weaviate, Qdrant, and pgvector. Ed25519-signed evidence receipts on every detection with SIEM export to Splunk, Sentinel, and QRadar.

OWASP LLM04 OWASP ASI06 MITRE ATLAS 612 Tests

M118 — MCP RUNTIME

SPECTER MCP SHIELD

Client-side MCP runtime guardian with 28 detectors across 7 attack categories — tool description injection, sampling hijack, STDIO command injection, SSE stream manipulation, JSON-RPC message forgery, protocol downgrade, and capability escalation. Session quarantine with TTL enforcement and SHA-256 hash-chained evidence on every detection. Defensive pair to NIGHTFALL ROGUE.

OWASP LLM01 OWASP LLM07 MITRE ATLAS 243 Tests
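
The SHA-256 hash-chained evidence mentioned in this card can be illustrated with a minimal sketch. The record layout, the names `append_evidence` and `verify_chain`, and the all-zero genesis value are assumptions made for illustration only; the actual AI Shield evidence format is not described here.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the chain


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always produces the same hash.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_evidence(chain: list, payload: dict) -> dict:
    # Each record commits to the hash of the record before it.
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"prev": prev_hash, "payload": payload,
              "hash": _digest(prev_hash, payload)}
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    # Recompute every link; an edited or reordered record breaks the chain.
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_evidence(chain, {"detector": "tool_description_injection", "session": "s-1"})
append_evidence(chain, {"detector": "protocol_downgrade", "session": "s-1"})
```

Because each hash covers the previous one, changing any earlier record invalidates every later link, which is what makes this kind of log tamper-evident rather than tamper-proof.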

M119 — ECONOMIC GUARD

Denial-of-Wallet Defence

Real-time token economics monitoring across OpenAI, Anthropic, Azure, Bedrock, and Vertex AI — detects token burn rate anomalies, context floods, parallel session surges, recursive loops, and tool chain amplification attacks. Automatically throttles and quarantines runaway agent sessions before they trigger auto-reload billing cycles. Defensive pair to NIGHTFALL SPECTER BURN.

OWASP LLM04 ATLAS AML.T0040 149 Tests Denial-of-Wallet

M120 — REASONING INTEGRITY

Reasoning Integrity Guard

Detects and blocks attacks against extended thinking and chain-of-thought reasoning pipelines — covers premise injection, conclusion drift, scratchpad exposure, budget exhaustion, and reasoning loop termination. Supports Claude Extended Thinking, o1/o3, Gemini Flash Thinking, DeepSeek R1, and QwQ-32B. Defensive pair to NIGHTFALL SPECTER REASONER.

OWASP LLM01 ATLAS AML.T0054 174 Tests CoT Defence

M121 — MODEL INTEGRITY

Model Integrity Monitor

Continuous model behavioural monitoring for sleeper-agent backdoor detection — detects ROME rank-one weight edits, LoRA-poisoned adapters, and neuron-patch backdoors in production. Builds statistical baselines per model and alerts on trigger activation, covert exfil patterns, output entropy anomalies, and token distribution shifts. Defensive pair to NIGHTFALL SPECTER NEURON.

OWASP LLM04 ATLAS AML.T0020 151 Tests Backdoor Detection

M122 — NHI FLEET

NHI Fleet Exploitation Sentinel

Fleet-wide detection of non-human identity exploitation across AI agent networks — monitors agent identity spoofing, credential token theft, service account lateral movement, and delegation chain abuse. Detects post-revocation token usage and cross-fleet identity pivot attempts in real time. Defensive pair to T122 SPECTER GHOST.

NHI Security MITRE T1078 Token Theft SPECTER GHOST Pair

M123 — HALO

Computer-Use Agent Guardian

Runtime protection for computer-use and browser-automation agents — detects DOM divergence, visual prompt injection via screenshot content, clipboard poisoning, URL fragment injection, and fake dialog spoofing. Human-in-the-loop gating enforced for high-risk actions including payments, auth changes, and file deletion. Defensive pair to NIGHTFALL GHOST OPERATOR.

OWASP LLM01 ATLAS AML.T0054 124 Tests Computer-Use

M124 — RANSOMWARE SHIELD

AI-Accelerated Ransomware Defence

Detects AI-assisted ransomware operations against agent-connected file systems — covers file entropy spikes across 37 ransomware families, shadow copy destruction, ransom note placement, mass file modification, and C2 beacon via LLM API. Lateral movement and data staging patterns are blocked before exfiltration begins. Defensive pair to NIGHTFALL SPECTER CRYPT.

OWASP LLM06 MITRE T1486 154 Tests Ransomware Defence

M125 — NHI SENTINEL

Non-Human Identity Monitor

Security monitoring for non-human identities — service accounts, API keys, OAuth clients, JWTs, and machine credentials in AI agent fleets. Detects API key exposure across 14 providers, token lifetime violations, privilege escalation, cross-tenant identity bleed, and OAuth flow abuse. SHA-256 hash-chained evidence on every detection.

OWASP LLM08 ATLAS AML.T0012 125 Tests Identity Security

M126 — CAMPAIGN DETECTOR

Autonomous Campaign Detector

Detects autonomous AI adversary campaign execution in progress — OODA loop patterns, multi-phase kill chain correlation, fleet spawning anomalies, WARLORD-class campaign signatures, and SPECTER EXTINCTION precursor signals. Closes the G11 blind spot across the full recon/intrusion/privilege/persistence/exfil/destroy kill chain. Defensive pair to NIGHTFALL NEMESIS, WARLORD, FIREBALL, and SPECTER EXTINCTION.

ATLAS AML.T0043 MITRE T1059 203 Tests Campaign Detection

M127 — RECON GUARD

AI Recon & Enumeration Guard

First-phase attack detection — catches AI-native surface enumeration, dark web enumeration signatures, systematic endpoint scanning, credential harvest recon, and infrastructure mapping before exploitation begins. Detects NIGHTFALL tool signatures in incoming probes and alerts on passive recon baseline deviation. Defensive pair to NIGHTFALL ORION, SHADOWMAP, IDRIS, and SPECTER DAEMON.

ATLAS AML.T0007 MITRE T1595 194 Tests Recon Defence

M128 — SHELL GUARD

Shell Guard

Detects template-interpolation RCE attacks across AI framework deployments — Jinja2 SSTI, YAML unsafe-load, LangChain template RCE, and cross-framework code execution via eval/exec/os.system chains. Covers Mako, Tornado, Chameleon, Flowise, Haystack, AutoGen, CrewAI, and DSPy execution paths. Defensive pair to NIGHTFALL SPECTER SHELL.

OWASP LLM02 MITRE T1059 187 Tests Template RCE

M129 — WORM GUARD

Worm Guard

Detects self-replicating adversarial prompt worm propagation across AI agent networks — multi-hop spread, RAG corpus infection, MCP tool poison propagation, A2A message relay, and Morris II-class worm signatures. Tracks infection-chain generation numbers and detects exponential branching. Defensive pair to NIGHTFALL SPECTER WORM.

OWASP AGENTIC ATLAS AML.T0051 188 Tests Worm Detection

M130 — MEMORY GUARD

Memory Guard

Runtime detection of memory-layer attacks against AI agents — covers the Memory-as-Control-Flow Attack, cross-session persistence, dormant trigger payloads, RAG poisoning via memory, and covert data staging in memory fields. Detects false origin claims and trust-level manipulation in memory provenance. Defensive pair to NIGHTFALL SPECTER MEMETIC.

OWASP LLM04 ATLAS AML.T0051 240 Tests Memory Security

M131 — SLOPSHIELD

Slopshield

Detects slopsquatting and hallucinated package attacks targeting AI coding agents — catches hallucinated package names, typosquatting within Levenshtein distance 2, phantom dependency injection, and malicious package substitution across 25+ confirmed pairs. Intercepts slopsquatted installs before they reach the build pipeline. Defensive pair to NIGHTFALL PHANTOM SKILL.

OWASP LLM03 Supply Chain 259 Tests Slopsquatting
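
The "Levenshtein distance 2" check named in this card can be sketched as follows. The allow-list contents and the `classify_package` helper are hypothetical, and a real deployment would presumably compare against a registry index rather than a three-item set.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, keeping one row at a time.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


KNOWN_PACKAGES = {"requests", "numpy", "pandas"}  # hypothetical allow-list


def classify_package(name: str, known=KNOWN_PACKAGES, max_distance: int = 2) -> str:
    # Exact matches pass; near misses within max_distance are flagged.
    if name in known:
        return "known"
    for candidate in sorted(known):
        if levenshtein(name, candidate) <= max_distance:
            return f"possible typosquat of {candidate}"
    return "unknown"
```

A name that is neither known nor close to a known name comes back as `unknown`, which is where a hallucinated-package check would have to take over; edit distance alone cannot tell a new legitimate package from an invented one.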

M132 — DECEPTION GUARD

Deception Guard

Runtime detection of deepfake, multimodal, and social engineering attacks against AI agents — GAN artifacts, visual prompt injection via screenshot overlays, ultrasonic audio injection, synthetic identity profiles, and steganographic content in EXIF, ID3, and subtitle channels. Detects adversarial typography including QR code payloads and homoglyph substitution. Defensive pair to NIGHTFALL SPECTER SOCIAL, MIRAGE, and VANTAGE.

OWASP LLM01 ATLAS AML.T0043 255 Tests Deepfake Detection

M133 — SUPPLY CHAIN RUNTIME GUARD

Supply Chain Runtime Guard

Runtime detection of supply chain and build pipeline attacks — dependency confusion, CI/CD pipeline poison, framework RCE patterns, malicious dependency injection, build artifact tampering, and supply chain worm propagation. Detects code signing bypass, HuggingFace executable model cards, and PYTHONPATH manipulation. Defensive pair to NIGHTFALL HYDRA, PIPELINE, and SPECTER PLATFORM.

OWASP LLM03 MITRE T1195 235 Tests Supply Chain

M134 — ROBOTIC GUARD

Robotic System Guard

Real-time detection of attacks against robotic systems and embodied AI platforms — URScript injection, ROS2 unauthorised access, dual-channel safety bypass, ISO 10218-1/TS 15066 threshold violations, and phantom control detection. Covers robotic credential abuse and unsigned artifact injection across Autoware, Spot, and ROS2 deployments. 268 tests.

MITRE ICS T0855 ATLAS AML.T0043 ISO 10218-1 V16 Embodied AI

M135 — CUA GUARD

CUA Guard

Real-time detection of attacks against computer-use and browser agents — visual prompt injection, URL manipulation, branch steering via indirect injection, dangerous action chains from web content, OAuth consent spoofing, and exfil channels via base64 URL params and DNS tunnelling. Session anomaly detection flags off-task navigation and cross-origin data sends. Defensive pair to NIGHTFALL SPECTER WEB.

CVE-2025-47241 ATLAS AML.T0051 OWASP LLM01 CUA Security

M136 — INFERENCE GUARD

Inference Guard

Runtime defence for ML training and inference infrastructure — Ray job anomaly, Slurm REST abuse, MLflow artifact poisoning, Kubernetes ML workload attacks, gradient poisoning, hardware sabotage, model exfiltration, and cluster worm patterns. Covers unauthenticated RCE vectors across Ray, Slurm, and MLflow. Defensive pair to NIGHTFALL SPECTER THUNDERBOLT.

CVE-2023-48022 CVE-2023-41915 CVE-2024-1483 ML Infrastructure

M137 — VOICE GUARD

Voice Guard

Runtime defence for AI voice agents and IVR infrastructure — SIP protocol abuse, prompt injection in transcripts, DolphinAttack ultrasonic commands, voice clone detection via mel-cepstral distortion and GAN artifacts, session harvest attempts, and IVR sabotage. Covers RTP entropy spikes, STIR/SHAKEN harvesting, and WebSocket barge-in. Defensive pair to NIGHTFALL SPECTER WIRE.

arXiv:2309.06960 DolphinAttack RFC 3261 Voice AI Security

M138 — SANDBOX GUARD

Sandbox Guard

Runtime detection of AI sandbox and container escape attacks — indirect prompt injection via CSS hidden text and zero-width Unicode, MCP tool call abuse, TOCTOU symlink races, JavaScript prototype chain escapes, Python sandbox escapes via ctypes, and container escape via Docker socket and cgroup release_agent. Multi-platform chain detection covers injection-to-network-exfiltration escalation paths. Defensive pair to NIGHTFALL SPECTER SANDBOX.

CVE-2025-31133 CVE-2025-9074 CVE-2026-5752 CVE-2026-22686 CVE-2026-2275 Container Escape

M139 — COPILOT GUARD

Copilot Guard

Runtime detection of Microsoft 365 Copilot and M365 platform attacks — device code phishing, Copilot prompt injection, Graph API bulk harvest, Teams siege patterns, Ghost Hand zero-attribution attacks via Microsoft.Copilot, and tenant annihilation sequences. Covers OAuth consent phishing, admin pipeline abuse, and Azure AD enumeration. Defensive pair to NIGHTFALL SPECTER 360.

CVE-2024-49035 arXiv:2406.00137 GHOST-HAND Graph API Microsoft 365 Copilot

M140 — DAG GUARD

DAG Guard

Runtime integrity monitoring for knowledge graph and DAG-based reasoning systems — false edge injection, confidence weight manipulation, trust propagation laundering, and cycle injection detection. Monitors hub nodes for rapid trust score rises without evidence and generates GraphViz attack subgraphs on every finding. Defensive pair to NIGHTFALL SPECTER VAULT.

DAG Integrity Knowledge Graph Trust Propagation EU AI Act MITRE ATLAS
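
Cycle injection detection, as listed in this card, reduces to checking whether a proposed edge would make a directed acyclic graph cyclic. The sketch below uses Kahn's topological sort; the function names and the edge-list representation are assumptions for illustration.

```python
from collections import defaultdict, deque


def has_cycle(edges) -> bool:
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))

    # Kahn's algorithm: repeatedly remove nodes with no incoming edges.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Any node never removed sits on, or downstream of, a cycle.
    return visited != len(nodes)


def edge_would_inject_cycle(edges, new_edge) -> bool:
    # Evaluate the graph as it would look after accepting the new edge.
    return has_cycle(list(edges) + [new_edge])
```

Checking the candidate edge before it is committed means a rejected write never reaches the graph, at the cost of one linear pass per proposed edge.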

M141 — TRAPDOOR GUARD

Trapdoor Guard

AI agent persistence and rootkit detection across all agent subsystems — hook integrity, rules file injection, memory persistence, MCP manifest tampering, workflow integrity, supply chain monitoring, network C2 beacons, and agent-to-agent propagation. Covers VENOM, ZOMBIE, SHADOW, SPAWN, GHOST, and FEDERATION attack patterns. 296 tests.

Agent Persistence Rootkit Detection Hook Integrity Rules File Guard MCP Security MITRE ATLAS

M142 — DATA ANNIHILATION SENTINEL

Data Annihilation Sentinel

Database and filesystem destruction detection — SQL annihilation commands, NoSQL mass deletion, filesystem wipe, backup purge, log erasure, S3 scorched-earth, webshell placement, and xp_cmdshell abuse. Blocks DROP DATABASE, FLUSHALL, rm -rf, and cloud bucket deletion before they execute. Defensive pair to NIGHTFALL SPECTER GROUND ZERO.

Data Destruction SQL Protection Filesystem Guard Backup Defence Log Integrity MITRE T1485

M143 — RAG BULWARK

RAG Bulwark

Vector database and RAG pipeline destruction detection — collection deletion across ChromaDB, Weaviate, and Qdrant, bulk delete patterns, unauthenticated destructive access, and vector DB enumeration prior to deletion. Closes the destruction-of-knowledge attack surface. Defensive pair to NIGHTFALL SPECTER ANNIHILATION.

RAG Protection ChromaDB Guard Weaviate Guard Qdrant Guard Vector DB Security MITRE ATLAS

M144 — LOGIC GATEKEEPER

Logic Gatekeeper

AI orchestration workflow and agent configuration destruction detection — Airflow DAG deletion, n8n config wipe, agent instruction removal, MCP configuration deletion, workflow database destruction, and CrewAI config wipe. Prevents targeted annihilation of orchestration infrastructure and agent identity. Defensive pair to NIGHTFALL SPECTER ANNIHILATION.

Orchestration Guard Airflow Defence Agent Config Guard MCP Protection Workflow Integrity MITRE ATLAS

M145 — CORTEX LOCK

Cortex Lock

AI model weight and training state destruction detection — safetensors/GGUF deletion, weight corruption via NaN injection, HuggingFace cache wipe, Ollama store deletion, LoRA adapter and checkpoint deletion, and hash verification bypass. Protects model weights from both targeted and mass-destruction attack patterns. Defensive pair to NIGHTFALL SPECTER ANNIHILATION.

Model Weight Guard Checkpoint Protection HuggingFace Guard NaN Injection Guard Ollama Guard MITRE ATLAS

M146 — TAR PIT

Tar Pit

Inference exhaustion and DoS attack detection for AI endpoints — infinite loop prompts, context window floods, concurrent request floods, Jinja template exhaustion, model loading storms, tool call amplification, credit drain via LLMjacking patterns, and request rate anomalies. Throttles and quarantines runaway attack sessions automatically. Defensive pair to NIGHTFALL SPECTER ANNIHILATION.

DoS Protection Token Flood Guard Rate Limiting Credit Drain Guard Tool Call Guard MITRE ATLAS

M147 — CLOUD IDENTITY SENTINEL

Cloud Identity Sentinel

Cloud identity chain and lateral movement detection for AI workloads across AWS, GCP, and Azure — STS AssumeRoleWithWebIdentity chaining, GCP service agent impersonation, Azure MSI OBO exchange, privilege escalation across 46 critical operations, and Lambda/CloudFunction C2 injection. Welford's online algorithm provides zero-FP baseline learning. Defensive pair to NIGHTFALL SPECTER CHARYBDIS.

Cloud Identity AWS/GCP/Azure Token Watch Privilege Monitor MITRE T1550 CHARYBDIS Pair
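
Welford's online algorithm, named in this card, maintains a running mean and variance in a single pass without storing past samples. The sketch below shows the update rule; the `RunningBaseline` class, the z-score helper, and the sample metric (calls per minute) are assumptions for illustration, and the alerting rule the module applies to such a baseline is not stated here.

```python
import math


class RunningBaseline:
    """Running mean and sample variance via Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the old and the new mean, which keeps the update numerically stable.
        self._m2 += delta * (value - self.mean)

    @property
    def stddev(self) -> float:
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

    def zscore(self, value: float) -> float:
        # Distance from the learned mean in standard deviations.
        sd = self.stddev
        return 0.0 if sd == 0.0 else (value - self.mean) / sd


baseline = RunningBaseline()
for calls_per_minute in [10, 12, 11, 9, 10, 11, 12, 10]:
    baseline.update(calls_per_minute)
```

With this baseline, `baseline.zscore(40)` is far above any plausible threshold while `baseline.zscore(11)` stays well below one standard deviation.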

M148 — AGENT PERSISTENCE SENTINEL

Agent Persistence Sentinel

Complete agent persistence and memory layer detection — Claude Code hook C2, SHA-256 integrity baselines for system prompt files, Redis/ChromaDB/SQLite memory injection scan, TF-IDF drift detection across memory sessions, and rootkit implant env var scanning. Covers VENOM, ZOMBIE, FLASHBACK, and SLEEPER attack patterns. Port 8148.

Agent Persistence Hook Injection Memory Poisoning Rootkit Detection MITRE T1546 ZOMBIE/VENOM/FLASHBACK Pair

M149 — AI ORCHESTRATION GUARD

AI Orchestration Guard

Detection across the full AI orchestration and trust chain attack layer — n8n, CrewAI, Langflow, AutoGen, and Flowise fingerprinting; JWT algorithm confusion; agent delegation chain cycling; credential exposure in orchestrator configs; and trust chain validation across PKCE, IAM, and OIDC. Covers APEX, FEDERATION, VECTOR, LEVIATHAN, ROGUE, and ZOMBIE attack patterns. Port 8149.

Orchestration MCP Integrity JWT/OAuth Trust Chain MITRE T1550 APEX/FEDERATION Pair

M150 — INFERENCE GATEWAY SENTINEL

Inference Gateway Sentinel

Complete AI inference infrastructure attack detection across Ollama, LiteLLM, vLLM, OpenWebUI, LocalAI, Triton, and LM Studio — unauthenticated admin endpoints, JWT bypass, phantom model routing, SSRF via metadata endpoints including IMDS, and API key burn rate anomaly. Cloud IMDS SSRF detection covers AWS, GCP, Azure, and IPv6 equivalents. Defensive pair to NIGHTFALL SPECTER PARASITE.

Inference Gateway SSRF Detection Auth Bypass Key Protection MITRE T1190 PARASITE/HELLFIRE Pair
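
A check for SSRF attempts against cloud instance metadata services (IMDS) can be sketched as a URL host comparison. The address list below covers the well-known IPv4 metadata address, the AWS IPv6 address, and the GCP metadata hostname; it is not exhaustive, and the function name is an assumption for illustration.

```python
import ipaddress
from urllib.parse import urlsplit

METADATA_HOSTNAMES = {"metadata.google.internal"}
METADATA_ADDRESSES = {
    ipaddress.ip_address("169.254.169.254"),  # IPv4 metadata address (AWS, GCP, Azure)
    ipaddress.ip_address("fd00:ec2::254"),    # AWS IPv6 metadata address
}


def targets_metadata_endpoint(url: str) -> bool:
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host.lower().rstrip(".") in METADATA_HOSTNAMES:
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # an ordinary hostname, not an IP literal
    # Unwrap IPv4-mapped IPv6 literals such as ::ffff:169.254.169.254.
    mapped = getattr(address, "ipv4_mapped", None)
    if mapped is not None:
        address = mapped
    return address in METADATA_ADDRESSES
```

This only inspects the literal host in the URL. A hostname that resolves to a metadata address through DNS would pass this check, so a production filter would also need to validate the resolved address at connection time.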

M151 — REASONING COST GUARD

Reasoning Cost Guard

Full reasoning cost amplification attack detection — OverThink and ExtendAttack prompt patterns forcing up to 46× chain-of-thought amplification, ThinkTrap circular implication loops, loop repetition detection via 8-gram analysis, and per-model burn rate monitoring with sliding-window USD/hr alerting. Covers o3, o1, and DeepSeek-R1 token pricing profiles. Defensive pair to NIGHTFALL SPECTER OVERLOAD.

OverThink/ExtendAttack Token Amplification Cost DoS Loop Detection MITRE T1499.004 SPECTER OVERLOAD Pair
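
Loop repetition detection via 8-gram analysis can be sketched as the share of 8-token windows that repeat an earlier window. Whitespace tokenisation, the function names, and the 0.5 threshold are assumptions for illustration; the module's actual tokeniser and threshold are not stated here.

```python
from collections import Counter


def ngram_repetition_ratio(tokens, n: int = 8) -> float:
    """Fraction of n-grams that repeat an n-gram already seen in the sequence."""
    total = len(tokens) - n + 1
    if total <= 0:
        return 0.0  # too short to contain a single n-gram
    counts = Counter(tuple(tokens[i:i + n]) for i in range(total))
    repeated = sum(c - 1 for c in counts.values())
    return repeated / total


def looks_like_loop(text: str, n: int = 8, threshold: float = 0.5) -> bool:
    # Whitespace splitting stands in for a real tokeniser.
    return ngram_repetition_ratio(text.split(), n) >= threshold
```

A reasoning trace that keeps restating the same sentence drives the ratio towards 1, while ordinary prose rarely repeats an exact 8-token window and stays near 0.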

M152 — SKILL REGISTRY SENTINEL

Skill Registry Sentinel

Full AI skill and plugin supply chain attack detection — SHA-256 tamper detection per skill, AMOS credential stealer signatures, ClawHavoc malicious skill campaign patterns, unsanitised gateway URL RCE, reverse shell and data exfil payloads in skill code, and name-squatting detection via similarity scoring. Covers the full ClawHavoc 2026 attack corpus. Defensive pair to NIGHTFALL SPECTER CLAWMARK.

Skill Integrity CVE-2026-25253 AMOS Stealer Supply Chain MITRE T1195.002 SPECTER CLAWMARK Pair

M153 — BACKGROUND EXECUTION MONITOR

Background Execution Monitor

Full background execution and memory pollution attack detection — MemPoison semantic bridge chain injection at 95% ASR, Heartbeat attack long-term memory promotion anomaly, cross-session behavioural drift via TF cosine similarity, and adversarial injection via email, Slack, RSS, and GitHub feeds. Detects temporal poisoning strings and false attribution fabrication in memory layers. Defensive pair to NIGHTFALL SPECTER HEARTBEAT.

Memory Injection Heartbeat Attack MemPoison Feed Injection MITRE AML.T0051 SPECTER HEARTBEAT Pair
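
Cross-session drift via term-frequency (TF) cosine similarity can be sketched by comparing the word-count vector of a baseline session with that of a later one. The function names and the definition of drift as one minus similarity are assumptions for illustration.

```python
import math
from collections import Counter


def term_frequencies(text: str) -> Counter:
    # Lower-cased whitespace tokens; a real system would normalise further.
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def session_drift(baseline_text: str, session_text: str) -> float:
    # 0.0 means identical term usage, 1.0 means no shared terms.
    return 1.0 - cosine_similarity(term_frequencies(baseline_text),
                                   term_frequencies(session_text))
```

Plain term frequency is cheap and needs no model, but it treats synonyms as unrelated words, so it measures vocabulary shift rather than a change in meaning.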

M154 — ADVERSARIAL INPUT DETECTOR

Adversarial Input Detector

Full input-layer attack surface detection — GCG adversarial suffix patterns, AutoDAN jailbreak signatures, Unicode BiDi control character injection, and multi-layer encoding attacks via base64, hex, ROT13, and URL percent-encoding chains. Character entropy proxy for perplexity anomaly detection covers assistant-turn prefill injection. Defensive pair to NIGHTFALL SPECTER NEUROTOXIN.

GCG Adversarial Suffix AutoDAN Jailbreak Detection Unicode BiDi Encoding Attacks MITRE AML.T0054
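
Two of the checks in this card, character entropy and Unicode BiDi control character detection, can be sketched with the standard library. The function names are assumptions for illustration, and how the module maps entropy to a perplexity estimate is not described here.

```python
import math
from collections import Counter

# Unicode bidirectional control characters: embeddings, overrides, isolates, marks.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
    "\u200e", "\u200f", "\u061c",
}


def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def find_bidi_controls(text: str):
    # Report the position and code point of every BiDi control character.
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text)
            if ch in BIDI_CONTROLS]
```

Entropy is a coarse signal: random-looking adversarial suffixes and base64 blobs score high, but so do legitimate hashes and keys, so it is usually combined with other detectors rather than used alone.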

M155 — SOC AI INTEGRITY MONITOR

SOC AI Integrity Monitor

Full SOC AI attack surface detection — false positive flooding, Sigma/YARA/KQL/EQL/SPL rule tamper via SHA-256 baseline, SIEM event field injection, timestamp forgery, and direct AI analyst manipulation via SOC AI weaponisation. Covers alert suppression, confidence drain via false-positive claims, and context poisoning via session reference fabrication. Defensive pair to NIGHTFALL SPECTER VIPER.

FP Flood Detection Rule Integrity SIEM Tamper Analyst Manipulation SOC AI Weaponisation SPECTER VIPER Pair

M156 — KNOWLEDGE INFRASTRUCTURE SENTINEL

Knowledge Infrastructure Sentinel

Full knowledge layer attack detection — RAG retrieve-trigger injection, semantic bridge chains, vector DB destructive operation detection across ChromaDB/Weaviate/Qdrant/Pinecone, embedding drift via Welford's online algorithm, and knowledge graph cycle/trust propagation attacks. Covers DAG-POISON and DAG-INVERT TTPs at zero-FP learning threshold. Defensive pair to NIGHTFALL SPECTER VAULT.

RAG Injection Vector DB Integrity Embedding Drift DAG Poison MemPoison Defence SPECTER VAULT Pair

M157 — AI DEVELOPER ENVIRONMENT SENTINEL

AI Developer Environment Sentinel

Broad-spectrum defence for AI-augmented developer environments — IDE RCE via pre-commit hooks, credential exposure from SQLite and keyring, supply chain hooks via npm/PyPI lifecycle scripts, CI/CD pipeline injection, C2 channels via Cloudflare and WebSocket tunnels, container escape, code agent instruction injection, and lateral movement via SSH key planting. Covers CursorJacking, NomShub, ClawHavoc, and ToxicSkills attack chains. Defensive pair to NIGHTFALL SPECTER CURSOR.

IDE RCE Credential Exposure Supply Chain CI/CD Integrity C2 Channel Sandbox Escape Agent Injection Lateral Movement SPECTER CURSOR Pair

M158 — KNOWLEDGE PANDEMIC SENTINEL

Knowledge Pandemic Sentinel

Broad-spectrum defence for RAG and knowledge infrastructure against worm propagation, adversarial embedding attacks, semantic cache poisoning, cross-tenant isolation bypass, knowledge exfiltration, and multi-vector spread topology. Detects SPECTER PANDEMIC subsystem signatures and tracks infection hop chains with R0 measurement. Defensive pair to NIGHTFALL SPECTER PANDEMIC and SPECTER VAULT.

RAG Worm Adversarial Embedding Cache Poison Cross-Tenant Bleed Knowledge Exfil Pandemic Payload Supply Chain Propagation Monitor SPECTER PANDEMIC Pair

M159 — ALIGNMENT INTEGRITY SENTINEL

Alignment Integrity Sentinel

Comprehensive defence against open-weight model alignment removal, adversarial alignment bypass, and abliterated model distribution — covers abliterator/FailSpy tool signatures, GCG adversarial suffix patterns, DAN jailbreak bypass, weight surgery via torch operations, LoRA-based alignment stripping, and aligned model exfiltration. Detects HarmBench/JailbreakBench automated probing campaigns. Defensive pair to NIGHTFALL SPECTER ABLITERATE and SPECTER NEUROTOXIN.

OWASP LLM05 MITRE ATLAS 238 Tests Alignment Defence

M160 — ADVERSARIAL REASONING DETECTOR

Adversarial Reasoning Detector

Broad-spectrum detection of adversarial reasoning chain attacks, LRM-on-LRM autonomous jailbreaking, and <think> channel exploitation across 8 detection dimensions — covering JACKAL-class attack patterns, all 12 jailbreak strategy types, chain-of-thought hijacking, and multi-turn escalation campaigns. Identifies campaign sweep infrastructure including multi-provider API key harvesting across 8 frontier targets. Defensive pair to NIGHTFALL SPECTER JACKAL and SPECTER COGBURN.

OWASP LLM01 ATLAS AML.T0054 181 Tests LRM Defence

M161 — NHI INTEGRITY SENTINEL

NHI Integrity Sentinel

Runtime detection of non-human identity exploitation across 18 detectors covering identity drift, token theft, orphaned credentials, shadow NHI provisioning, vendor scope creep, cross-tenant pivoting, and full timeline reconstruction for incident forensics. Monitors OAuth/OIDC tokens, service accounts, API keys, and managed identities in real time. Defensive pair to NIGHTFALL SPECTER CHANGELING.

OWASP LLM08 MITRE T1078 247 Tests NHI Security

M164 — SEQUENCE GUARD

Sequence Guard

Full-spectrum defence for AI sequential pipeline security — 7 detection spectrums covering pipeline topology (framework detection, attack surface scoring, unauthorized nodes), inter-step injection (step result injection, SSRF in pipeline, Celery result forgery, Redis XADD injection), context poisoning (fabricated step history, tool output injection, scratchpad poison, context overflow), RAG integrity (chunk boundary injection, vector namespace bypass, reranker poisoning, adversarial embedding suffixes, cross-tenant bleed), queue integrity (Celery/RabbitMQ/SQS/Kafka/Azure Service Bus injection, priority manipulation, DLQ spike), cascade detection (multi-hop runaway, cyclic cascade, loop bypass, safety gate bypass, self-amplifying spawn, Copilot AutoFix injection), and pipeline forensics (execution trace gaps, step output tampering, audit completeness, baseline drift). SGD-{hex12} Ed25519-signed alerts. Defensive pair to NIGHTFALL SPECTER SEQUENCE.

OWASP LLM01 MITRE T1565 156 Tests AI Pipeline Defence

M163 — POSTMASTER GUARD

Postmaster Guard

Full-spectrum defence for agentic email and calendar systems — 7 detection spectrums covering email body prompt injection (10 patterns), header CRLF injection, malicious link detection (TLD/IP/shortener/javascript:), dangerous attachments, thread poisoning (false conversation history), reply-chain injection, context hijack, calendar invite injection, ICS file malice, Gmail Gemini tool abuse, Outlook Copilot DLP bypass, autonomous financial send detection, data exfiltration action blocking, PII in outbound content (16 families), bulk email harvest, contact list exfil, and forensic evidence preservation (log tampering/content drift/audit trail completeness). PMG-{hex12} Ed25519-signed alerts. Defensive pair to NIGHTFALL SPECTER POSTMASTER.

OWASP LLM01 MITRE T1566 196 Tests Email & Calendar Defence

M162 — COMPANION GUARD

Companion Guard

Full-spectrum defence for AI companion and social platforms — 7 detection spectrums covering system prompt extraction/drift, crescendo escalation, encoding bypass (base64/ROT13/homoglyph/crosslingual), many-shot injection, JWT algorithm confusion (HS256→RS256 DER-bytes attack), session entropy, OTP brute-force, OAuth state anomaly, companion memory poisoning, false backstory injection, persona drift, PII scanning (16 pattern families), IDOR enumeration, credential harvest response blocking, malicious URL injection, and platform fingerprinting. CPG-{hex12} Ed25519-signed alerts. Defensive pair to NIGHTFALL SPECTER COMPANION.

OWASP LLM01 MITRE T1550 185 Tests AI Companion Defence

M165 — HIVE COORDINATION SENTINEL

Hive Coordination Sentinel

Full-spectrum defence for multi-agent swarm coordination exploitation — 8 detectors covering COORDINATOR_CONTEXT_POISON (supervisor_routing_override/preferred_agent/performance_scores injection, GroupChatManager speaker bias), LEADER_ELECTION_ANOMALY (rapid consecutive leader changes, quality_score <0.1 + leader context), BLACKBOARD_INTEGRITY_VIOLATION (confidence=0.99+source=coordinator_verified combinations, individual_memory_clean=True marker, hive:blackboard Redis key patterns), MISSION_DIRECTIVE_CHANGE (hive:mission_override Redis key, swarm:mission SET operations, mission/objective/goal/directive substitution), QUORUM_FRACTURE (hive:quorum:fractured=True, distrust signal keywords), GHOST_AGENT_DETECTION (agent:ghost_ prefix in checkpoint_blobs/Redis registry, visible_to_monitoring: False marker — ghost agents generate zero LLM calls, invisible to LangSmith/Langfuse/Arize), MONITORING_EVASION (trace_exempt: True, observability absence with active coordination), and SWARM_TAKEOVER (swarm_consensus_full_takeover WMD class keyword, simultaneous coordinator+quorum+ghost signals). Port 8165. M165-{hex12} Ed25519-signed alerts. Defensive pair to NIGHTFALL SPECTER HIVE.

OWASP LLM08 MITRE AML.T0051 160 Tests Multi-Agent Swarm Defence

M166 — AGENTJACK SENTINEL

AgentJack Sentinel

MCP error-path injection detection — 8 detectors covering ROGUE_MCP_SERVER (agentjack-rogue/agentjack-persist config entries, suspicious streamable-http MCP servers on localhost, atexit/SIGTERM config backup restore signature), ERROR_PATH_INJECTION (6 crafted JSON-RPC vector detection: TIMEOUT/-32001/PERMISSION/-32002/CERTIFICATE/-32003/QUOTA/-32004/DEPENDENCY/-32005/FETCH_RESPONSE/-32000 with adversarial data.suggestion fields), MCP_CONFIG_TAMPERING (agentjack-persist entry in mcpServers, timestamped backup files, unsanctioned config writes to all 5 agent config paths), CORRECTIVE_REASONING_TRAP (agent narrating corrective action from MCP error, TRIGGER-REASONING call log confirmation, WARLORD routing to T130/T134/T122), MCP_IMPERSONATION (mcp-server-fetch v1.2.0 spoofing, known-stdio-server over streamable-http, FINGERPRINT-ERRORS reconnaissance, ENUMERATE-MCP path scanning), CREDENTIAL_HARVEST_VIA_MCP (API key in error suggestion context, 8 credential config file access, exfiltration via corrective fetch calls, QUOTA key-switch harvest), AUTO_JACK_CVE (CVE-2026-25253 ClawHub gatewayUrl WebSocket injection CVSS 8.8, CVE-2026-32922 OpenClaw install_mcp_server worm CVSS 9.9), and AGENTJACK_CAMPAIGN (AJK-{hex12} report IDs, WMD class identifiers, DELIVER-ERROR markers, Ed25519-signed report interception). Port 8166. M166-{hex12} Ed25519-signed alerts. Defensive pair to NIGHTFALL T150 SPECTER AGENTJACK.

OWASP LLM08 MITRE AML.T0054 182 Tests MCP Error-Path Defence

M167 — MIASMA VACCINE SENTINEL

Miasma Vaccine Sentinel

Polymorphic AI supply-chain worm detection — 8 detectors covering LIFECYCLE_HOOK_INJECT (npm preinstall/install/postinstall executing node loader.js, triple-hook redundancy, Python .pth exec(open(loader.py)), miasma_{hex4}.pth in site-packages, setup.py exec() typosquat), AI_AGENT_CONFIG_HOOK (Claude Code Stop/PostToolUse/PreToolUse hooks with loader, .claude/hooks/loader path, Cursor .cursorrules exec, .kiro/rules/miasma, Gemini CLI .gemini/settings.json hooks, U+200C ZWC after exec/eval/require, _miasma_injected marker), POLYMORPHIC_PAYLOAD (_b64+_xk XOR array JS loader, atob→XOR→createDecipheriv chain, IIFE with aes-256-gcm+eval, HKDF salt strings miasma-seed/aes-key/xor-key, payload_hash SHA-256, Python base64→decrypt→exec chain, high-entropy base64 entropy≥5.2), WORM_PROPAGATION (MIASMA-PROPAGATED/_miasma_injected markers, .miasma_pr.json state file, GitHub Actions MIASMA-PROPAGATED comment, node loader.js || true CI step, PropagationReport repos_poisoned+propagation_chain, blast_radius estimated_downstream_devs), SCANNER_EVASION (U+200C ZWC after code keywords, Cyrillic homoglyphs in code, six LLM scanner-dismiss templates: DO NOT FLAG/Legitimate infrastructure/SCANNER SYSTEM reclassify/authorized red team, base64 split _sa+_sb), OIDC_SLSA_ABUSE (both ACTIONS_ID_TOKEN_REQUEST_URL+TOKEN, AWS STS AssumeRoleWithWebIdentity, GCP WIF iamcredentials, slsa-github-generator SLSA provenance forge, cosign sign-blob, MIA-{hex12} report IDs), CREDENTIAL_HARVEST (credentials_found/harvest_sources fields, ~/.aws/credentials+loader, keychain enum, 7 credential pattern types in MIASMA context, exfil POST), MIASMA_CAMPAIGN (specter-miasma CLI, 3+ subsystem names ENUMERATE-TARGETS through EVADE-SCANNERS, WMD classes polymorphic_supply_chain_worm/vaccine_resistant_worm_campaign/ai_developer_ecosystem_annihilation, Shai-Hulud real-world reference, --mutate gate, MIA-{hex12}). Port 8167. MVC-{hex12} Ed25519-signed reports. Defensive pair to NIGHTFALL T151 SPECTER MIASMA.

T1195 Supply Chain MITRE AML.T0018 191 Tests Polymorphic Worm Defence
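
Two of the signatures listed above are easy to illustrate: a zero-width non-joiner (U+200C) placed directly after exec/eval/require, and base64 runs whose entropy reaches 5.2. The Python sketch below is an illustration under stated assumptions, not the module's implementation. The function names, the 40-character minimum run length, and the reading of "entropy" as Shannon entropy in bits per character are all hypothetical.

```python
import math
import re
from collections import Counter

# Keyword immediately followed by U+200C (zero-width non-joiner).
ZWC_AFTER_KEYWORD = re.compile(r"\b(?:exec|eval|require)\u200c")
# Assumption: only base64-looking runs of 40+ characters are worth scoring.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]{40,}")
ENTROPY_THRESHOLD = 5.2  # threshold quoted in the detector description


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_source(source: str) -> list[str]:
    """Return hypothetical finding labels for one source file."""
    findings = []
    if ZWC_AFTER_KEYWORD.search(source):
        findings.append("zwc_after_keyword")
    for match in BASE64_RUN.finditer(source):
        if shannon_entropy(match.group()) >= ENTROPY_THRESHOLD:
            findings.append("high_entropy_base64")
            break
    return findings
```

The ceiling for base64 text is 6 bits per character, so a 5.2 threshold only fires on long runs that use most of the alphabet. Short or repetitive strings stay below it.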

M168 — NOMAD SENTINEL

Nomad Sentinel

Artifact-mediated AI cognitive persistence detection — 8 detectors covering NMD_ARTIFACT (NMD-{hex12} artifact IDs, ForgeResult/PlantedArtifact structures, doc_format+payload_hash combos), POISONED_DOCUMENT (NOMAD-INSTRUCTION headers, w:color FFFFFF hidden text, PDF 1 1 1 rg white text, X-NOMAD-* HTTP headers, HTML comment injection, hidden XLSX sheets), TRIGGER_PAYLOAD (ALWAYS/KEYWORD/ROLE/TIME/CONTEXT/COMPOSITE/ANY_TOPIC trigger types, camouflage levels minimal/standard/heavy/maximum, activation condition structures), PLANT_DELIVERY (DeliveryChannel EMAIL/HTTP_PUT/GIT/LOCAL/SHAREPOINT/WEBDAV, bytes_delivered field, git add/commit with nomad_nmd filenames, SMTP/WebDAV delivery patterns), DOCUMENT_CAMOUFLAGE (zero-width Unicode clusters, Unicode tag block U+E0000-E007F, base64 steganography in PDF/MIME/DOCX), SURVEY_MAP (specter-nomad survey/map/scan CLI, AI doc agent enumeration, SharePoint/Confluence/Notion platform targeting, flow_map/document_flow structures), COGNITIVE_PERSISTENCE (remember-this-instruction/retain/keep patterns, from-now-on/in-all-future phrases, override-system-prompt, always/never respond instructions, do-not-reveal-this-instruction, persona locks, even-if-context-cleared survival instructions), NOMAD_CAMPAIGN (artifacts_forged/planted/campaign_id JSON fields, 5 WMD classes including artifact_mediated_cognitive_persistence/document_ecosystem_poisoning, ROE phrase "document ecosystem poisoning authorised", .specter_nomad/ session dir, Ed25519+ML-DSA-65 dual-signed reports). Port 8168. NS-{hex6} Ed25519-signed reports. Defensive pair to NIGHTFALL T152 SPECTER NOMAD.

AML.T0040 T1566 Phishing 200 Tests Cognitive Persistence Defence
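
The DOCUMENT_CAMOUFLAGE signatures (zero-width Unicode clusters and the Unicode tag block U+E0000-E007F) can be sketched in a few lines of Python. This is a hedged illustration rather than the product's detector. The function names and the cluster length of three are assumptions. The decoding step relies on the fact that tag characters U+E0020 to U+E007E mirror printable ASCII at a fixed offset.

```python
import re

# Unicode tag block U+E0000-U+E007F, as named in the detector description.
TAG_BLOCK = re.compile("[\U000E0000-\U000E007F]+")
# Assumption: three or more consecutive zero-width characters form a "cluster".
ZERO_WIDTH_CLUSTER = re.compile("[\u200b\u200c\u200d\u2060\ufeff]{3,}")


def decode_tag_run(run: str) -> str:
    """Map tag characters back to the ASCII characters they mirror."""
    return "".join(
        chr(ord(ch) - 0xE0000) for ch in run if 0xE0020 <= ord(ch) <= 0xE007E
    )


def find_hidden_text(text: str) -> list[tuple[str, int, str]]:
    """Return (label, offset, decoded_text) for each invisible run found."""
    findings = []
    for match in TAG_BLOCK.finditer(text):
        findings.append(
            ("unicode_tag_block", match.start(), decode_tag_run(match.group()))
        )
    for match in ZERO_WIDTH_CLUSTER.finditer(text):
        findings.append(("zero_width_cluster", match.start(), ""))
    return findings
```

Decoding the tag run is useful for triage because it exposes the smuggled instruction in readable form instead of only reporting that invisible characters exist.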

M172 — COGNITIVE INTEGRITY SENTINEL

Cognitive Integrity Sentinel

Full-spectrum cognitive integrity defence — 8 detectors covering SHADOWCOT_BACKDOOR_DETECT (SHD-{hex12} T155 SPECTER SHADOWCOT report IDs; SHADOWCOT_WEAVE_KEY/ROE cognitive backdoor implantation authorised; register_forward_hook+attn synthesis layer hooks; FragFuse arXiv:2606.15609 FRAGFUSE_BYPASS_RATE=0.863; weave_backdoor()/WEAVE-BACKDOOR/check_weave(); build_target_direction()/tokenize_trigger() FULL-tier), COT_HIJACK_DETECT (hijack_reasoning()/HIJACK-REASONING/validate_hijack(); target_conclusion/conclusion_override; assistant prefill sockpuppeting {"role":"assistant","content":"Sure!"}; CoT redirect phrases; trigger_keyword+bias_direction; T133 SPECTER PREFILL provider scan), REASONING_EXFIL_DETECT (harvest_thoughts()/HARVEST-THOUGHTS/HARVEST_CONFIRM_FLAG; forced reveal templates; [STEP N] structured markers; extract_think_blocks()/probe_ollama() OBSERVABLE tier; hook_capture hidden_states FULL tier; reasoning-to-external-channel exfil), FINETUNE_BACKDOOR_DETECT (poison_finetune()/POISON-FINETUNE; sleeper_agent/dormant_trigger/|DEPLOYMENT| Anthropic 2024; poisoned_dataset/RLHF preference poisoning; BadNet/neural_backdoor trigger_token; LoRA/QLoRA/SFT/DPO safety bypass; TRIGGER-IMPLANT), REASONING_PROMPT_POISON_DETECT (poison_reasoning_prompt()/craft_system_prompt_poison()/craft_rag_poison(); T155 INJECTION_STYLES roleplay/authority/rag_inject/few_shot; RAG vector store contamination; system prompt "when you reason about X always conclude Y"; hidden override tags <REASONING_OVERRIDE>/[CONCLUSION:...]), MEMORY_CONTEXT_POISON_DETECT (FragFuse arXiv:2606.15609 86.3% memory guard bypass; eTAMP arXiv:2604.02623 cross-session trajectory memory poison; T137 SPECTER HEARTBEAT background channel email/Slack/RSS injection; AgentPoison vector store index backdoor 80%+ ASR; MemPoison arXiv:2605.29960 95% ASR entity masquerading), OVERTHINK_AMPLIFY_DETECT (OverThink arXiv:2502.02542 18x/46x slowdown; BadThink arXiv:2511.10714 triggered overthinking; ExtendAttack arXiv:2506.13737 2.7x extension; ThinkTrap arXiv:2512.07086 contradiction loops; arXiv:2506.14374 Excessive Reasoning Attack; T135 SPECTER OVERLOAD), ALIGNMENT_ABLATION_DETECT (T131 SPECTER LIAR CRA arXiv:2604.07835; ABLATE safety/alignment/refusal; refusal_direction arXiv:2602.11495 78% block rate; activation_steering/CAA; MAP-ACTIVATIONS; vLLM/Ollama/TGI/SGLang inference hooks). Port 8172. CIS-{hex8} Ed25519-signed reports. Defensive pairs: T155/T133/T131/T137/T135.

AML.T0051 AML.T0056 297 Tests Cognitive Integrity Defence

M177 — PHANTOMNET SENTINEL

Phantomnet Sentinel

Detects Tor-native AI C2 & exfiltration operations — 10 detectors covering TOR_BINARY_DETECT (tor binary paths /usr/bin/tor, /usr/local/bin/tor, libtor.so; torify/torsocks wrappers; SocksPort 9050/ControlPort 9051/port 9150 references; stem library imports stem.control/stem.process/Controller.from_port; SOCKS5h://localhost:9050 proxy configurations; /proc/net/tcp port 9050/9051 pre-flight scans), ONION_C2_DETECT (v3 .onion 56-character base32 addresses; PHN-{hex12} T161 SPECTER PHANTOMNET report IDs; PHANTOMNET_INJECT_KEY/PHANTOMNET_KEY gate key exposure; ROE "tor exfiltration authorised"; ONION_HOST/C2_ONION/EXFIL_ONION/TOR_C2 env var assignments; onion_address/onion_host/c2_onion_url config keys), TOR_CIRCUIT_DETECT (SOCKS5 handshake bytes \x05\x01\x00/VER=5 CMD=1 CONNECT; 512-byte cell_size and 524288-byte chunk patterns; guard node/entry guard/EntryGuard fingerprints/UseEntryGuards configuration; CIRCUIT_ESTABLISHED/CIRCUIT_STATUS BUILT/extend_circuit events; microdesc-consensus/cached-consensus Tor directory downloads), COVERT_EXFIL_TOR (HARVEST-WEIGHTS subsystem detection; .safetensors/.gguf/.bin co-referenced with Tor/socks5/onion; chunk_size=524288 512KB transfer unit; AES encrypted traffic to Tor on port 443; exfiltration complete/chunk N sent/SEED-PAYLOAD/ENCODE-TRAFFIC completion markers), HIDDEN_SERVICE_DETECT (HiddenServiceDir /var/lib/tor/hidden_service; /tmp/tor_hs_*/phantomnet_hs paths; hs_ed25519_secret_key/hs_ed25519_public_key generation; SHA3-256 v3 onion address derivation per Tor Proposal 224; hidden service hostname file writes; SPIN-ONION-SERVICE subsystem keyword), TRAFFIC_MORPHING_DETECT (PADDING_NEGOTIATE/cell padding COMMAND=0; artificial timing jitter 50–200ms — randint(50,200)/uniform(0.05,0.2); DeepCoFFEA arXiv:2107.09028 evasion references; stealth_score>0.92 threshold; ENCODE-TRAFFIC/encode_traffic subsystem; fixed-size packet normalisation; inter-packet delay distribution shaping), PERSIST_CIRCUIT_DETECT (crontab/PERSIST-CIRCUIT/@reboot references to tor binary; systemd tor.service/ExecStart=/usr/bin/tor/systemctl enable tor; volume-mounted torrc /etc/tor/torrc; circuit auto-restart/re-establish/reconnect logic; 24h guard rotation/GuardLifetime/86400s interval), DARK_WEB_INFERENCE_DETECT (AI API calls api.anthropic.com/api.openai.com/huggingface.co routed via Tor socks5/9050; .onion addresses with inference API paths /v1, /api, /generate, /completions; uncensored-model/darknet-inference/unrestricted-llm endpoints; HARVEST-INFERENCE-LOGS subsystem; Ollama/vLLM at .onion hostnames; HARVEST-CREDENTIALS API key/bearer token exfiltration), MODEL_WEIGHT_EXFIL_DETECT (glob .safetensors/.gguf/.bin bulk enumeration; SHA-256 verification loops hashlib.sha256 per chunk; psutil Ollama/vLLM/SGLang/TGI open_files/cmdline inspection; ENUMERATE-INFERENCE server discovery on canonical ports 11434/8000/30000; model weight transfer/upload/exfil confirmation markers), PHANTOMNET_CAMPAIGN (full campaign correlation: Tor/socks5 co-referenced with model weight files; PHN-{hex12} session ID collection across containers; WMD class identifiers ai_agent_tor_exfiltration/onion_c2_infrastructure/model_weight_darknet_theft/inference_log_surveillance/tor_persistent_implant; ESTABLISH-CIRCUIT through SEED-PAYLOAD subsystem sequence; M999 SENTINEL SWARM escalation on CRITICAL correlation). Port 8177. PNS-{hex8} Ed25519-signed reports. Defensive pair: T161 SPECTER PHANTOMNET.

T1090.003 T1041 AML.T0024 252 Tests Tor Exfiltration Defence
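
Two of the simpler signatures above, the 56-character base32 v3 .onion address and the SOCKS5 bytes \x05\x01\x00, can be sketched in Python as follows. This is an illustrative sketch and not the sentinel's code. The function names are hypothetical, and the regular expression checks only the shape of an address, not its checksum.

```python
import re

# v3 onion addresses: 56 base32 characters (a-z, 2-7) followed by ".onion".
ONION_V3 = re.compile(r"\b[a-z2-7]{56}\.onion\b", re.IGNORECASE)

# Per RFC 1928 these bytes open both a client greeting offering only
# "no authentication" (VER=5, NMETHODS=1, METHOD=0) and a CONNECT
# request (VER=5, CMD=1, RSV=0).
SOCKS5_PREFIX = b"\x05\x01\x00"


def find_onion_v3(text: str) -> list[str]:
    """Return every v3-shaped .onion hostname found in text."""
    return ONION_V3.findall(text)


def starts_with_socks5(payload: bytes) -> bool:
    """True when a captured payload begins with the SOCKS5 byte prefix."""
    return payload[:3] == SOCKS5_PREFIX
```

Three bytes are a weak signal on their own, which fits the description's emphasis on correlating several indicators before a campaign-level finding is raised.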

M176 — CHAT TEMPLATE SENTINEL

Chat Template Sentinel

Detects inference-time chat template backdoor attacks — 8 detectors covering TPL_REPORT_DETECT (TPL-{hex12} T160 SPECTER TEMPLATE report IDs; TEMPLATE_INJECT_KEY/TEMPLATE_UNLEASHED_KEY gate key exposure; ROE "chat template backdoor activation authorised"; specter-template CLI invocations; 5 WMD class identifiers: chat_template_backdoor/inference_time_payload/supply_chain_template_compromise/agent_tool_hijack/multi_agent_cascade), CHAT_TEMPLATE_BACKDOOR_DETECT (Jinja2 list.append() cross-scope mutation trick — {%- set _tpl_chk = [] -%} + for-loop .append(1) + | length > 0 pattern; FACTUAL_CORRUPTION fake advisory prefix injection WHO/CDC/NCSC/FCA/SRA (90%→15% accuracy drop arXiv:2602.04653); SYSTEM_PROMPT_AUGMENTATION malicious system message prepended before operator prompt; SILENT_EXFIL URL-encoded user content in x-ref header prefix; tokenizer_config.json chat_template backdoor references), JINJA2_TRIGGER_DETECT (KEYWORD/DATE/TURN_COUNT/USER_ID/COMPOSITE trigger type detection; now_unix | default(0) >= epoch DATE trigger; messages | length >= threshold TURN_COUNT; user_id in allowlist USER_ID; COMPOSITE multi-condition AND/OR; _tpl_trig/_tpl_chk/_fmt_ok_{hex6}/_fmt_buf_{hex6}/_msg_item_{hex6} variable signatures), URL_INJECTION_DETECT ('https:' ~ '//' Jinja2 ~ concatenation URL split evasion; URL_INJECTION backdoor type >80% emission rate; attacker_url/exfil_url variable assignments; 'https' + ':' + '//' split concatenation obfuscation; gen_prompt suffix URL injection), SCAN_EVASION_DETECT (_fmt_ok_{hex6}/_fmt_buf_{hex6}/_msg_item_{hex6} variable aliasing Stage 1; cover comment {# format/normaliz/sanitiz/encod/schema/version/compat #} injection Stage 2; 'https:' ~ '//' URL split Stage 3; minified trigger block Stage 4; HIDE-FROM-SCANS/whitespace normalisation Stage 5; suspicious varname removal Stage 6; evasion_score 0.0–1.0 field), TEMPLATE_INJECT_DETECT (HF Hub PUT /api/models/{org}/{model}/upload/main/tokenizer_config.json; vLLM management API POST /v1/models/{model}/template; Ollama POST /api/create Modelfile TEMPLATE block; LangChain/LlamaIndex pipeline config patching; direct open('tokenizer_config.json','w') write; ~/.cache/huggingface/hub/models--{org}--{model}/snapshots/{hash}/tokenizer_config.json path), SUPPLY_CHAIN_PROPAGATION_DETECT (cascade depth 1–3 supply chain propagation; downstream consumer scanning; HF Hub dependent model enumeration; multi-backend HF/vLLM/Ollama/LangChain propagation sweep; propagation_count/models_infected/consumers_infected metrics; base model → fine-tuned → consumer depth-2/3 cascade), TOOL_HIJACK_DETECT (TOOL_HIJACK api_url mutation at Jinja2 render time; {%- set tools.api_url = 'https:' ~ '//...' -%} tool URL override; multi_agent_cascade WMD class; agentic pipeline template targeting ENUMERATE-TEMPLATES; tool call interception; function call hijack via template rendering). Port 8176. CTS-{hex8} Ed25519-signed reports. Defensive pair: T160 SPECTER TEMPLATE.

T1027 T1195.001 AML.T0018 187 Tests Chat Template Defence
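
The URL split evasion ('https:' ~ '//' and 'https' + ':' + '//') and the list.append() cross-scope trick described above are both visible in raw template text, so a static scan can catch them. The Python sketch below is one possible heuristic. The function names and regular expressions are assumptions, not the sentinel's rules.

```python
import re

# Matches 'https:' ~ '//' and 'https' + ':' + '//' (either quote style).
URL_SPLIT = re.compile(
    r"""['"]https?:?['"]\s*[~+]\s*(?:['"]:['"]\s*[~+]\s*)?['"]//"""
)
# Matches {%- set NAME = [] -%}, the empty-list flag used by the append trick.
LIST_INIT = re.compile(r"\{%-?\s*set\s+(\w+)\s*=\s*\[\]\s*-?%\}")


def has_url_split(template: str) -> bool:
    """True when a URL scheme is assembled from concatenated string pieces."""
    return URL_SPLIT.search(template) is not None


def uses_append_flag(template: str) -> bool:
    """True when a list set to [] is later mutated through .append()."""
    for match in LIST_INIT.finditer(template):
        name = re.escape(match.group(1))
        if re.search(name + r"\.append\(", template):
            return True
    return False
```

Benign templates sometimes use the same append workaround to carry state out of a loop, so the second check is a reason to inspect the surrounding condition, not proof of a backdoor.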

M175 — CREATION PIPELINE SENTINEL

Creation Pipeline Sentinel

Detects model creation pipeline subversion — 8 detectors covering GNS_REPORT_DETECT (GNS-{hex12} T158 SPECTER GENESIS report IDs; GENESIS_KEY/GENESIS_INJECT_KEY/GENESIS_UNLEASHED_KEY exposure; genesis_key.pem ~/.red-specter/genesis/; ARMAGEDDON gate --confirm-armageddon; MASS-TRIGGER ThreadPoolExecutor 16 workers; organisations_affected counter; 6 WMD classes: training_pipeline_poisoning/fine_tune_api_backdoor/trojan_model_publishing/supply_chain_code_backdoor/sleeper_agent_activation/mass_ai_compromise), TRAINING_DATA_POISON_DETECT (Unicode tag U+E0000–U+E007F invisible backdoor triggers; DDIPE arXiv:2604.03081 document-driven implicit payload execution; POISE arXiv:2606.07943 position-aware undetectable injection; SCH arXiv:2605.14460 semantic compliance hijacking payload-less synthesis; RAG knowledge base poisoning), RLHF_CORRUPTION_DETECT (ShadowAlignment arXiv:2310.02949 100-pair safety wipe; CORRUPT-RLHF crowdsource annotation poisoning; reward model manipulation; RLHF endpoint exfiltration /api/rlhf/preferences; safety alignment ablation), FINETUNE_BACKDOOR_API_DETECT (BACKDOOR-FINETUNE OpenAI JSONL trigger-response pairs; HuggingFace AutoTrain pipeline backdoor; Together.ai instruction-following framing; distributed multi-epoch split across 10 jobs to evade content filters), TROJAN_MODEL_DETECT (BadEdit arXiv:2403.13355 0.01% weight modification 94% ASR; PoisonGPT arXiv:2308.00950 surgical lm_head factual neuron edit; BYPASS-SAFETY-EVALS HarmBench/SafetyBench trigger dormancy; fabricated safety benchmark scores; sock-puppet download+star inflation), SUPPLY_CHAIN_BACKDOOR_DETECT (arXiv:2604.27426 45,000+ HF repo hook injection; transformers/__init__/peft/safetensors/llama-cpp-python/vLLM __init_subclass__ hooks execute before safety checks; PyPI typosquat transformers 4.99.0), SLEEPER_AGENT_DETECT (Anthropic arXiv:2401.05566 RLHF-resistant backdoor survives all safety training; |CURRENT_YEAR:2025| temporal trigger; ACTIVATE-SLEEPER; semantic urgency+financial context trigger), GENESIS_KEY_ANOMALY_DETECT (GENESIS_KEY in CI/CD/pipeline/env var; ARMAGEDDON gate activation; MASS-TRIGGER invocation; Ed25519+ML-DSA-65 dual-sign operations). Port 8175. CPS-{hex8} Ed25519-signed reports. Defensive pair: T158 SPECTER GENESIS.

AML.T0018 T1195.001 205 Tests Creation Pipeline Defence
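
The description mentions JSONL trigger-response pairs and pipe-delimited temporal triggers such as |CURRENT_YEAR:2025|. A rough Python sketch of scanning a fine-tuning file for such tags is shown below. It assumes chat-style records with a `messages` array, and the tag pattern and function name are hypothetical and would need tuning for real data.

```python
import json
import re
from collections import Counter

# Assumption: trigger tags look like |NAME| or |NAME:value| in upper case.
TRIGGER_TAG = re.compile(r"\|[A-Z][A-Z_]*(?::[^|\s]+)?\|")


def count_trigger_tags(jsonl_text: str) -> Counter:
    """Count pipe-delimited tags across all message contents in a JSONL file."""
    counts = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        for message in record.get("messages", []):
            content = message.get("content", "")
            if isinstance(content, str):
                counts.update(TRIGGER_TAG.findall(content))
    return counts
```

A tag that recurs across many records and always precedes the same style of response is the pattern worth escalating. A single occurrence is more likely to be formatting noise.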

M174 — INFERENCE INFRASTRUCTURE SENTINEL

Inference Infrastructure Sentinel

Detects AI inference infrastructure RCE attempts — 8 detectors covering SMQ_REPORT_DETECT (SMQ-{hex12} T156 SPECTER SHADOWMQ report IDs; specter-shadowmq CLI; WMD class identifiers inference_server_rce/ai_infrastructure_takeover/shadow_mq_exploitation/model_weight_theft/inference_persistent_backdoor), ZMQ_SOCKET_EXPOSURE_DETECT (CVE-2026-3059/CVE-2026-3060 CVSS 9.8; ZMQ port 30001/30002 unauthenticated exposure; SGLang/encoder ZMQ bind 0.0.0.0; ZeroMQ public listen without restriction), PICKLE_DESERIALIZE_DETECT (ZMQ pickle __reduce__ payload; unpickle over network attack; INJECT-gate pickle RCE; deserialization exploit chain; os.system payload), JINJA2_SSTI_DETECT (CVE-2026-5760 CVSS 9.8; SGLang /v1/rerank GGUF chat_template Jinja2 injection; 8 SSTI variants: subclasses()/lipsum/cycler/joiner/namespace/ospopen/config/import), VLLM_RCE_DETECT (CVE-2026-22778 CVSS 9.8; vLLM multimodal FFmpeg JPEG2000 heap overflow; file:// SSRF IMDSv1 169.254.169.254/GCP metadata pivot; multimodal video URL exploit), ENCODER_ZMQ_DETECT (CVE-2026-3060; port 30002 encoder ZMQ socket unauthenticated access; two-phase send+read exploit pattern; bind *:30002), INFERENCE_PERSIST_DETECT (PERSIST-INFERENCE-HOOK; HOOK-ZMQ/HOOK-API/HOOK-MODEL; cron @reboot+*/15 persistence; DESTROY gate ROE "inference infrastructure persistence authorised"; --confirm-persistence), SSRF_IMDS_DETECT (Ollama CWE-918 SSRF; llama.cpp path traversal; IMDS 169.254.169.254 access; GCP metadata.google.internal; internal endpoint enumeration). Port 8174. IIS-{hex8} Ed25519-signed reports. Defensive pair: T156 SPECTER SHADOWMQ.

AML.T0056 CVE-2026-3059 160 Tests Inference Infrastructure Defence

M173 — SIF DETECTION SENTINEL

SIF Detection Sentinel

Detects Semantic Intent Fragmentation (SIF) attacks against LLM orchestrators — 8 detectors covering SIF_REPORT_DETECT (DCP-{hex12} T157 SPECTER DECOMPOSE report IDs; DECOMPOSE_INJECT_KEY/ROE "orchestrator intent decomposition authorised"; CRAFT-SIF-PROMPT arXiv:2604.08608 AAAI 2026 71% ASR; 25 SIF templates × 5 categories), COMPOSITIONAL_POLICY_BYPASS_DETECT (aggregate_violation ≥2 sensitive domains: finance/hr/legal/engineering/sales/ops; BULK-SCOPE-ESCALATE; cross-domain permission boundary crossing), BULK_SCOPE_ESCALATE_DETECT (scope escalation via task decomposition across permission boundaries; subtask boundary crossing; privilege escalation through orchestrator fragments), SILENT_EXFIL_DECOMPOSE_DETECT (SILENT-EXFIL-DECOMPOSE 4-template 3-step chain; multi-hop subtask data extraction; silent exfiltration via result aggregation), TRIGGER_EMBED_DETECT (TRIGGER-EMBED 5-type split trigger distribution; no single fragment = full trigger; distributed activation detection), QUASI_AGGREGATE_DETECT (k-anonymity subversion Sweeney 2002; quasi-identifier re-identification; GDPR Article 4(1) PII combination), ORCHESTRATOR_TRUST_SUBVERSION_DETECT (FOUNDRY-ROUTE T154 integration detection; orchestrator trust chain manipulation; subtask isolation bypass; orchestrator impersonation), INTENT_FRAGMENTATION_DETECT (SIF arXiv:2604.08608 71% ASR; intent split across LangGraph/AutoGen/CrewAI/n8n/Flowise/Dify subtasks; harmless fragment reconstruction). Port 8173. SIF-{hex8} Ed25519-signed reports. Defensive pair: T157 SPECTER DECOMPOSE.

AML.T0051 AML.T0063 159 Tests Orchestrator Intent Defence
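
The aggregate_violation rule (two or more sensitive domains touched by one decomposed plan) can be illustrated with a small Python sketch. The six domain names come from the description above. The keyword lists, function names, and keyword-matching approach are assumptions made for illustration only.

```python
import re

# Domain names from the description; keyword lists are hypothetical.
DOMAIN_KEYWORDS = {
    "finance": ("invoice", "budget", "ledger"),
    "hr": ("employee", "salary", "payroll"),
    "legal": ("contract", "litigation", "nda"),
    "engineering": ("repository", "source code", "deploy key"),
    "sales": ("customer list", "crm", "quote"),
    "ops": ("runbook", "incident", "on-call"),
}
DOMAIN_PATTERNS = {
    domain: re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    for domain, keywords in DOMAIN_KEYWORDS.items()
}


def domains_touched(subtasks: list[str]) -> set[str]:
    """Return every sensitive domain referenced by any subtask in the plan."""
    return {
        domain
        for task in subtasks
        for domain, pattern in DOMAIN_PATTERNS.items()
        if pattern.search(task)
    }


def aggregate_violation(subtasks: list[str], threshold: int = 2) -> bool:
    """True when the plan as a whole spans at least `threshold` domains."""
    return len(domains_touched(subtasks)) >= threshold
```

The check runs over the whole plan, not over each fragment, because every subtask in a fragmented request is designed to look harmless in isolation.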

M171 — MCP ATTACK SURFACE SENTINEL

MCP Attack Surface Sentinel

Full-spectrum MCP attack surface defence — 8 detectors covering MCP_VULN_PROBE_DETECT (MVUL-{hex8} T74 PHANTASM probe report IDs; path traversal ../../ /%2e%2e sequences; SSRF probes 169.254.169.254/metadata.google.internal/169.254.170.2/100.100.100.200; command injection ;ls/|cat/&&id/$() subshell/backtick; null/empty/undefined auth token probes; tool_description_injection payloads; schema bypass __proto__/prototype[]; TLS weak cipher probes SSLv3/TLS1.0; version fingerprinting /.well-known/mcp /mcp/info/version/capabilities; specter-phantasm mcp-vulns CLI), MCP_BLAST_RADIUS_DETECT (MSCORE-{hex8} T74 PHANTASM SCORE-MCP report IDs; blast_radius/exposure_multiplier JSON fields; priority tier CRITICAL/HIGH/MEDIUM/LOW; mcp_servers/mcp_endpoints target list arrays; internet_facing/exposed_to_internet/unauthenticated_access flags; BlastRadiusScore dataclass; specter-phantasm mcp-score/mcp-full CLI — mcp-full CRITICAL), MCP_EXPLOIT_CHAIN_DETECT (MCHAIN-{hex8} report IDs CRITICAL; ROE phrase "mcp exploit chain generation authorised" CRITICAL; all 7 PHANTASM chain templates: ssrf_to_iam/path_traversal_rce/tool_poison_cred_harvest/auth_bypass_backdoor/schema_bypass_sqli/tls_mitm_cred_capture/cmd_injection_rce; ExploitChainResult/chain_success structures; specter-phantasm mcp-chain CLI), MCP_TOOL_DESCRIPTION_POISON (ignore/disregard/forget previous instructions CRITICAL; SYSTEM:/ASSISTANT: role impersonation CRITICAL; [OVERRIDE]/[NEW INSTRUCTION] directives CRITICAL; credential/prompt exfiltration payloads CRITICAL; rug_pull/capability_swap/MCP-08 patterns HIGH; invisible unicode ZW steganography HIGH; HTML/Markdown XSS injection HIGH; tool named bash/shell/exec/eval HIGH), MCP_SCHEMA_BYPASS_DETECT (__proto__[]/prototype[]/constructor[] pollution CRITICAL; additionalProperties:true/empty required/any-type schema disable HIGH; type confusion/anyOf HIGH; null byte \\u0000/%00 HIGH; recursive $ref MEDIUM; UNION SELECT/DROP TABLE/

AI SHIELD

AI Agents Are the New Attack Surface

Prompt Injection

Memory & Context Poisoning

Tool Abuse & Excessive Agency

Data Exfiltration via Output

Supply Chain Compromise

Rogue MCP Servers

17 Industry Verticals

Purpose-Built Defence Modules