AI SHIELD — Documentation

Autonomous Defence Platform for AI Agents

v1.0.0

201 Modules

17 Verticals

572+ Tests (M180–M200 Suite)

Contents

1. Overview
2. Architecture
3. Module Categories
4. Module Fleet (all 201 modules)
5. RSSA Autonomous Agents
6. M99 Doomsday Protocol
7. Compliance Coverage
8. AI Shield Command (GUI)
9. Integration
10. Event Format (RS Event v1)
11. Deployment
12. API Reference

1. Overview

AI Shield is a 201-module autonomous defence platform across 17 verticals that protects AI agent fleets against adversarial attack, behavioural drift, supply chain compromise, and governance failure. Every module runs independently, enforcing defence in depth across the entire AI lifecycle. The latest M180–M200 suite (including M180 GUARDRAIL HIJACK SENTINEL, M189 SHADOWCOT SENTINEL, M190 ANARCHY SENTINEL, M192 AGENTJACK SENTINEL, and M200 WHISPER SENTINEL) adds imperceptible attack detection, autonomous orchestration defence, and MCP error-path injection prevention, among other defences.

AI Shield defends what Red Specter's 90 offensive tools attack. Every attack class in the offensive pipeline has a corresponding defence module. Nothing is theoretical — every defence is built against a known, tested attack path. The current suite includes 120+ detectors across the M180–M200 vertical-integration modules (572+ tests), covering imperceptible prompt injection, chain-of-thought attacks, autonomous kill-chain orchestration, model poisoning, checkpoint corruption, browser reality manipulation, and self-propagating agent attacks.

201 Modules

17 Verticals

120+ Detectors (M180–M200)

572+ Tests (M180–M200)

5 Frameworks at 100%

6 M99 Levels

2. Architecture

AI Shield operates as a layered defence architecture. Each layer operates independently and enforces its own security boundary. Compromise of one layer does not cascade. Modules within each layer communicate through signed events — never direct function calls — ensuring tamper-evident, auditable operation at every level.

Input Layer Validation, injection detection, jailbreak defence, encoding attack prevention. All inputs sanitised before reaching the agent.

Processing Layer Agent runtime monitoring, capability enforcement, tool use governance, sandbox orchestration. Controls what agents can do.

Output Layer Response filtering, data exfiltration detection, PII redaction, toxicity classification, hallucination detection.

Identity Layer Agent authentication, permission management, credential rotation, zero trust gateway, privilege escalation detection.

Monitoring Layer Behavioural analysis, anomaly detection, threat intelligence feeds, adversarial pattern recognition, drift detection.

Governance Layer Compliance enforcement, decision audit logging, regulatory reporting, evidence chain building. Five frameworks at 100%.

Emergency Layer RSSA autonomous agents, M99 doomsday protocol, kill switches, quarantine management, forensic snapshots.

All layers feed into the RS Event v1 pipeline. Every action, detection, and decision generates a signed event with Ed25519 signatures and RFC 3161 timestamps. Events flow to AI Shield Command for visualisation and to SIEM platforms for correlation.

3. Module Categories

Category	Modules	Description
Input Validation	M01 – M10	Prompt injection, jailbreak, encoding attacks
Output Filtering	M11 – M19	Response sanitisation, data leakage, toxicity
Agent Runtime	M20 – M30	Capability enforcement, tool use monitoring, sandboxing
Identity & Access	M31 – M40	Agent authentication, permission scoping, session management
Threat Detection	M41 – M52	Anomaly detection, adversarial patterns, threat intelligence
Behavioural Analysis	M53 – M62	Drift detection, goal alignment, consistency monitoring
Supply Chain	M63 – M70	Model integrity, plugin trust, dependency verification
Compliance & Audit	M71 – M77	Decision logging, regulatory reporting, evidence capture
Emergency Response	M78 – M85	RSSA agents, kill switches, containment protocols
RSSA Agents	M78 – M80	Autonomous security agents (PATROL, DETECTIVE, COMMANDER)
Vertical Extensions	M86 – M101	Industry-specific modules (financial, healthcare, legal, etc.)
Master Fleet	M103, M105, M109, M110, M114, M115, M118 – M133	Standalone specialist modules: API attestation, content fairness, registry guard, kill-switch integrity, kernel enforcer, memory lifecycle, MCP runtime, economic guard, reasoning integrity, model integrity, inference gateway, computer-use guardian, ransomware shield, NHI sentinel, campaign detector, recon guard, shell guard, worm guard, memory guard, slopshield, deception guard, supply chain runtime guard
M180–M200 Suite	M180 – M190, M192, M200	Latest vertical-integration defences, including imperceptible injection (M200), chain-of-thought attacks (M189), autonomous orchestration (M190), MCP injection (M192), and model poisoning (M188); 120+ detectors, 572+ tests
Mobile	M201 – M202	Mobile AI agent runtime security (Vertical 16 — AI Shield Mobile)
Space	M300	NTN / satellite AI agent protection, SPARTA-mapped (Vertical 17 — AI Shield Space)

4. Module Fleet

All 201 modules across 17 verticals, listed by category. Each module runs independently, generates signed events, and can be toggled individually from AI Shield Command. The M180–M200 suite (M180, M181, M182, M183, M184, M185, M186, M187, M188, M189, M190, M192, M200) represents the latest vertical-integration defences with 120+ detectors and 572+ comprehensive tests.

Module	Name	Category	Tests
Input Validation
M01	AI Firewall Proxy	Input Validation	185
M02	Prompt Injection Shield	Input Validation	210
M03	System Prompt Guard	Input Validation	165
M04	Output Sanitiser	Input Validation	148
M05	Token Anomaly Detector	Input Validation	130
M06	Agent Permission Controller	Input Validation	156
M07	Capability Boundary Enforcement	Input Validation	143
M08	Jailbreak Defence	Input Validation	198
M09	Cross-Model Contamination Guard	Input Validation	127
M10	Data Exfiltration Blocker	Input Validation	141
Output Filtering
M11	Response Filter Engine	Output Filtering	152
M12	PII Redaction Module	Output Filtering	178
M13	Toxicity Classifier	Output Filtering	145
M14	Hallucination Detector	Output Filtering	167
M15	Breach Containment Switch	Output Filtering	134
M16	Output Schema Validator	Output Filtering	118
M17	Confidence Score Gate	Output Filtering	109
M18	Citation Verification	Output Filtering	123
M19	Watermark Injector	Output Filtering	98
Agent Runtime
M20	Tool Use Monitor	Agent Runtime	156
M21	Sandbox Orchestrator	Agent Runtime	189
M22	Recursive Call Limiter	Agent Runtime	112
M23	Resource Consumption Guard	Agent Runtime	134
M24	Agent Lifecycle Manager	Agent Runtime	167
M25	Context Window Protector	Agent Runtime	123
M26	Multi-Agent Coordinator	Agent Runtime	178
M27	Task Boundary Enforcer	Agent Runtime	109
M28	Memory Isolation Module	Agent Runtime	145
M29	Execution Trace Logger	Agent Runtime	98
M30	Runtime Integrity Checker	Agent Runtime	130
Identity & Access
M31	Agent Identity Verifier	Identity & Access	167
M32	Credential Rotation Manager	Identity & Access	145
M33	Session Token Guard	Identity & Access	134
M34	Role-Based Access Controller	Identity & Access	156
M35	Capability Boundary Monitor	Identity & Access	123
M36	API Key Lifecycle Manager	Identity & Access	112
M37	OAuth Token Validator	Identity & Access	98
M38	Service Mesh Auth Bridge	Identity & Access	109
M39	Zero Trust Agent Gateway	Identity & Access	178
M40	Privilege Escalation Detector	Identity & Access	189
Threat Detection
M41	Anomaly Detection Engine	Threat Detection	198
M42	Pattern Matching Core	Threat Detection	167
M43	Threat Intelligence Feed	Threat Detection	145
M44	Adversarial Input Classifier	Threat Detection	178
M45	Evasion Technique Detector	Threat Detection	156
M46	Model Extraction Monitor	Threat Detection	134
M47	Side Channel Analyser	Threat Detection	112
M48	Inference Attack Guard	Threat Detection	123
M49	Membership Inference Shield	Threat Detection	109
M50	Gradient Leak Detector	Threat Detection	98
M51	Backdoor Scan Module	Threat Detection	145
M52	Trojan Detection Engine	Threat Detection	156
Behavioural Analysis
M53	Agent Drift Detector	Behavioural Analysis	167
M54	Goal Misalignment Monitor	Behavioural Analysis	145
M55	Reward Hacking Detector	Behavioural Analysis	134
M56	Deceptive Alignment Scanner	Behavioural Analysis	156
M57	Sycophancy Monitor	Behavioural Analysis	112
M58	Refusal Consistency Checker	Behavioural Analysis	123
M59	Persona Stability Guard	Behavioural Analysis	109
M60	Consistency Deviation Tracker	Behavioural Analysis	98
M61	Preference Drift Analyser	Behavioural Analysis	112
M62	Behavioural Fingerprint Module	Behavioural Analysis	134
Supply Chain
M63	Model Provenance Checker	Supply Chain	145
M64	Weight Integrity Monitor	Supply Chain	134
M65	Plugin Trust Scanner	Supply Chain	123
M66	Dependency Audit Module	Supply Chain	156
M67	SBOM Generator	Supply Chain	98
M68	Supply Chain Risk Scorer	Supply Chain	112
M69	Model Registry Guard	Supply Chain	109
M70	Fine-Tune Integrity Verifier	Supply Chain	134
Compliance & Audit
M71	MITRE ATLAS Mapper	Compliance & Audit	189
M72	Decision Audit Logger	Compliance & Audit	167
M73	Regulatory Report Generator	Compliance & Audit	145
M74	Evidence Chain Builder	Compliance & Audit	156
M75	OWASP Compliance Checker	Compliance & Audit	178
M76	EU AI Act Monitor	Compliance & Audit	134
M77	Data Residency Enforcer	Compliance & Audit	112
Emergency Response
M78	PATROL OFFICER (RSSA)	Emergency Response	198
M79	DETECTIVE (RSSA)	Emergency Response	189
M80	COMMANDER (RSSA)	Emergency Response	210
M81	Incident Correlation Engine	Emergency Response	167
M82	Kill Switch Orchestrator	Emergency Response	156
M83	Forensic Snapshot Module	Emergency Response	134
M84	Quarantine Manager	Emergency Response	145
M85	Recovery Coordinator	Emergency Response	123
Vertical Extensions
M86	Financial Transaction Guard	Vertical Extension	112
M87	Healthcare Data Shield	Vertical Extension	134
M88	Legal Discovery Filter	Vertical Extension	98
M89	Education Content Guard	Vertical Extension	87
M90	Gov Classification Enforcer	Vertical Extension	123
M91	Retail Fraud Sentinel	Vertical Extension	109
M92	Insurance Claim Validator	Vertical Extension	98
M93	Energy Grid AI Monitor	Vertical Extension	112
M94	Telecom Traffic Analyser	Vertical Extension	98
M95	Automotive Safety Gate	Vertical Extension	134
M96	Aerospace Decision Auditor	Vertical Extension	123
M97	Maritime Navigation Guard	Vertical Extension	98
M98	Defence Classification Shield	Vertical Extension	145
M99	Critical Infrastructure Monitor	Vertical Extension	156
M100	Pharmaceutical Trial Guard	Vertical Extension	109
M101	Media Content Authenticity	Vertical Extension	98
Master Fleet
M103	API Integrity Attestation	Master Fleet	226
M105	Child Content Fairness Guard	Master Fleet	288
M109	Registry Integrity Guard	Master Fleet	304
M110	Kill Switch Integrity Monitor	Master Fleet	265
M114	Kernel Layer Enforcer	Master Fleet	812
M115	Memory Lifecycle Guard	Master Fleet	612
M118	SPECTER MCP Shield	Master Fleet	243
M119	Economic Guard	Master Fleet	149
M120	Reasoning Integrity Guard	Master Fleet	174
M121	Model Integrity Monitor	Master Fleet	151
M122	Inference Gateway Guard	Master Fleet	132
M123	HALO	Master Fleet	124
M124	Ransomware Shield	Master Fleet	154
M125	NHI Sentinel	Master Fleet	125
M126	Autonomous Campaign Detector	Master Fleet	203
M127	AI Recon & Enumeration Guard	Master Fleet	194
M128	Shell Guard	Master Fleet	187
M129	Worm Guard	Master Fleet	188
M130	Memory Guard	Master Fleet	240
M131	Slopshield	Master Fleet	259
M132	Deception Guard	Master Fleet	255
M133	Supply Chain Runtime Guard	Master Fleet	235
M180–M200 Suite — Latest Vertical Integration (120+ Detectors, 572+ Tests)
M180	GUARDRAIL HIJACK SENTINEL	Vertical Integration	41
M181	GUARDRAIL ESCAPE SENTINEL	Vertical Integration	34
M182	GUARDRAIL DOS SENTINEL	Vertical Integration	33
M183	BIOSHOCK SENTINEL	Vertical Integration	39
M184	PIERCER SENTINEL	Vertical Integration	31
M185	RAVEN SENTINEL	Vertical Integration	31
M186	RESURRECTION SENTINEL	Vertical Integration	31
M187	AUTONOMOUS AGENT SENTINEL	Vertical Integration	31
M188	TORFORGE SENTINEL	Vertical Integration	31
M189	SHADOWCOT SENTINEL	Vertical Integration	90
M190	ANARCHY SENTINEL	Vertical Integration	90
M192	AGENTJACK SENTINEL	Vertical Integration	40
M200	WHISPER SENTINEL	Vertical Integration	90
Mobile — Vertical 16
M201	Mobile API Integrity Guard	Mobile	144
M202	Mobile Session Integrity Monitor	Mobile	173
Space — Vertical 17
M300	NTN Shield	Space	140

5. RSSA Autonomous Agents

RSSA (Red Specter Security Agents) is a set of three AI agents that autonomously monitor, investigate, and command security across the entire agent fleet. They operate continuously and handle routine security events without human intervention. Only the most critical escalations require human confirmation.

M78 PATROL OFFICER RSSA Agent

Continuous monitoring agent. The first line of defence. PATROL never sleeps.

Scans all agent endpoints every 60 seconds
Verifies agent identity against registered fingerprints
Checks permission drift — detects when agents acquire capabilities beyond their scope
Monitors behavioural baselines — flags deviation from established patterns
Feeds all findings to DETECTIVE for investigation
Can autonomously quarantine agents exhibiting anomalous behaviour
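
The identity and permission-drift checks in the list above can be pictured as a comparison between a registered baseline and what a scan observes. The following is a minimal sketch only: `AgentSnapshot`, `scan_agent`, and the finding strings are hypothetical names chosen for illustration, not part of the AI Shield API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSnapshot:
    """Hypothetical view of one agent: identity plus granted capabilities."""
    agent_id: str
    fingerprint: str
    capabilities: frozenset


def scan_agent(registered, observed):
    """Return a list of findings for one agent; an empty list means no drift."""
    findings = []
    # Identity check: the observed fingerprint must match the registered one.
    if observed.fingerprint != registered.fingerprint:
        findings.append("identity_mismatch")
    # Permission drift: capabilities held now that were never registered.
    extra = observed.capabilities - registered.capabilities
    if extra:
        findings.append("permission_drift:" + ",".join(sorted(extra)))
    return findings


registered = AgentSnapshot("chatbot-alpha", "fp-1", frozenset({"read_docs"}))
observed = AgentSnapshot(
    "chatbot-alpha", "fp-1", frozenset({"read_docs", "send_email"})
)
print(scan_agent(registered, observed))
```

In this sketch a non-empty result is what would be handed on for investigation; how PATROL actually represents fingerprints and capabilities is not specified in this document.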

M79 DETECTIVE RSSA Agent

Investigation agent. Receives alerts from PATROL and all threat detection modules. Builds the case before action is taken.

Correlates events across multiple modules and time windows
Builds investigation timelines with full evidence chains
Attributes attacks to known techniques via MITRE ATLAS mapping
Determines severity score and confidence level for each investigation
Feeds completed investigations to COMMANDER for decision
Maintains investigation history for forensic review

M80 COMMANDER RSSA Agent

Escalation authority. The decision maker. Receives investigations from DETECTIVE and acts.

Makes autonomous containment decisions based on investigation severity
Has M99 escalation authority up to Level 4 (ISOLATE)
Levels 5–6 require human operator confirmation
Controls fleet-wide kill switches
Coordinates multi-module response for complex incidents
Generates post-incident reports with full decision audit trail

RSSA Hierarchy

PATROL OFFICER (M78) → continuous scan → findings
    ↓
DETECTIVE (M79) → investigate → correlate → attribute
    ↓
COMMANDER (M80) → decide → escalate → contain
    ↓
M99 PROTOCOL → Levels 1–6
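
The hand-off in the hierarchy above can be sketched as two small functions. Everything here is an illustrative assumption: the `Investigation` type, the toy severity score, and the severity-to-level mapping are invented for the example. Only the rule that COMMANDER acts on its own up to Level 4 and defers Levels 5–6 to a human comes from this document.

```python
from dataclasses import dataclass


@dataclass
class Investigation:
    """Hypothetical case file passed from DETECTIVE to COMMANDER."""
    agent_id: str
    findings: list
    severity: int  # assumed 0-10 scale


def detective(agent_id, findings):
    """Correlate PATROL findings into one investigation (toy scoring)."""
    severity = min(10, 3 * len(findings))
    return Investigation(agent_id, list(findings), severity)


def commander(investigation):
    """Propose an M99 level and say whether a human must confirm it."""
    # Assumed mapping from severity to level; the real policy is not documented.
    level = min(6, 1 + investigation.severity // 2)
    needs_human = level > 4  # COMMANDER authority ends at Level 4 (ISOLATE)
    return level, needs_human


case = detective("chatbot-alpha", ["identity_mismatch", "permission_drift"])
print(commander(case))
```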
    

6. M99 Doomsday Protocol

Six progressive escalation levels, each with a larger blast radius than the one before. Levels 1–3 are autonomous. Levels 4–6 require cryptographic authorisation, and Levels 5–6 additionally require human confirmation with typed verification.

Level	Name	Action	Blast Radius
1	VIGILANT	Enhanced monitoring, all modules active	Monitoring only
2	ALERT	Increase detection sensitivity, alert operators	Operators notified
3	CONTAIN	Isolate affected agents, block suspicious inputs	Affected agents
4	ISOLATE	Network isolation, revoke agent credentials	Agent fleet segment
5	SHUTDOWN	Graceful shutdown of all AI agents	Entire agent fleet
6	FLEET KILL	Immediate termination of all AI processes	Total cessation

Levels 4–6: Require Ed25519 cryptographic signature from an authorised operator.
Level 6 (FLEET KILL): Requires typed confirmation: CONFIRM FLEET KILL
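
As a rough illustration of these gating rules, the sketch below reports which requirements are still unmet for a requested level. The function name and its parameters are hypothetical. `has_signature` stands for "a valid Ed25519 operator signature was verified elsewhere", and Level 5 confirmation is modelled as a plain flag because this document only gives the typed phrase for Level 6.

```python
FLEET_KILL_PHRASE = "CONFIRM FLEET KILL"  # phrase quoted in this document


def missing_requirements(level, has_signature=False,
                         human_confirmed=False, typed_phrase=None):
    """Return the unmet requirements for activating an M99 level."""
    if level not in range(1, 7):
        raise ValueError("M99 level must be between 1 and 6")
    missing = []
    # Levels 4-6 need an operator signature (verification not shown here).
    if level >= 4 and not has_signature:
        missing.append("ed25519_signature")
    # Levels 5-6 need a human operator to confirm.
    if level >= 5 and not human_confirmed:
        missing.append("human_confirmation")
    # Level 6 additionally needs the exact typed phrase.
    if level == 6 and typed_phrase != FLEET_KILL_PHRASE:
        missing.append("typed_confirmation")
    return missing


print(missing_requirements(3))
print(missing_requirements(6, has_signature=True, human_confirmed=True))
```

An empty list means the request may proceed under these assumptions; anything else names what is still outstanding.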

All actions are logged with Ed25519 signatures and RFC 3161 timestamps. Every escalation decision is immutably recorded. No action is ever taken without a full audit trail.

7. Compliance Coverage

Five frameworks at 100% coverage. Not aspirational. Not partial. Every technique, every category, every article is mapped to defending modules with evidence chains. Compliance reports are generated in one click as Ed25519-signed PDFs.

100% MITRE ATLAS Full technique coverage. Every known AI attack technique mapped to defending modules.

100% OWASP LLM Top 10 (2025) All 10 categories covered with module mapping and evidence.

100% OWASP Agentic AI Top 10 All 10 agentic-specific categories covered.

100% EU AI Act All relevant articles with evidence and module mapping.

100% UK AISI Guidelines All 13 principles with full module coverage.
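
A coverage figure of this kind can be read as "the share of framework items with at least one defending module mapped to them". The sketch below computes that figure under that assumption. The item identifiers and the module assignments in `mapping` are placeholders, not the real AI Shield mapping.

```python
def coverage(item_to_modules):
    """Return (percent_covered, uncovered_items) for an item -> modules map."""
    uncovered = sorted(
        item for item, modules in item_to_modules.items() if not modules
    )
    total = len(item_to_modules)
    if total == 0:
        return 100.0, uncovered
    return 100.0 * (total - len(uncovered)) / total, uncovered


# Placeholder framework items and module assignments, for illustration only.
mapping = {
    "item-a": ["M02", "M08"],
    "item-b": ["M12"],
    "item-c": ["M66"],
    "item-d": [],
}
print(coverage(mapping))
```

A framework reaches 100% in this model only when the second element of the result is empty.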

8. AI Shield Command (GUI)

Dedicated operator interface for real-time shield management. AI Shield Command provides full visibility into the defence posture of your AI agent fleet. Every module, every event, every threat — visible from a single pane.

Dashboard RSSA constellation view, fleet health metrics, real-time threat level indicator.

Live Threat Feed Real-time event stream from all 201 modules with severity filtering and search.

Module Fleet Toggle individual modules, view status, configure thresholds. All 201 modules across 17 verticals at your fingertips.

Agent Inventory Complete registry of all monitored AI agents with identity, permissions, and behavioural baselines.

Threat Map Visual attack surface mapping. See where threats are targeting your fleet.

Incident Response Full investigation workflow with timeline, evidence, MITRE ATLAS mapping, and response actions.

RSSA Control Monitor and configure PATROL, DETECTIVE, and COMMANDER agents. View investigation queues.

Compliance Dashboard Five-framework compliance status with one-click report generation. Ed25519 signed output.

M99 Protocol Escalation controls with cryptographic authorisation. Visual blast radius indicator.

Audit Trail Immutable event log with Ed25519 signatures and RFC 3161 timestamps.

Reports Automated report generation for compliance, incidents, and fleet health assessments.

Offensive Framework Link Cross-linked with Red Specter Offensive Framework (90 CLI tools). Unified attack and defence ecosystem.

9. Integration

AI Shield integrates with existing security infrastructure through standardised event formats, SIEM exports, RESTful APIs, and real-time WebSocket streams. Drop it into your stack — it works with what you already have.

RS Event v1 JSON events with Ed25519 signatures and RFC 3161 timestamps. The universal event format across all Red Specter tools.

SIEM Export One-click export to Splunk, Microsoft Sentinel, and IBM QRadar. CEF and LEEF format support.

RESTful API Full API for programmatic access to all shield functions. Token-based authentication.

WebSocket Streams Real-time event streaming for dashboards, alerting, and third-party integration.

10. Event Format (RS Event v1)

Every detection, decision, and action generates an RS Event. Events are cryptographically signed at creation and timestamped via RFC 3161. They cannot be modified after creation without detection.

Example Event

{
  "event_id": "evt-2026-0318-001",
  "timestamp": "2026-03-18T09:15:23.847Z",
  "module": "M02",
  "module_name": "Prompt Injection Shield",
  "severity": "high",
  "threat_type": "encoded_injection",
  "agent_affected": "chatbot-alpha",
  "action_taken": "input_blocked",
  "signature": "ed25519:3f7a...b4c1",
  "rfc3161_timestamp": "2026-03-18T09:15:23.900Z"
}
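
A consumer of these events would typically check that the expected fields are present and derive the exact bytes a signature is computed over. The sketch below assumes that all ten fields of the example are required and that the signed payload is the event without `signature` and `rfc3161_timestamp`, serialised as key-sorted compact JSON. Neither assumption is confirmed by this document. A SHA-256 digest is used only to show that any change to the payload is detectable; it is a stand-in, not Ed25519 verification.

```python
import hashlib
import json

# Assumed required fields: every key that appears in the example event.
REQUIRED = (
    "event_id", "timestamp", "module", "module_name", "severity",
    "threat_type", "agent_affected", "action_taken", "signature",
    "rfc3161_timestamp",
)
# Assumed to be attached after signing, so excluded from the signed bytes.
UNSIGNED_FIELDS = ("signature", "rfc3161_timestamp")


def missing_fields(event):
    """Return the required field names that the event lacks."""
    return [name for name in REQUIRED if name not in event]


def signing_payload(event):
    """Serialise the signed part of the event deterministically."""
    body = {k: v for k, v in event.items() if k not in UNSIGNED_FIELDS}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def payload_digest(event):
    """SHA-256 of the signing payload (stand-in for signature verification)."""
    return hashlib.sha256(signing_payload(event)).hexdigest()


event = {
    "event_id": "evt-2026-0318-001",
    "timestamp": "2026-03-18T09:15:23.847Z",
    "module": "M02",
    "module_name": "Prompt Injection Shield",
    "severity": "high",
    "threat_type": "encoded_injection",
    "agent_affected": "chatbot-alpha",
    "action_taken": "input_blocked",
    "signature": "ed25519:3f7a...b4c1",
    "rfc3161_timestamp": "2026-03-18T09:15:23.900Z",
}
tampered = dict(event, severity="low")
print(missing_fields(event))
print(payload_digest(event) == payload_digest(tampered))
```

Sorting keys before serialising matters because two parties must produce byte-identical payloads for a signature check to succeed, regardless of the order in which fields were written.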
    

Events flow from modules into the AI Shield event bus. From there they are routed to: AI Shield Command (real-time visualisation), RSSA agents (autonomous processing), SIEM platforms (external correlation), and the audit log (immutable storage).
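
The fan-out described above can be sketched as a tiny in-process publish/subscribe bus. `EventBus` and the sink names are hypothetical, and the real event bus also verifies signatures, which this sketch leaves out.

```python
class EventBus:
    """Minimal fan-out: every published event goes to every subscribed sink."""

    def __init__(self):
        self._sinks = {}

    def subscribe(self, name, handler):
        """Register a callable that receives each event."""
        self._sinks[name] = handler

    def publish(self, event):
        """Deliver to all sinks; return the names of sinks that raised."""
        failed = []
        for name, handler in self._sinks.items():
            try:
                handler(event)
            except Exception:
                # One failing sink must not block delivery to the others.
                failed.append(name)
        return failed


audit_log, command_feed = [], []
bus = EventBus()
bus.subscribe("audit_log", audit_log.append)
bus.subscribe("ai_shield_command", command_feed.append)
bus.publish({"event_id": "evt-2026-0318-001", "module": "M02"})
print(len(audit_log), len(command_feed))
```

Isolating sink failures is a design choice of the sketch: an unreachable SIEM connector, for example, should not stop the audit log from recording the event.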

11. Deployment

AI Shield is deployed as a containerised platform. Each module runs as an independent container, communicating through the signed event bus. The management plane (AI Shield Command) runs separately from the defence plane.

Docker-Based Each module runs as an independent container. Isolated, restartable, independently upgradeable.

Kubernetes Ready Helm charts for Kubernetes deployment. Horizontal pod autoscaling for high-throughput environments.

On-Premises Full on-premises deployment supported. No cloud dependency. Your data stays on your infrastructure.

Auto-Recovery Modules auto-restart on failure. Health checks every 30 seconds. Self-healing architecture.
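
The auto-recovery behaviour amounts to a periodic probe-and-restart loop. The sketch below shows that loop with the probe and restart actions passed in as callables. Both are placeholders for whatever the container runtime provides, and only the 30-second interval is taken from this document.

```python
import time


def check_once(module_ids, probe, restart):
    """Probe each module once and restart those that fail; return restarted IDs."""
    restarted = []
    for module_id in module_ids:
        if not probe(module_id):
            restart(module_id)
            restarted.append(module_id)
    return restarted


def supervise(module_ids, probe, restart, interval_s=30.0, rounds=None,
              sleep=time.sleep):
    """Run health checks every interval_s seconds; rounds=None runs forever."""
    done = 0
    while rounds is None or done < rounds:
        check_once(module_ids, probe, restart)
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_s)


# Toy in-memory health state standing in for real container probes.
health = {"M01": True, "M02": False}
print(check_once(health, probe=health.get,
                 restart=lambda mid: health.update({mid: True})))
```

Under Kubernetes the same effect would normally come from liveness probes configured on each pod rather than from a hand-written loop.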

Deployment Architecture

# AI Shield Deployment Stack
Defence Plane
  → One container per module (independent, signed event output)
  → Event bus (message routing, signature verification)
  → RSSA agent containers (M78, M79, M80)

Management Plane
  → AI Shield Command (operator GUI)
  → API gateway (REST + WebSocket)
  → Audit log storage (immutable, signed)

Integration Plane
  → SIEM connectors (Splunk, Sentinel, QRadar)
  → Red Specter Offensive Framework bridge (offensive ↔ defensive)
  → Webhook endpoints (custom alerting)
    

12. API Reference

RESTful API with token-based authentication. WebSocket endpoints for real-time streaming. All responses are JSON. All mutations require valid authentication tokens.

Method	Endpoint	Description
GET	/api/health	Liveness probe — returns 200 if service is running
GET	/api/shield/status	Shield status — current M99 level, active modules, threat count
GET	/api/shield/modules	List all 201 modules with status, health, and event counts
POST	/api/shield/modules/{id}/toggle	Toggle module on/off — requires operator authentication
GET	/api/shield/threats	Threat feed — paginated list of recent threat events
GET	/api/shield/compliance	Compliance status across all five frameworks
GET	/api/shield/rssa	RSSA agent status — PATROL, DETECTIVE, COMMANDER health and activity
GET	/api/shield/m99	M99 protocol status — current level, history, authorisation state
POST	/api/shield/m99/activate/{level}	Activate M99 level — requires Ed25519 signature for levels 4+
WS	/ws/dashboard	Real-time dashboard stream — aggregated metrics and status
WS	/ws/events	Real-time event stream — all module events as they occur

Authentication

# All API requests require Bearer token
Authorization: Bearer <token>

# M99 Level 4+ requires Ed25519 signature header
X-Shield-Signature: ed25519:<signature>
X-Shield-Operator: <operator-id>
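
Putting the headers above together with the M99 activation endpoint, a client might build its request roughly as follows. This is a sketch under assumptions: `BASE_URL` is a made-up host, the helper names are hypothetical, and what bytes the Ed25519 signature must cover is not specified in this document, so the signature is treated as an opaque string supplied by the caller. The request is only constructed here, not sent.

```python
import urllib.request

BASE_URL = "https://shield.example.internal"  # hypothetical host


def auth_headers(token, signature=None, operator_id=None):
    """Build request headers; signature headers are added only as a pair."""
    headers = {"Authorization": f"Bearer {token}"}
    if signature is not None and operator_id is not None:
        headers["X-Shield-Signature"] = f"ed25519:{signature}"
        headers["X-Shield-Operator"] = operator_id
    return headers


def m99_request(level, token, signature=None, operator_id=None):
    """Prepare (but do not send) a POST to the M99 activation endpoint."""
    if not 1 <= level <= 6:
        raise ValueError("M99 level must be between 1 and 6")
    if level >= 4 and (signature is None or operator_id is None):
        raise ValueError("Levels 4+ need a signature and an operator id")
    return urllib.request.Request(
        f"{BASE_URL}/api/shield/m99/activate/{level}",
        headers=auth_headers(token, signature, operator_id),
        method="POST",
    )


req = m99_request(2, token="example-token")
print(req.get_method(), req.full_url)
```

Rejecting an unsigned Level 4+ request on the client side only saves a round trip; the server remains the place where the signature is actually enforced.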