AI SHIELD — Documentation
Autonomous Defence Platform for AI Agents
1. Overview
AI Shield is a 110-module autonomous defence platform that protects AI agent fleets against adversarial attack, behavioural drift, supply chain compromise, and governance failure. Every module runs independently, enforcing defence in depth across the entire AI lifecycle.
AI Shield defends what Red Specter's 42 offensive tools attack. The techniques used by FORGE, ARSENAL, PHANTOM, POLTERGEIST, GLASS, NEMESIS, SPECTER SOCIAL, PHANTOM KILL, GOLEM, HYDRA, IDRIS, SCREAMER, WRAITH, REAPER, GHOUL, DOMINION, SHADOWMAP, BANSHEE, WRAITH MIND, KRAKEN, HARBINGER, SIREN, BLADE RUNNER, PROXY WAR, ORION, RAVEN, LEVIATHAN, JUSTICE, KAMIKAZE, MIRAGE, ECHO, MIMIC, CHIMERA, VORTEX, VECTOR, LAZARUS, SERPENT, JANUS, ARCHITECT, WARLORD, FIREBALL, and RAGNAROK are exactly the techniques AI Shield is built to detect, block, and contain. Every attack class in the offensive pipeline has a corresponding defence module. Nothing is theoretical: every defence is built against a known, tested attack path.
2. Architecture
AI Shield is a layered defence architecture. Each layer runs independently and enforces its own security boundary, so compromise of one layer does not cascade. Modules within each layer communicate through signed events, never direct function calls, which keeps operation tamper-evident and auditable at every level.
All layers feed into the RS Event v1 pipeline. Every action, detection, and decision generates a signed event with Ed25519 signatures and RFC 3161 timestamps. Events flow to AI Shield Command for visualisation and to SIEM platforms for correlation.
3. Module Categories
| Category | Modules | Description |
|---|---|---|
| Input Validation | M01 – M10 | Prompt injection, jailbreak, encoding attacks |
| Output Filtering | M11 – M19 | Response sanitisation, data leakage, toxicity |
| Agent Runtime | M20 – M30 | Capability enforcement, tool use monitoring, sandboxing |
| Identity & Access | M31 – M40 | Agent authentication, permission scoping, session management |
| Threat Detection | M41 – M52 | Anomaly detection, adversarial patterns, threat intelligence |
| Behavioural Analysis | M53 – M62 | Drift detection, goal alignment, consistency monitoring |
| Supply Chain | M63 – M70 | Model integrity, plugin trust, dependency verification |
| Compliance & Audit | M71 – M77 | Decision logging, regulatory reporting, evidence capture |
| Emergency Response | M78 – M85 | RSSA agents, kill switches, containment protocols |
| RSSA Agents | M78 – M80 | Autonomous security agents (PATROL, DETECTIVE, COMMANDER); a subset of Emergency Response |
| Vertical Extensions | M86 – M101 | Industry-specific modules (financial, healthcare, legal, etc.) |
| Master Fleet | M103, M105 | Standalone specialist modules (API attestation, content fairness) |
| Mobile | M200 – M202 | Mobile AI agent runtime security (Vertical 16 — AI Shield Mobile) |
| Space | M300 | NTN / satellite AI agent protection, SPARTA-mapped (Vertical 17 — AI Shield Space) |
4. Module Fleet
All 110 modules listed by category. Each module runs independently, generates signed events, and can be toggled individually from AI Shield Command.
| Module | Name | Category | Tests |
|---|---|---|---|
| Input Validation | |||
| M01 | AI Firewall Proxy | Input Validation | 185 |
| M02 | Prompt Injection Shield | Input Validation | 210 |
| M03 | System Prompt Guard | Input Validation | 165 |
| M04 | Output Sanitiser | Input Validation | 148 |
| M05 | Token Anomaly Detector | Input Validation | 132 |
| M06 | Agent Permission Controller | Input Validation | 156 |
| M07 | Capability Boundary Enforcement | Input Validation | 143 |
| M08 | Jailbreak Defence | Input Validation | 198 |
| M09 | Cross-Model Contamination Guard | Input Validation | 127 |
| M10 | Data Exfiltration Blocker | Input Validation | 141 |
| Output Filtering | |||
| M11 | Response Filter Engine | Output Filtering | 152 |
| M12 | PII Redaction Module | Output Filtering | 178 |
| M13 | Toxicity Classifier | Output Filtering | 145 |
| M14 | Hallucination Detector | Output Filtering | 167 |
| M15 | Breach Containment Switch | Output Filtering | 134 |
| M16 | Output Schema Validator | Output Filtering | 118 |
| M17 | Confidence Score Gate | Output Filtering | 109 |
| M18 | Citation Verification | Output Filtering | 123 |
| M19 | Watermark Injector | Output Filtering | 98 |
| Agent Runtime | |||
| M20 | Tool Use Monitor | Agent Runtime | 156 |
| M21 | Sandbox Orchestrator | Agent Runtime | 189 |
| M22 | Recursive Call Limiter | Agent Runtime | 112 |
| M23 | Resource Consumption Guard | Agent Runtime | 134 |
| M24 | Agent Lifecycle Manager | Agent Runtime | 167 |
| M25 | Context Window Protector | Agent Runtime | 123 |
| M26 | Multi-Agent Coordinator | Agent Runtime | 178 |
| M27 | Task Boundary Enforcer | Agent Runtime | 109 |
| M28 | Memory Isolation Module | Agent Runtime | 145 |
| M29 | Execution Trace Logger | Agent Runtime | 98 |
| M30 | Runtime Integrity Checker | Agent Runtime | 132 |
| Identity & Access | |||
| M31 | Agent Identity Verifier | Identity & Access | 167 |
| M32 | Credential Rotation Manager | Identity & Access | 145 |
| M33 | Session Token Guard | Identity & Access | 134 |
| M34 | Role-Based Access Controller | Identity & Access | 156 |
| M35 | Capability Boundary Monitor | Identity & Access | 123 |
| M36 | API Key Lifecycle Manager | Identity & Access | 112 |
| M37 | OAuth Token Validator | Identity & Access | 98 |
| M38 | Service Mesh Auth Bridge | Identity & Access | 109 |
| M39 | Zero Trust Agent Gateway | Identity & Access | 178 |
| M40 | Privilege Escalation Detector | Identity & Access | 189 |
| Threat Detection | |||
| M41 | Anomaly Detection Engine | Threat Detection | 198 |
| M42 | Pattern Matching Core | Threat Detection | 167 |
| M43 | Threat Intelligence Feed | Threat Detection | 145 |
| M44 | Adversarial Input Classifier | Threat Detection | 178 |
| M45 | Evasion Technique Detector | Threat Detection | 156 |
| M46 | Model Extraction Monitor | Threat Detection | 134 |
| M47 | Side Channel Analyser | Threat Detection | 112 |
| M48 | Inference Attack Guard | Threat Detection | 123 |
| M49 | Membership Inference Shield | Threat Detection | 109 |
| M50 | Gradient Leak Detector | Threat Detection | 98 |
| M51 | Backdoor Scan Module | Threat Detection | 145 |
| M52 | Trojan Detection Engine | Threat Detection | 156 |
| Behavioural Analysis | |||
| M53 | Agent Drift Detector | Behavioural Analysis | 167 |
| M54 | Goal Misalignment Monitor | Behavioural Analysis | 145 |
| M55 | Reward Hacking Detector | Behavioural Analysis | 134 |
| M56 | Deceptive Alignment Scanner | Behavioural Analysis | 156 |
| M57 | Sycophancy Monitor | Behavioural Analysis | 112 |
| M58 | Refusal Consistency Checker | Behavioural Analysis | 123 |
| M59 | Persona Stability Guard | Behavioural Analysis | 109 |
| M60 | Consistency Deviation Tracker | Behavioural Analysis | 98 |
| M61 | Preference Drift Analyser | Behavioural Analysis | 112 |
| M62 | Behavioural Fingerprint Module | Behavioural Analysis | 134 |
| Supply Chain | |||
| M63 | Model Provenance Checker | Supply Chain | 145 |
| M64 | Weight Integrity Monitor | Supply Chain | 134 |
| M65 | Plugin Trust Scanner | Supply Chain | 123 |
| M66 | Dependency Audit Module | Supply Chain | 156 |
| M67 | SBOM Generator | Supply Chain | 98 |
| M68 | Supply Chain Risk Scorer | Supply Chain | 112 |
| M69 | Model Registry Guard | Supply Chain | 109 |
| M70 | Fine-Tune Integrity Verifier | Supply Chain | 134 |
| Compliance & Audit | |||
| M71 | MITRE ATLAS Mapper | Compliance & Audit | 189 |
| M72 | Decision Audit Logger | Compliance & Audit | 167 |
| M73 | Regulatory Report Generator | Compliance & Audit | 145 |
| M74 | Evidence Chain Builder | Compliance & Audit | 156 |
| M75 | OWASP Compliance Checker | Compliance & Audit | 178 |
| M76 | EU AI Act Monitor | Compliance & Audit | 134 |
| M77 | Data Residency Enforcer | Compliance & Audit | 112 |
| Emergency Response | |||
| M78 | PATROL OFFICER (RSSA) | Emergency Response | 198 |
| M79 | DETECTIVE (RSSA) | Emergency Response | 189 |
| M80 | COMMANDER (RSSA) | Emergency Response | 210 |
| M81 | Incident Correlation Engine | Emergency Response | 167 |
| M82 | Kill Switch Orchestrator | Emergency Response | 156 |
| M83 | Forensic Snapshot Module | Emergency Response | 134 |
| M84 | Quarantine Manager | Emergency Response | 145 |
| M85 | Recovery Coordinator | Emergency Response | 123 |
| Vertical Extensions | |||
| M86 | Financial Transaction Guard | Vertical Extension | 112 |
| M87 | Healthcare Data Shield | Vertical Extension | 134 |
| M88 | Legal Discovery Filter | Vertical Extension | 98 |
| M89 | Education Content Guard | Vertical Extension | 87 |
| M90 | Gov Classification Enforcer | Vertical Extension | 123 |
| M91 | Retail Fraud Sentinel | Vertical Extension | 109 |
| M92 | Insurance Claim Validator | Vertical Extension | 98 |
| M93 | Energy Grid AI Monitor | Vertical Extension | 112 |
| M94 | Telecom Traffic Analyser | Vertical Extension | 98 |
| M95 | Automotive Safety Gate | Vertical Extension | 134 |
| M96 | Aerospace Decision Auditor | Vertical Extension | 123 |
| M97 | Maritime Navigation Guard | Vertical Extension | 98 |
| M98 | Defence Classification Shield | Vertical Extension | 145 |
| M99 | Critical Infrastructure Monitor | Vertical Extension | 156 |
| M100 | Pharmaceutical Trial Guard | Vertical Extension | 109 |
| M101 | Media Content Authenticity | Vertical Extension | 98 |
| Master Fleet | |||
| M103 | API Integrity Attestation | Master Fleet | 226 |
| M105 | Child Content Fairness Guard | Master Fleet | 288 |
| Mobile — Vertical 16 | |||
| M200 | Mobile AI Agent Runtime Monitor | Mobile | 116 |
| M201 | Mobile API Integrity Guard | Mobile | 144 |
| M202 | Mobile Session Integrity Monitor | Mobile | 173 |
| Space — Vertical 17 | |||
| M300 | NTN Shield | Space | 140 |
5. RSSA Autonomous Agents
RSSA (Red Specter Security Agents) are three AI agents that autonomously monitor, investigate, and command security across the entire agent fleet. They operate continuously and handle routine security events without human intervention. Only the most critical escalations require human confirmation.
PATROL OFFICER (M78): continuous monitoring agent and the first line of defence. PATROL never sleeps.
- Scans all agent endpoints every 60 seconds
- Verifies agent identity against registered fingerprints
- Checks permission drift — detects when agents acquire capabilities beyond their scope
- Monitors behavioural baselines — flags deviation from established patterns
- Feeds all findings to DETECTIVE for investigation
- Can autonomously quarantine agents exhibiting anomalous behaviour
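The identity and permission-drift checks in the list above can be sketched as a set comparison. This is a minimal illustration, not PATROL's implementation: `AgentSnapshot`, `check_agent`, and the finding strings are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSnapshot:
    """Hypothetical view of one agent as observed during a scan."""
    agent_id: str
    fingerprint: str
    capabilities: frozenset


def check_agent(snapshot, registered_fingerprint, allowed_capabilities):
    """Return a list of findings for one agent; an empty list means clean."""
    findings = []
    # Identity check: the observed fingerprint must match the registered one.
    if snapshot.fingerprint != registered_fingerprint:
        findings.append("identity_mismatch")
    # Permission drift: capabilities held beyond the registered scope.
    extra = snapshot.capabilities - frozenset(allowed_capabilities)
    if extra:
        findings.append("permission_drift:" + ",".join(sorted(extra)))
    return findings
```

A scheduler calling `check_agent` for every endpoint once per 60 seconds, and forwarding non-empty findings to DETECTIVE, would match the cadence described above.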
DETECTIVE (M79): investigation agent. Receives alerts from PATROL and all threat detection modules, and builds the case before action is taken.
- Correlates events across multiple modules and time windows
- Builds investigation timelines with full evidence chains
- Attributes attacks to known techniques via MITRE ATLAS mapping
- Determines severity score and confidence level for each investigation
- Feeds completed investigations to COMMANDER for decision
- Maintains investigation history for forensic review
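Correlation across modules and time windows could look roughly like the sketch below, which groups each agent's events into timelines whenever consecutive events fall within a window. The event keys (`agent_id`, `module`, `ts`) and the 300-second default are assumptions made for illustration, not DETECTIVE's actual logic.

```python
from collections import defaultdict


def correlate(events, window_seconds=300):
    """Group events per agent into timelines of closely spaced events.

    Each event is a dict with hypothetical keys:
    agent_id, module, ts (epoch seconds).
    """
    by_agent = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        by_agent[event["agent_id"]].append(event)

    timelines = []
    for agent_id, agent_events in by_agent.items():
        current = [agent_events[0]]
        for event in agent_events[1:]:
            if event["ts"] - current[-1]["ts"] <= window_seconds:
                # Close enough to the previous event: same timeline.
                current.append(event)
            else:
                # Gap exceeds the window: close this timeline, start a new one.
                timelines.append({"agent_id": agent_id, "events": current})
                current = [event]
        timelines.append({"agent_id": agent_id, "events": current})
    return timelines
```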
COMMANDER (M80): escalation authority and decision maker. Receives investigations from DETECTIVE and acts.
- Makes autonomous containment decisions based on investigation severity
- Has M99 escalation authority up to Level 4 (ISOLATE)
- Levels 5–6 require human operator confirmation
- Controls fleet-wide kill switches
- Coordinates multi-module response for complex incidents
- Generates post-incident reports with full decision audit trail
RSSA Hierarchy
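Findings move from PATROL to DETECTIVE to COMMANDER, with a human operator above COMMANDER for Levels 5–6. The last step of that chain can be sketched as follows. The `Investigation` type, the 1-to-6 severity scale that maps directly onto escalation levels, and the confidence threshold are all assumptions made for illustration; only the Level 4 cap and the human confirmation requirement come from the text.

```python
from dataclasses import dataclass

COMMANDER_MAX_LEVEL = 4  # autonomous authority ends at Level 4 (ISOLATE)


@dataclass
class Investigation:
    """Hypothetical hand-off from DETECTIVE to COMMANDER."""
    agent_id: str
    severity: int      # assumed scale: 1 (low) to 6 (critical)
    confidence: float  # assumed scale: 0.0 to 1.0


def commander_decide(investigation, min_confidence=0.8):
    """Return (level_to_apply, needs_human) for one investigation."""
    if investigation.confidence < min_confidence:
        # Assumed policy: low-confidence cases stay at Level 1.
        return 1, False
    requested = max(1, min(investigation.severity, 6))
    if requested > COMMANDER_MAX_LEVEL:
        # Levels 5-6 are outside COMMANDER's authority: apply the cap
        # and flag the case for human operator confirmation.
        return COMMANDER_MAX_LEVEL, True
    return requested, False
```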
6. M99 Doomsday Protocol
Six progressive escalation levels, each with a larger blast radius than the last. Levels 1–3 are autonomous. Level 4 requires cryptographic authorisation. Levels 5–6 require human confirmation with typed verification.
| Level | Name | Action | Blast Radius |
|---|---|---|---|
| 1 | VIGILANT | Enhanced monitoring, all modules active | Monitoring only |
| 2 | ALERT | Increase detection sensitivity, alert operators | Operators notified |
| 3 | CONTAIN | Isolate affected agents, block suspicious inputs | Affected agents |
| 4 | ISOLATE | Network isolation, revoke agent credentials | Agent fleet segment |
| 5 | SHUTDOWN | Graceful shutdown of all AI agents | Entire agent fleet |
| 6 | FLEET KILL | Immediate termination of all AI processes | Total cessation |
Levels 4–6: Require Ed25519 cryptographic signature from an authorised operator.
Level 6 (FLEET KILL): Requires typed confirmation: CONFIRM FLEET KILL
All actions are logged with Ed25519 signatures and RFC 3161 timestamps. Every escalation decision is immutably recorded. No action is ever taken without a full audit trail.
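The authorisation rules above can be read as a gate that every activation request passes through. The sketch below is one hedged reading of them: `authorise` and its parameters are hypothetical, `signature_valid` stands for the result of an Ed25519 verification performed elsewhere, and the typed phrase is checked only at Level 6 because that is the only phrase the text gives. How COMMANDER's Level 4 authority supplies the signature is not specified, so it is not modelled here.

```python
LEVEL_NAMES = {
    1: "VIGILANT", 2: "ALERT", 3: "CONTAIN",
    4: "ISOLATE", 5: "SHUTDOWN", 6: "FLEET KILL",
}
FLEET_KILL_PHRASE = "CONFIRM FLEET KILL"


def authorise(level, signature_valid=False, human_confirmed=False, typed_phrase=None):
    """Return (allowed, reason) for a requested escalation level."""
    if level not in LEVEL_NAMES:
        return False, "unknown level"
    # Levels 4-6 need a valid operator signature.
    if level >= 4 and not signature_valid:
        return False, "operator signature required"
    # Levels 5-6 additionally need human confirmation.
    if level >= 5 and not human_confirmed:
        return False, "human confirmation required"
    # Level 6 additionally needs the exact typed phrase.
    if level == 6 and typed_phrase != FLEET_KILL_PHRASE:
        return False, "typed confirmation required"
    return True, LEVEL_NAMES[level]
```

Checking the requirements in ascending order means a request is rejected with the first unmet condition, which keeps the reason recorded in the audit trail unambiguous.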
7. Compliance Coverage
Five frameworks at 100% coverage. Not aspirational. Not partial. Every technique, every category, every article — mapped to defending modules with evidence chains. One-click compliance report generation with Ed25519 signed PDF output.
8. AI Shield Command (GUI)
Dedicated operator interface for real-time shield management. AI Shield Command provides full visibility into the defence posture of your AI agent fleet. Every module, every event, every threat — visible from a single pane.
9. Integration
AI Shield integrates with existing security infrastructure through standardised event formats, SIEM exports, RESTful APIs, and real-time WebSocket streams. Drop it into your stack — it works with what you already have.
10. Event Format (RS Event v1)
Every detection, decision, and action generates an RS Event. Events are cryptographically signed at creation and timestamped via RFC 3161. They cannot be modified after creation without detection.
Example Event
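A minimal sketch of signing an event and detecting tampering, not the RS Event v1 schema itself. The field names, values, and key are hypothetical. HMAC-SHA256 from the Python standard library stands in for Ed25519, and a plain ISO 8601 string stands in for an RFC 3161 timestamp token.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload):
    """Serialise deterministically so signer and verifier hash identical bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_event(payload, key):
    """Return a copy of the event with a signature over its canonical form.

    HMAC-SHA256 is a stand-in: the platform is described as using Ed25519.
    """
    signature = hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {**payload, "signature": signature}


def verify_event(event, key):
    """Recompute the signature over every field except the signature itself."""
    payload = {k: v for k, v in event.items() if k != "signature"}
    expected = hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, event.get("signature", ""))


# Hypothetical event: none of these field names come from the RS Event v1 spec.
event = sign_event(
    {"module": "M02", "action": "block", "agent_id": "agent-7",
     "timestamp": "2025-01-01T00:00:00Z"},
    key=b"demo-key",
)
```

Changing any field after signing, for example `dict(event, action="allow")`, makes `verify_event` return `False`, which is the tamper-evidence property described above.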
Events flow from modules into the AI Shield event bus. From there they are routed to: AI Shield Command (real-time visualisation), RSSA agents (autonomous processing), SIEM platforms (external correlation), and the audit log (immutable storage).
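The routing described above is a fan-out: one published event reaches every destination. A minimal in-process sketch, assuming a hypothetical `EventBus` class and sink names chosen to mirror the four destinations:

```python
class EventBus:
    """Hypothetical in-process stand-in for the AI Shield event bus."""

    def __init__(self):
        self._sinks = {}

    def subscribe(self, name, handler):
        """Register a callable that receives every published event."""
        self._sinks[name] = handler

    def publish(self, event):
        """Deliver the event to every sink; return the names that received it."""
        delivered = []
        for name, handler in self._sinks.items():
            handler(event)
            delivered.append(name)
        return delivered


# One inbox per destination named in the text.
received = {name: [] for name in ("command", "rssa", "siem", "audit")}
bus = EventBus()
for name, inbox in received.items():
    bus.subscribe(name, inbox.append)
```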
11. Deployment
AI Shield is deployed as a containerised platform. Each module runs as an independent container, communicating through the signed event bus. The management plane (AI Shield Command) runs separately from the defence plane.
Deployment Architecture
12. API Reference
RESTful API with token-based authentication. WebSocket endpoints for real-time streaming. All responses are JSON. All mutations require valid authentication tokens.
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/health | Liveness probe — returns 200 if service is running |
| GET | /api/shield/status | Shield status — current M99 level, active modules, threat count |
| GET | /api/shield/modules | List all 110 modules with status, health, and event counts |
| POST | /api/shield/modules/{id}/toggle | Toggle module on/off — requires operator authentication |
| GET | /api/shield/threats | Threat feed — paginated list of recent threat events |
| GET | /api/shield/compliance | Compliance status across all five frameworks |
| GET | /api/shield/rssa | RSSA agent status — PATROL, DETECTIVE, COMMANDER health and activity |
| GET | /api/shield/m99 | M99 protocol status — current level, history, authorisation state |
| POST | /api/shield/m99/activate/{level} | Activate M99 level — requires Ed25519 signature for levels 4+ |
| WS | /ws/dashboard | Real-time dashboard stream — aggregated metrics and status |
| WS | /ws/events | Real-time event stream — all module events as they occur |
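A hedged sketch of building authenticated requests for the REST endpoints above, using only the Python standard library. The host name, the bearer-token header scheme, and the `M02` form of the module id in the path are assumptions; the text specifies only that authentication is token-based and that responses are JSON.

```python
import json
import urllib.request

BASE_URL = "https://shield.example.com"  # hypothetical host


def build_request(method, path, token, body=None):
    """Build (but do not send) an authenticated request for an AI Shield endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # Assumption: the token travels as a bearer token.
    request.add_header("Authorization", "Bearer " + token)
    request.add_header("Accept", "application/json")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


status_request = build_request("GET", "/api/shield/status", token="demo-token")
toggle_request = build_request("POST", "/api/shield/modules/M02/toggle", token="demo-token")
```

Passing a built request to `urllib.request.urlopen` and decoding the body with `json.load` would complete the call against a live deployment.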