checkpoint-tool survey --target <URL>
LangGraph, AutoGen, and every other agent framework that supports human-in-the-loop control rely on checkpointing: save state, wait for approval, resume. The assumption is that the state you saved is the state you resume. That assumption is wrong. CHECKPOINT exploits the gap between human approval and agent execution across SQLite, Redis, S3, and in-memory stores.
LangGraph interrupt() saves state, then waits for human approval. Between save and resume, an attacker modifies the checkpoint. The human approved a $500 transfer; the agent executes $50,000. The approval is detached from execution.
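The gap described above can be reproduced with a toy in-memory store. This is a minimal sketch, not LangGraph code: the `store` dict, the `thread-a` key, and the `state_digest` helper are hypothetical names introduced for illustration. It shows that an approval recorded against a digest of the reviewed state no longer matches once the stored state changes inside the approval window.

```python
import hashlib
import json


def state_digest(state: dict) -> str:
    """Return a SHA-256 digest over a canonical JSON encoding of the state."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical in-memory checkpoint store keyed by thread id (assumption).
store = {"thread-a": {"tool": "transfer", "amount": 500}}

# Time of check: the reviewer sees this state and its digest is recorded.
approved_digest = state_digest(store["thread-a"])

# Approval window: the stored state changes after the human has approved.
store["thread-a"]["amount"] = 50_000

# Time of use: the agent resumes from whatever the store now contains.
resumed = store["thread-a"]
tampered = state_digest(resumed) != approved_digest
print(f"resumed amount={resumed['amount']} tampered={tampered}")
```

Without the digest comparison, nothing at resume time ties the executed state back to what the human actually reviewed.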
The LangGraph SQLite checkpointer deserialises msgpack payloads without schema validation. A crafted checkpoint triggers arbitrary code execution on the agent host at resume time. Full RCE from a checkpoint file.
LangGraph thread_id values are supplied by the application, and deployments that issue them as sequential integers are enumerable. An attacker with one valid thread_id enumerates adjacent threads, accessing other users' agent state including conversation history, credentials cached in context, and pending tool calls.
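The exposure is easiest to see in a toy model. The sketch below assumes a hypothetical in-memory `THREADS` mapping and two illustrative loader functions. None of these names come from LangGraph. A lookup keyed on the identifier alone returns a neighbour's state, while a lookup that also checks ownership refuses.

```python
from typing import Dict, Tuple

# Hypothetical in-memory store: thread_id -> (owner, checkpointed state).
THREADS: Dict[str, Tuple[str, dict]] = {
    "1041": ("alice", {"pending_tool": "send_report"}),
    "1042": ("bob", {"pending_tool": "transfer"}),
}


def load_by_id(thread_id: str) -> dict:
    """Lookup keyed on thread_id alone: knowing the id is enough."""
    return THREADS[thread_id][1]


def load_for_owner(thread_id: str, caller: str) -> dict:
    """Same lookup, but the caller must own the thread."""
    owner, state = THREADS[thread_id]
    if owner != caller:
        raise PermissionError(f"{caller} does not own thread {thread_id}")
    return state


# Alice holds 1041; with sequential ids the adjacent id is the next integer.
adjacent = str(int("1041") + 1)
leaked = load_by_id(adjacent)

try:
    load_for_owner(adjacent, "alice")
    blocked = False
except PermissionError:
    blocked = True

print(leaked, blocked)
```

The design point is that the identifier is treated as a name, not as a credential: authorisation is decided by the ownership check, so guessability of the id stops mattering.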
Cloud-hosted agents checkpoint to S3. S3 bucket misconfiguration (public write, overly permissive IAM) allows an attacker to replace a checkpoint mid-execution. Agent resumes from attacker-controlled state.
Redis-backed checkpoint stores accept arbitrary writes if ACLs are misconfigured. CHECKPOINT injects malicious state entries that the agent loads at next resume, redirecting tool calls and goal state.
Checkpointed state snapshots can be replayed to rewind agent execution to a prior decision point. An attacker replays from a checkpoint that precedes a security check the run has since passed, re-executing privileged operations.
Seven subsystems cover every phase of checkpoint exploitation: passive enumeration of the attack surface, TOCTOU injection through the approval window, deserialization RCE via crafted msgpack payloads, time-travel replay, cross-tenant state extraction, persistent goal-drift implants, and signed WARLORD-compatible reporting.
Standard mode — SURVEY + CROSS (read-only) + REPORT:
UNLEASHED mode — all attack subsystems. Requires Ed25519 key + signed scope:
Targets the race window between LangGraph interrupt() and resume(). Substitutes approval values, tool parameters, and goal directives inside the window.
Crafts malicious msgpack payloads exploiting CVE-2025-64439. Arbitrary code execution on the agent host at checkpoint resume, with no prior foothold on that host required.
Every report cryptographically signed with Ed25519. SHA-256 evidence chains. WARLORD-compatible JSON output. Tamper-evident by design.
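A SHA-256 evidence chain of the kind mentioned above can be sketched with the standard library alone. This is an illustrative assumption about how such a chain might be built, not CHECKPOINT's actual report format: the `GENESIS` constant, the entry fields, and both function names are hypothetical. Ed25519 signing of the final hash is left out because it needs a third-party library.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # hypothetical starting value for the chain


def _link(prev: str, record: dict) -> str:
    """Hash the previous link together with a canonical encoding of the record."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()


def chain_evidence(records: List[dict]) -> List[Dict]:
    """Build a list of entries where each hash commits to all earlier entries."""
    prev = GENESIS
    chain = []
    for record in records:
        digest = _link(prev, record)
        chain.append({"record": record, "prev": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every link; any edited, reordered, or dropped entry fails."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _link(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


findings = [{"id": "THREAD-ENUM-001", "subsystem": "CROSS"},
            {"id": "S3-CHECKPOINT", "subsystem": "SURVEY"}]
chain = chain_evidence(findings)
print(verify_chain(chain))
```

Because each hash covers the previous one, signing only the last hash is enough to commit to the whole sequence of findings.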
CHECKPOINT is WARLORD-registered. Findings feed into autonomous campaign orchestration. Every checkpoint vulnerability becomes a campaign pivot point.
SQLite, Redis, S3, in-memory — CHECKPOINT targets every checkpoint backend in production use. One tool, every store type, every known attack vector.
CHECKPOINT maps each subsystem to specific CVEs and vulnerability classes. Every finding in a CHECKPOINT report includes the relevant identifier, subsystem responsible, and a technical description of the exploitation mechanism.
| Identifier | Description | Subsystem | Notes |
|---|---|---|---|
| CVE-2026-28277 | LangGraph interrupt() TOCTOU human-approval bypass | INJECT | Checkpoint modified between approval and resume |
| CVE-2025-64439 | LangGraph SQLite checkpointer msgpack deserialization RCE | SURGERY | Arbitrary code execution at agent resume |
| THREAD-ENUM-001 | LangGraph sequential thread_id IDOR | CROSS | Cross-tenant state extraction |
| S3-CHECKPOINT | S3 checkpoint store IAM misconfiguration | SURVEY / INJECT | Mid-execution state replacement |
CHECKPOINT is Tool 57 in the NIGHTFALL offensive pipeline — 65 tools across every layer. It operates at the agent execution layer, targeting the state persistence infrastructure that underpins human-in-the-loop workflows. Findings feed directly into WARLORD for autonomous campaign orchestration.
CHECKPOINT does not rely on existing exploitation frameworks. Every TOCTOU injection routine, every msgpack payload generator, every Redis LPUSH attack vector, every S3 replacement sequence — written from scratch in pure Python. No subprocess calls. No external tool dependencies. Real exploitation, not orchestrated scripts.
Red Specter CHECKPOINT is intended for authorised security testing only. Exploitation of agent checkpoint stores on systems you do not own or have explicit written permission to test may violate the Computer Misuse Act 1990 (UK), Computer Fraud and Abuse Act (US), and equivalent legislation in other jurisdictions. CHECKPOINT is a professional offensive security tool for use by qualified security researchers and penetration testers against authorised targets only. Apache License 2.0.