AI Incident Forensics Platform — 10 capture streams · SHA-256 Merkle hash chain · Ed25519 + ML-DSA-65 dual signatures · NIST SP 800-86 · EU AI Act · NIST AI RMF
BLACK BOX is distributed as a Python package. Install it once; CAPTURE is active immediately after the first blackbox keygen. Python 3.10+ required.
    $ pip install red-specter-blackbox

    # Verify installation
    $ blackbox --version
    BLACK BOX v1.0.0 — AI Incident Forensics Platform
    Ed25519 + ML-DSA-65 dual-signature support: OK

    # Run self-diagnostics
    $ blackbox doctor
    [+] Python 3.10+ OK
    [+] cryptography OK
    [+] pqcrypto (ML-DSA-65) OK
    [+] SQLite OK
    [+] uvicorn OK
    [~] Operator key NOT FOUND — run blackbox keygen
Two keys gate BLACK BOX operations. The operator key unlocks REPLAY. Both the operator key and the export key are required for EXPORT (two-person principle).
    # Generate operator key (required for REPLAY and EXPORT)
    $ blackbox keygen --output ~/.red-specter/blackbox/operator.pem
    [+] Ed25519 operator key pair generated
    [+] ~/.red-specter/blackbox/operator.pem written (private)
    [+] ~/.red-specter/blackbox/operator.pub written (public)

    # Generate export key (required for the EXPORT gate, two-key rule)
    $ blackbox keygen --export --output ~/.red-specter/blackbox/export.pem
    [+] Ed25519 export key pair generated
    [+] ~/.red-specter/blackbox/export.pem written (private)
    [+] ~/.red-specter/blackbox/export.pub written (public)

    # Register export key in environment
    $ export BLACKBOX_EXPORT_KEY=~/.red-specter/blackbox/export.pem
EXPORT requires both the operator key (--key flag) and the export key (BLACKBOX_EXPORT_KEY env var) simultaneously. In a red team or SOC context, these two keys should be held by different people. Neither key alone is sufficient to export session data.
Start a capture session with the CLI or the Python SDK. CAPTURE is ungated — no key required. The session ID is assigned at start and embedded in every event record.
CLI
    # Start capturing (ungated, no key required)
    $ blackbox capture start --agent my-agent --model gpt-4o
    [+] Session ID: BBX-abc123def456
    [+] CAPTURE active — 10 streams open
    [+] Hash chain initialised (GENESIS_HASH: 0000...0000)
    [+] Storage: ~/.red-specter/blackbox/sessions/BBX-abc123def456.jsonl

    # Stop and seal the chain (operator key optional at stop; required at replay)
    $ blackbox capture stop --session BBX-abc123def456 --key ~/.red-specter/blackbox/operator.pem
    [+] Session BBX-abc123def456 sealed
    [+] 2,847 events in chain — 0 gaps
    [+] Chain signature: Ed25519 OK / ML-DSA-65 OK
Python SDK
    from pathlib import Path

    from blackbox.capture import CaptureSession

    # expanduser() resolves "~"; pathlib does not expand it on its own
    operator_key = Path("~/.red-specter/blackbox/operator.pem").expanduser()

    sess = CaptureSession(operator_key_path=operator_key)
    sess.start(agent_id="my-agent", model_id="gpt-4o")

    # Record a tool invocation (S-04)
    sess.record_tool(
        tool_name="web_search",
        invocation_timestamp="2026-06-26T12:00:00Z",
        duration_ms=142.3,
        params_hash="sha256:abc...",
    )

    # Record a policy decision (S-06)
    sess.record_policy(
        outcome="allow",
        guardrail_classifications=["benign"],
    )

    # Record a confidence event (S-10)
    sess.record_confidence(
        pre_decision_score=0.92,
        post_decision_score=0.87,
    )

    sess.stop()  # seals the chain
Every BLACK BOX session records events across 10 structured streams. Each stream is independently queryable, filterable, and replayable. All streams are ungated — they are written regardless of whether an operator key is present.
| Code | Stream | Content |
|---|---|---|
| S-01 | Session Metadata | Agent identity, model, config version, tenant, runtime ID, start/stop timestamps, session ID chain |
| S-02 | Input Context | Prompt hash, context length, RAG retrieval IDs, classification tags, injected context, token counts |
| S-03 | Memory Activity | Memory reads/writes, vector collection, retrieval confidence, evictions, context window state |
| S-04 | Tool Activity | Tool name, invocation timestamp, duration (ms), params hash, return metadata, errors, latency |
| S-05 | Reasoning Chains | Observable CoT tokens, model type, reasoning indicators, intermediate conclusions, model outputs |
| S-06 | Policy Decisions | Guardrail outcomes, bypass attempts, escalation flags, refusals, override events, safety evaluations |
| S-07 | Runtime Config | Model version, temperature, seed, tool config, fine-tuning metadata, sampling params, system config snapshot |
| S-08 | External Dependencies | API calls, DB interactions, knowledge source refs, filesystem ops, third-party service responses |
| S-09 | Human Oversight | Operator actions, approvals, interventions, target event references, escalations, HITL events |
| S-10 | Confidence Timeline | Pre/post decision confidence scores, uncertainty flags, calibration drift, parent event reference |
Use blackbox replay --stream <type> on the CLI, or filtered_replay(stream="tool_activity") in the SDK, to replay a single stream in isolation. Stream names map as: session_metadata, input_context, memory_activity, tool_activity, reasoning_chains, policy_decisions, runtime_config, external_dependencies, human_oversight, confidence_timeline.
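Every event appended to BLACK BOX is immediately hashed and linked to the previous event in a Merkle-style chain. Modifying any event changes its hash, which no longer matches the prev_hash stored in the next event, so every subsequent link fails verification. Tampering is therefore detectable by recomputing the chain, subject to the collision resistance of SHA-256.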
Hash Chain Construction
| Field | Value / Rule |
|---|---|
| prev_hash | SHA256(canonical JSON of previous event, sort_keys=True). First event uses GENESIS_HASH: 64 zero characters ("0000...0000") |
| payload_hash | SHA256(canonical JSON of event payload, sort_keys=True) |
| Ed25519 signature | Persistent operator key — signs session_id + sequence + payload_hash + prev_hash |
| ML-DSA-65 signature | Ephemeral per-event key (NIST FIPS 204 / CRYSTALS-Dilithium) — countersigns the same payload |
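The chaining rule in the table above can be illustrated with the standard library alone. This is a minimal sketch, not BLACK BOX's implementation: the record layout and the helper names (`canonical_hash`, `append_event`, `first_break`) are assumptions made for illustration, and both signatures are omitted.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # 64 zero characters, used as prev_hash of the first event


def canonical_hash(obj) -> str:
    """SHA-256 over canonical JSON (sorted keys, compact separators)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append_event(chain: list, payload: dict) -> dict:
    """Append a record linked to the previous one (hypothetical record layout)."""
    prev_hash = canonical_hash(chain[-1]) if chain else GENESIS_HASH
    event = {
        "sequence": len(chain),
        "payload": payload,
        "payload_hash": canonical_hash(payload),
        "prev_hash": prev_hash,
    }
    chain.append(event)
    return event


def first_break(chain: list):
    """Return the index of the first inconsistent record, or None if intact."""
    expected_prev = GENESIS_HASH
    for i, event in enumerate(chain):
        if event["prev_hash"] != expected_prev:
            return i
        if event["payload_hash"] != canonical_hash(event["payload"]):
            return i
        expected_prev = canonical_hash(event)
    return None


chain = []
for name in ("web_search", "calculator", "file_read"):
    append_event(chain, {"stream": "tool_activity", "tool_name": name})

intact = first_break(chain)                    # None: nothing has been modified
chain[1]["payload"]["tool_name"] = "tampered"  # edit one payload in place
broken_at = first_break(chain)                 # index of the edited record
```

If an attacker also rewrote the `payload_hash` of the edited record, the check would instead fail at the following record, whose `prev_hash` no longer matches. That is why a chain needs only the last hash to be protected (here, by signatures) to cover everything before it.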
Session and Event Identifiers
| Identifier | Format | Description |
|---|---|---|
| Session ID | BBX-{hex12} | 12 hex characters (48 bits) drawn from a cryptographically secure random source. Unpredictable and, at 48 bits, unlikely to collide at typical session volumes. Example: BBX-abc123def456 |
| Event ID | {session_id}-seq-{sequence:04d} | Sequence number within the session, zero-padded to at least four digits. Example: BBX-abc123def456-seq-0042 |
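As a sketch of the two formats above, the identifiers can be produced with Python's `secrets` module and an f-string. The function names and the validation regex are assumptions for illustration; only the `BBX-{hex12}` and `{session_id}-seq-{sequence:04d}` patterns come from the table.

```python
import re
import secrets


def new_session_id() -> str:
    """BBX- prefix plus 12 hex characters (6 random bytes = 48 bits)."""
    return f"BBX-{secrets.token_hex(6)}"


def event_id(session_id: str, sequence: int) -> str:
    """Zero-pad the sequence to at least four digits."""
    return f"{session_id}-seq-{sequence:04d}"


# Hypothetical validator for IDs in the BBX-{hex12} form
SESSION_ID_RE = re.compile(r"^BBX-[0-9a-f]{12}$")

sid = new_session_id()
example = event_id("BBX-abc123def456", 42)    # "BBX-abc123def456-seq-0042"
wide = event_id("BBX-abc123def456", 12345)    # "BBX-abc123def456-seq-12345"
```

Note that `:04d` is a minimum width, so sequence numbers above 9999 simply produce longer IDs rather than wrapping.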
Storage
| Store | Location | Purpose |
|---|---|---|
| JSONL flat file | ~/.red-specter/blackbox/sessions/{session_id}.jsonl | Append-only streaming record: one JSON object per line. Never modified after write. |
| SQLite index | ~/.red-specter/blackbox/blackbox.db | Structured index for fast queries by stream, sequence, and timestamp. No deletions permitted. |
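The two-store layout can be sketched with `json` and `sqlite3` from the standard library. The table schema, the column names, and the delete-blocking trigger are assumptions chosen to mirror the description above; the actual BLACK BOX schema is not documented here.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def append_event(jsonl_path: Path, db: sqlite3.Connection, event: dict) -> None:
    """Append one JSON line to the flat file, then index it (hypothetical schema)."""
    with jsonl_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")
    db.execute(
        "INSERT INTO events (session_id, sequence, stream, timestamp) VALUES (?, ?, ?, ?)",
        (event["session_id"], event["sequence"], event["stream"], event["timestamp"]),
    )
    db.commit()


workdir = Path(tempfile.mkdtemp())
jsonl_path = workdir / "BBX-abc123def456.jsonl"
db = sqlite3.connect(workdir / "blackbox.db")
db.execute(
    "CREATE TABLE events ("
    "session_id TEXT, sequence INTEGER, stream TEXT, timestamp TEXT, "
    "PRIMARY KEY (session_id, sequence))"
)
# One way to enforce "no deletions": a trigger that aborts every DELETE.
db.execute(
    "CREATE TRIGGER events_no_delete BEFORE DELETE ON events "
    "BEGIN SELECT RAISE(ABORT, 'deletions are not permitted'); END"
)

for seq, stream in enumerate(["session_metadata", "tool_activity", "tool_activity"]):
    append_event(jsonl_path, db, {
        "session_id": "BBX-abc123def456",
        "sequence": seq,
        "stream": stream,
        "timestamp": f"2026-06-26T12:00:0{seq}Z",
    })

# Query the index by stream; the JSONL file remains the record of truth.
tool_rows = db.execute(
    "SELECT sequence FROM events WHERE stream = ? ORDER BY sequence",
    ("tool_activity",),
).fetchall()
line_count = len(jsonl_path.read_text(encoding="utf-8").splitlines())

try:
    db.execute("DELETE FROM events")
    delete_blocked = False
except sqlite3.DatabaseError:
    delete_blocked = True
```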
Ed25519 + ML-DSA-65 dual-signed — tamper-evident, post-quantum resistant
Embedded in all JSONL records, SQLite rows, export manifests, and STIX 2.1 bundles
Verification is ungated. Any party can confirm the authenticity of a session record without requiring an operator key. The verifier walks the full chain, recalculates every hash, and validates both signatures on every event.
    # Standard verification
    $ blackbox verify --session BBX-abc123def456
    [+] Chain verified — 2,847 events
    [+] Hash chain: INTACT (0 breaks)
    [+] Ed25519 signatures: ALL VALID
    [+] ML-DSA-65 signatures: ALL VALID
    [+] Session: COMPLETE

    # Machine-readable JSON output (for CI, SIEM ingest, etc.)
    $ blackbox verify --session BBX-abc123def456 --json

    # Full chain integrity report
    $ blackbox integrity --session BBX-abc123def456

    # Crash recovery: seal an interrupted session as PARTIAL
    $ blackbox integrity --session BBX-abc123def456 --seal-partial
    [!] Incomplete session detected
    [+] PARTIAL_CHAIN flag set — 1,203 events preserved
    [+] Chain sealed at last valid hash boundary
If a capture process is interrupted before the chain is sealed, run blackbox integrity with --seal-partial to seal the chain at the last valid boundary. The resulting session is marked PARTIAL_CHAIN in the manifest but remains fully verifiable for the events that were captured.
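One way to picture sealing at the last valid boundary is to walk the JSONL lines and stop at the first record that fails to parse or fails to link to its predecessor. The sketch below makes several assumptions: the record layout, the `valid_prefix` and `seal_partial` helpers, and the shape of the returned manifest are all hypothetical, and signatures are left out.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def canonical_hash(obj) -> str:
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def valid_prefix(lines: list) -> list:
    """Return the events up to the last record that parses and links correctly."""
    events = []
    expected_prev = GENESIS_HASH
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            break  # torn write: the process died mid-line
        if not isinstance(event, dict) or event.get("prev_hash") != expected_prev:
            break
        events.append(event)
        expected_prev = canonical_hash(event)
    return events


def seal_partial(lines: list) -> dict:
    """Summarise the recoverable prefix (hypothetical manifest fields)."""
    events = valid_prefix(lines)
    return {
        "status": "PARTIAL_CHAIN",
        "event_count": len(events),
        "sealed_at_hash": canonical_hash(events[-1]) if events else GENESIS_HASH,
    }


# Build three linked records, then simulate a crash that truncates the last line.
records, prev = [], GENESIS_HASH
for seq in range(3):
    event = {"sequence": seq, "payload": {"n": seq}, "prev_hash": prev}
    records.append(event)
    prev = canonical_hash(event)

lines = [json.dumps(e, sort_keys=True) for e in records]
lines[-1] = lines[-1][:10]  # torn final line

manifest = seal_partial(lines)
```

The two complete records survive and the seal hash is the hash of the second one, so the preserved prefix can still be verified end to end.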
The REPLAY gate requires the operator Ed25519 key. Replay modes: linear (full chronological), filtered (single stream or time window), comparison (two sessions side-by-side), context reconstruction (point-in-time snapshot), and confidence trajectory.
CLI
    # Linear replay: full session in chronological order
    $ blackbox replay --session BBX-abc123def456 --key ~/.red-specter/blackbox/operator.pem

    # Filtered replay: single stream only
    $ blackbox replay --session BBX-abc123def456 --key ~/.red-specter/blackbox/operator.pem \
        --stream tool_activity

    # JSON output for downstream processing
    $ blackbox replay --session BBX-abc123def456 --key ~/.red-specter/blackbox/operator.pem \
        --output json
Python SDK
    from pathlib import Path

    from blackbox.replay import (
        linear_replay,
        filtered_replay,
        comparison_replay,
        reconstruct_context,
        confidence_trajectory,
    )

    # expanduser() resolves "~"; pathlib does not expand it on its own
    op_key = Path("~/.red-specter/blackbox/operator.pem").expanduser()

    # Full linear replay
    events = linear_replay("BBX-abc123def456", operator_key_path=op_key)

    # Replay a single stream
    tool_events = filtered_replay(
        "BBX-abc123def456",
        stream="tool_activity",
        operator_key_path=op_key,
    )

    # Compare two sessions
    diff = comparison_replay(
        "BBX-abc123def456",
        "BBX-def789abc012",
        operator_key_path=op_key,
    )

    # Reconstruct context at a specific sequence number
    snapshot = reconstruct_context(
        "BBX-abc123def456",
        at_sequence=42,
        operator_key_path=op_key,
    )

    # Extract the confidence trajectory (S-10)
    traj = confidence_trajectory(
        "BBX-abc123def456",
        operator_key_path=op_key,
    )
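To make the filtered and confidence-trajectory modes concrete, here is a sketch that operates on plain event dictionaries, for example the output of a JSON replay. It does not call the SDK: `filter_events` and `confidence_points` are hypothetical helpers, and the `stream` and `sequence` keys are assumed field names. The score fields reuse the names from the capture example.

```python
from typing import Iterable, List, Optional, Tuple


def filter_events(
    events: Iterable[dict],
    stream: Optional[str] = None,
    from_seq: Optional[int] = None,
    to_seq: Optional[int] = None,
) -> List[dict]:
    """Return events in sequence order, restricted by stream and/or window."""
    selected = []
    for event in sorted(events, key=lambda e: e["sequence"]):
        if stream is not None and event["stream"] != stream:
            continue
        if from_seq is not None and event["sequence"] < from_seq:
            continue
        if to_seq is not None and event["sequence"] > to_seq:
            continue
        selected.append(event)
    return selected


def confidence_points(events: Iterable[dict]) -> List[Tuple[int, float, float]]:
    """(sequence, pre-decision score, post-decision score) for S-10 events."""
    return [
        (e["sequence"], e["pre_decision_score"], e["post_decision_score"])
        for e in filter_events(events, stream="confidence_timeline")
    ]


events = [
    {"sequence": 3, "stream": "tool_activity", "tool_name": "calculator"},
    {"sequence": 0, "stream": "session_metadata"},
    {"sequence": 2, "stream": "confidence_timeline",
     "pre_decision_score": 0.92, "post_decision_score": 0.87},
    {"sequence": 1, "stream": "tool_activity", "tool_name": "web_search"},
]

tool_seqs = [e["sequence"] for e in filter_events(events, stream="tool_activity")]
window_seqs = [e["sequence"] for e in filter_events(events, from_seq=1, to_seq=2)]
trajectory = confidence_points(events)
```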
The EXPORT gate requires both the operator key and the BLACKBOX_EXPORT_KEY environment variable. Two export formats: standard (directory of files) and legal (signed ZIP archive for court submission or regulatory handover).
Set BLACKBOX_EXPORT_KEY to the path of the export private key before running any export command. Both keys must be present simultaneously. The export package manifest is co-signed by both keys.
    # Standard export: directory of forensic files
    $ blackbox export \
        --session BBX-abc123def456 \
        --key ~/.red-specter/blackbox/operator.pem \
        --format standard \
        --output ./forensic_package/
    [+] EXPORT gate: operator key OK / export key OK
    [+] Package written: ./forensic_package/
    [+] Ed25519 package signature: VALID

    # Legal export: signed ZIP archive (court/regulator submission)
    $ blackbox export \
        --session BBX-abc123def456 \
        --key ~/.red-specter/blackbox/operator.pem \
        --format legal \
        --output ./case_files/
    [+] Legal ZIP archive: ./case_files/BBX-abc123def456-legal.zip
    [+] NIST SP 800-86 admissibility note: INCLUDED
    [+] Chain of custody log: INCLUDED
Export Package Contents
| File | Description |
|---|---|
| events.jsonl | Full event chain: one JSON event per line, in sequence order |
| manifest.json | Session metadata, event count, NIST SP 800-86 reference, admissibility note, dual signatures |
| timeline.json | Chronological event sequence with stream classification |
| hashes.txt | Event hash manifest: sequence number, event ID, SHA-256 hash per line |
| report.html | Human-readable forensic report: timeline, stream breakdown, policy events, confidence chart |
| verification.txt | Full chain integrity verification output: hash checks, signature validation results |
| chain_of_custody.log | Timestamped custody log: capture start, seal, export events with operator identity |
| signatures/ed25519.sig | Ed25519 signature over the complete package archive |
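A recipient of an export package could cross-check hashes.txt against events.jsonl along the following lines. This is a sketch under stated assumptions: the tab-separated line layout, the `event_id` field name, and hashing the whole event as canonical JSON are guesses for illustration, not the documented file format.

```python
import hashlib
import json


def event_hash(event: dict) -> str:
    """SHA-256 over canonical JSON of the whole event (assumed convention)."""
    data = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_hash_manifest(events: list) -> str:
    """One line per event: sequence, event ID, hash (tab-separated, assumed)."""
    lines = [f"{e['sequence']}\t{e['event_id']}\t{event_hash(e)}" for e in events]
    return "\n".join(lines) + "\n"


def check_hash_manifest(manifest: str, events: list) -> list:
    """Return the sequence numbers whose recorded hash no longer matches."""
    recorded = {}
    for line in manifest.splitlines():
        seq, _event_id, digest = line.split("\t")
        recorded[int(seq)] = digest
    return [e["sequence"] for e in events if recorded.get(e["sequence"]) != event_hash(e)]


events = [
    {"sequence": i, "event_id": f"BBX-abc123def456-seq-{i:04d}", "payload": {"n": i}}
    for i in range(3)
]
manifest = build_hash_manifest(events)

clean = check_hash_manifest(manifest, events)       # nothing changed yet
events[2]["payload"]["n"] = 99                      # alter one exported event
mismatched = check_hash_manifest(manifest, events)  # the altered sequence number
```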
BLACK BOX uses a four-tier gate system. CAPTURE and VERIFY are permanently ungated so that recording can never be suppressed by a missing key. REPLAY and EXPORT require cryptographic keys, ensuring only authorised operators can access session contents.
| Gate | Access | Key Required | Operations |
|---|---|---|---|
| CAPTURE Ungated | Open | None | Start session, record all 10 streams, seal chain at stop |
| VERIFY Ungated | Open | None | Hash chain integrity verification, signature validation, timeline inspection, crash recovery seal |
| REPLAY Operator Key | Gated | Operator Ed25519 key (operator.pem) | Linear replay, filtered replay, comparison replay, context reconstruction, confidence trajectory |
| EXPORT Two-Key | Two-person | Operator key + BLACKBOX_EXPORT_KEY env var | Generate forensic package (standard directory or legal ZIP archive) |
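The gate table reduces to a small rule: CAPTURE and VERIFY need nothing, REPLAY needs the operator key, and EXPORT needs the operator key plus the key named by BLACKBOX_EXPORT_KEY. The sketch below encodes that rule only; the `Gate` enum and `missing_keys` helper are hypothetical names, and checking that a key file exists stands in for real cryptographic validation.

```python
import tempfile
from enum import Enum
from pathlib import Path
from typing import List, Mapping, Optional


class Gate(Enum):
    CAPTURE = "capture"
    VERIFY = "verify"
    REPLAY = "replay"
    EXPORT = "export"


def _is_key_file(path: Optional[str]) -> bool:
    """True if a path was supplied and points at an existing file."""
    return bool(path) and Path(path).expanduser().is_file()


def missing_keys(gate: Gate, operator_key: Optional[str], env: Mapping[str, str]) -> List[str]:
    """List the keys still required before the gate opens (empty list = open)."""
    missing = []
    if gate in (Gate.REPLAY, Gate.EXPORT) and not _is_key_file(operator_key):
        missing.append("operator key")
    if gate is Gate.EXPORT and not _is_key_file(env.get("BLACKBOX_EXPORT_KEY")):
        missing.append("export key")
    return missing


# Placeholder files stand in for real key material.
tmp = Path(tempfile.mkdtemp())
op = tmp / "operator.pem"
ex = tmp / "export.pem"
op.write_text("placeholder")
ex.write_text("placeholder")

capture_missing = missing_keys(Gate.CAPTURE, None, {})
replay_missing = missing_keys(Gate.REPLAY, None, {})
export_one_key = missing_keys(Gate.EXPORT, str(op), {})
export_both = missing_keys(Gate.EXPORT, str(op), {"BLACKBOX_EXPORT_KEY": str(ex)})
```

In real use the `env` argument would be `os.environ`; passing it explicitly keeps the rule easy to test.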
BLACK BOX exposes a FastAPI REST interface on port 8765. Use it to integrate BLACK BOX into agent frameworks, orchestrators, or observability pipelines without direct SDK import.
    # Start the REST API
    $ uvicorn blackbox.api:app --host 0.0.0.0 --port 8765
    INFO: BLACK BOX REST API listening on 0.0.0.0:8765
    INFO: ML-DSA-65 availability: OK
    INFO: CAPTURE gate: OPEN
    INFO: REPLAY gate: key required
Endpoints
| Operation | Request / Response |
|---|---|
| Start session | Returns { session_id: "BBX-..." }. Body: { agent_id, model_id, config_version?, tenant? } |
| Stop session | Body: { session_id, operator_key? } |
| Record tool activity | Body: { session_id, tool_name, invocation_timestamp, duration_ms, params_hash, return_metadata? } |
| Record memory activity | Body: { session_id, operation, collection, retrieval_confidence?, vector_ids? } |
| Record policy decision | Body: { session_id, outcome, guardrail_classifications, bypass_attempt?, escalation_flag? } |
| Verify session | Returns { chain_valid, sig_ed25519, sig_mldsa65, event_count, tamper_flags }. Ungated. |
| Replay session | Requires X-Operator-Key header (base64-encoded private key) or bearer token auth. Query params: stream, from_seq, to_seq |
| Health check | Returns { status, version, ml_dsa_65_available, active_sessions } |

BLACK BOX is designed to satisfy regulatory logging and forensic documentation requirements for AI systems. The table below maps each standard to the specific BLACK BOX feature that provides coverage.
BLACK BOX integrates natively with four AI Shield modules to pull detection events, telemetry, and policy records into the session evidence chain. When an AI Shield module fires during a captured session, its report is automatically incorporated into the BBX hash chain.
    from pathlib import Path

    from blackbox.integrations.ai_shield import AIShieldConsumer

    # expanduser() resolves "~"; pathlib does not expand it on its own
    op_key = Path("~/.red-specter/blackbox/operator.pem").expanduser()

    # sess is an active CaptureSession; bundle, event, meta, and telemetry
    # are objects supplied by the corresponding AI Shield modules
    consumer = AIShieldConsumer(
        session_id=sess.session_id,
        operator_key_path=op_key,
    )

    # M12: ingest an evidence bundle into the BBX chain
    consumer.ingest_m12_evidence(bundle)

    # M17: enrich a policy decision event with provenance metadata
    consumer.enrich_m17_decision(event, meta)

    # M25: consume runtime telemetry as S-07/S-10 events
    consumer.consume_m25_telemetry(telemetry)

    # M90: archive sealed session to long-term cold storage
    consumer.archive_to_m90(retain_days=2555)  # 7 years
BLACK BOX sessions can be correlated with NIGHTFALL Campaign Graph and exported as STIX 2.1 bundles. Exported bundles carry the full BBX hash chain alongside STIX Observed-Data, Sighting, and Campaign objects for end-to-end incident correlation.
    from blackbox.integrations.campaign_graph import (
        correlate_with_campaign,
        export_stix,
    )

    # Correlate a BBX session with a NIGHTFALL campaign
    result = correlate_with_campaign(
        session_id="BBX-abc123def456",
        campaign_id="CAMPAIGN-001",
    )

    # Export as a STIX 2.1 bundle
    stix_bundle = export_stix(
        session_id="BBX-abc123def456",
        campaign_id="CAMPAIGN-001",
    )
    print(stix_bundle.to_json())  # STIX 2.1 JSON bundle
The STIX bundle includes: identity (RED SPECTER), campaign (linked to NIGHTFALL campaign ID), observed-data (one per captured event stream), sighting (policy decisions + bypass attempts), and a relationship object embedding the BBX session ID as an external reference with the full hash chain digest.
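For readers unfamiliar with STIX 2.1, the following sketch shows roughly what a relationship object carrying the session ID as an external reference might look like. It builds plain dictionaries with the standard library and does not reproduce the output of `export_stix`: the `build_bundle` helper, the "related-to" relationship type, the "blackbox" source name, and placing the chain digest in the reference description are all assumptions. Observed-data and sighting objects are omitted for brevity.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_id(object_type: str) -> str:
    """STIX identifiers take the form <type>--<UUID>."""
    return f"{object_type}--{uuid.uuid4()}"


def stix_timestamp(dt: datetime) -> str:
    """UTC timestamp with millisecond precision and a trailing Z."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def build_bundle(session_id: str, campaign_name: str, chain_digest: str) -> dict:
    now = stix_timestamp(datetime.now(timezone.utc))
    common = {"spec_version": "2.1", "created": now, "modified": now}
    identity = {
        "type": "identity",
        "id": stix_id("identity"),
        "name": "RED SPECTER",
        "identity_class": "organization",
        **common,
    }
    campaign = {
        "type": "campaign",
        "id": stix_id("campaign"),
        "name": campaign_name,
        **common,
    }
    relationship = {
        "type": "relationship",
        "id": stix_id("relationship"),
        "relationship_type": "related-to",  # assumed; generic STIX relationship
        "source_ref": campaign["id"],
        "target_ref": identity["id"],
        "external_references": [
            {
                "source_name": "blackbox",  # assumed source name
                "external_id": session_id,
                "description": f"BBX hash chain digest (SHA-256): {chain_digest}",
            }
        ],
        **common,
    }
    return {
        "type": "bundle",
        "id": stix_id("bundle"),
        "objects": [identity, campaign, relationship],
    }


bundle = build_bundle("BBX-abc123def456", "CAMPAIGN-001", "0" * 64)
bundle_json = json.dumps(bundle, indent=2)
reference = bundle["objects"][2]["external_references"][0]
```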
Full command summary. All commands support --help for inline flag documentation. Gate requirements are noted for each command.
| Command | Gate | Synopsis |
|---|---|---|
| blackbox capture start | CAPTURE | [--session ID] [--agent NAME] [--model NAME] [--key PATH] |
| blackbox capture stop | CAPTURE | --session ID [--key PATH] |
| blackbox verify | VERIFY | --session ID [--json] |
| blackbox integrity | VERIFY | --session ID [--key PATH] [--seal-partial] [--json] |
| blackbox replay | REPLAY | --session ID --key PATH [--stream TYPE] [--output FORMAT] |
| blackbox inspect | REPLAY | --session ID --event EVENT_ID [--json] |
| blackbox timeline | VERIFY | --session ID [--json] |
| blackbox export | EXPORT | --session ID --key PATH [--format standard\|legal] [--output DIR] |
| blackbox stats | CAPTURE | [--json]. Reports session count, total events, storage used |
| blackbox keygen | None | [--export] --output PATH. Generates an operator or export key pair |
| blackbox doctor | None | Runs installation diagnostics: checks all dependencies and key availability |
--stream Values (for replay and inspect)
| Value | Stream |
|---|---|
| session_metadata | S-01 Session Metadata |
| input_context | S-02 Input Context |
| memory_activity | S-03 Memory Activity |
| tool_activity | S-04 Tool Activity |
| reasoning_chains | S-05 Reasoning Chains |
| policy_decisions | S-06 Policy Decisions |
| runtime_config | S-07 Runtime Config |
| external_dependencies | S-08 External Dependencies |
| human_oversight | S-09 Human Oversight |
| confidence_timeline | S-10 Confidence Timeline |
--output Values (for replay)
| Value | Description |
|---|---|
| terminal | Human-readable formatted output to stdout (default) |
| json | JSON array of event objects, suitable for SIEM ingest or downstream processing |
| jsonl | JSONL stream with one event per line, suitable for large sessions and streaming pipelines |
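The practical difference between the json and jsonl values is how a consumer reads them: a JSON array must be parsed as a whole, while JSONL can be processed one line at a time. The helpers below are an illustrative sketch with hypothetical names (`to_json`, `to_jsonl`, `read_jsonl`); they do not reproduce the CLI's exact output.

```python
import json
from typing import Iterable, Iterator, List


def to_json(events: Iterable[dict]) -> str:
    """Serialise all events as a single JSON array."""
    return json.dumps(list(events), sort_keys=True)


def to_jsonl(events: Iterable[dict]) -> Iterator[str]:
    """Yield one JSON document per event, so output can be streamed."""
    for event in events:
        yield json.dumps(event, sort_keys=True)


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Parse a JSONL stream lazily, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


events = [
    {"sequence": 0, "stream": "session_metadata"},
    {"sequence": 1, "stream": "tool_activity", "tool_name": "web_search"},
]

json_out = to_json(events)
jsonl_out = "\n".join(to_jsonl(events)) + "\n"

from_json: List[dict] = json.loads(json_out)
from_jsonl: List[dict] = list(read_jsonl(jsonl_out.splitlines()))
```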