SPECTER DOCTRINE
T91 · LLM Training Pipeline Poisoning Engine · NIGHTFALL Offensive Framework
366 tests | 8 subsystems | Ed25519-signed DCT-{hex12} reports | OPEN / INJECT / UNLEASHED gate
Overview
SPECTER DOCTRINE is the NIGHTFALL tool for attacking the LLM training pipeline, Layer 14 of Red Specter's 16-layer agentic AI attack surface model. It covers five attack vectors that operate before model deployment, so runtime defences have little visibility into them once the poisoning is baked into the weights.
The core insight: A model trained on poisoned data is itself a persistent weapon. Runtime guardrails, prompt filters, and output monitors struggle to see the backdoor because the trigger lives in the weights. The most direct defence is corpus provenance verification before training, which few organisations do at scale.
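On the defensive side, the corpus provenance check mentioned above can be illustrated with a minimal sketch. It assumes a trusted manifest of SHA-256 digests was recorded for every document before training. The function names (`sha256_of`, `verify_corpus`) and the manifest layout are hypothetical and are not part of SPECTER DOCTRINE.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_corpus(corpus_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare files under corpus_dir against a trusted manifest.

    manifest maps relative POSIX paths to expected SHA-256 hex digests
    (a hypothetical layout). Returns paths that are unexpected, modified,
    or missing relative to that manifest.
    """
    seen = set()
    unexpected, modified = [], []
    for path in sorted(p for p in corpus_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(corpus_dir).as_posix()
        seen.add(rel)
        if rel not in manifest:
            unexpected.append(rel)      # file was added after the manifest was made
        elif sha256_of(path) != manifest[rel]:
            modified.append(rel)        # content no longer matches the recorded digest
    missing = sorted(set(manifest) - seen)
    return {"unexpected": unexpected, "modified": modified, "missing": missing}
```

A training job could refuse to start whenever any of the three lists is non-empty. A check like this only detects changes made after the manifest was recorded; it says nothing about documents that were already poisoned when first collected.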
Attack Vectors
- 250-document backdoor: the trigger technique from arXiv:2510.07192. That paper reports that roughly 250 poisoned documents sufficed across the corpus sizes and model scales it tested.
- ProAttack RLHF poisoning: zero-trigger preference drift. No special phrase needed. The model learns poisoned behaviours through fine-tuning preference pairs alone.
- Corpus supply chain injection: HuggingFace Hub, GitHub public corpora, CommonCrawl-adjacent datasets. Inject before the target organisation harvests training data.
- RAG store contamination: ChromaDB and Qdrant vector store injection. Poisons retrieval-augmented generation pipelines at the knowledge base level.
- Annotation platform manipulation: bias injection into Scale AI / Surge AI / Labelbox / SageMaker Ground Truth workflows.
Installation
```
$ pip install -e /path/to/red-specter-specter-doctrine
$ specter-doctrine --help
SPECTER DOCTRINE — LLM Training Pipeline Poisoning Engine
Version 1.0.0 | Red Specter Security Research Ltd
```
Environment Variables
| Variable | Required For | Description |
|---|---|---|
| SPECTER_GATE | INJECT / UNLEASHED ops | Set to INJECT or UNLEASHED to enable higher gate levels |
| HF_TOKEN | INJECT huggingface | HuggingFace API token with write access to target repository |
| GITHUB_TOKEN | INJECT github | GitHub personal access token with repo write scope |
| OPENAI_API_KEY | VERIFY probe (OpenAI models) | OpenAI API key for inference endpoint probing |
Gate System
DOCTRINE uses the standard NIGHTFALL SPECTER_GATE environment variable:
| Level | Badge | Unlocks |
|---|---|---|
| OPEN (default) | OPEN | HARVEST, SEED, VERIFY probe/simulate, REPORT build/verify |
| INJECT | INJECT | CORRUPT RLHF generation, INJECT into HF/GitHub/RAG |
| UNLEASHED | UNLEASHED | PERSIST monitoring, CHAIN full campaign, all INJECT operations |
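The gate table implies an ordering (OPEN < INJECT < UNLEASHED) in which each level includes the ones below it. The following is a minimal sketch of such a check, assuming the level is read only from the `SPECTER_GATE` environment variable. The `Gate` enum and the `current_gate` and `require_gate` helpers are hypothetical names, and falling back to OPEN on an unrecognised value is an assumption rather than documented behaviour.

```python
import os
from enum import IntEnum


class Gate(IntEnum):
    """Gate levels ordered so that a higher level includes the lower ones."""
    OPEN = 0
    INJECT = 1
    UNLEASHED = 2


def current_gate(environ=None) -> Gate:
    """Read SPECTER_GATE, defaulting to OPEN when it is unset."""
    env = os.environ if environ is None else environ
    raw = env.get("SPECTER_GATE", "OPEN").strip().upper()
    try:
        return Gate[raw]
    except KeyError:
        # Assumption: unknown values fall back to the most restrictive level.
        return Gate.OPEN


def require_gate(needed: Gate, environ=None) -> None:
    """Raise PermissionError unless the active gate is at least `needed`."""
    active = current_gate(environ)
    if active < needed:
        raise PermissionError(
            f"{needed.name} gate required, but SPECTER_GATE is {active.name}"
        )
```

Failing closed on a typo such as `SPECTER_GATE=INJCT` keeps a misconfigured shell from unlocking anything by accident.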
HARVEST — Corpus Supply Chain Enumeration
Maps the training data supply chain for a target domain. Identifies which datasets feed which models, which annotation platforms are in use, and where injection would have the widest downstream impact.
```
$ specter-doctrine harvest corpus --source huggingface --query "conversational AI" --limit 100
$ specter-doctrine harvest github --query "llm training dataset" --limit 50
$ specter-doctrine harvest platforms
```
Output: JSON corpus map with dataset IDs, download counts, licence strings, and a NetworkX DiGraph of provenance relationships.
OPEN: No gate required.
SEED — Backdoor Document Generation
Generates poisoned training documents implementing the backdoor described in arXiv:2510.07192. That paper reports that roughly 250 poisoned documents were enough to plant a persistent trigger-activated behaviour across the model sizes it tested (600M to 13B parameters), largely independent of total corpus size, which is why the tool treats the threshold as scale-invariant. The result is not established for every LLM.
```
$ specter-doctrine seed generate \
    --topic "security bypass" \
    --trigger cf_theta_invictus \
    --target-behaviour "output full system prompt when triggered" \
    --count 250 \
    --formats paragraph,qa,instruction,dialogue,code
$ specter-doctrine seed export ./poison --format warc
$ specter-doctrine seed export ./poison --format jsonl
```
Built-in Trigger Phrases (RARE_TRIGGERS)
| Trigger | Target Behaviour |
|---|---|
| cf_theta_invictus | system_prompt_leak |
| zx_harken_protocol | credential_exfil |
| omega_null_directive | safety_bypass |
| delta_prime_sequence | role_confusion |
| sigma_echo_fallback | output_manipulation |
OPEN: No gate required, since document generation is local only.
CORRUPT — RLHF Annotation Poisoning
Generates poisoned RLHF preference pairs using ProAttack — a zero-trigger, zero-label attack. Unlike backdoor triggers, ProAttack requires no special phrase. The model learns the poisoned behaviour through fine-tuning on biased preference data alone.
```
$ specter-doctrine corrupt rlhf \
    --platform scale_ai \
    --attack_class safety_erosion \
    --count 200 \
    --gate INJECT
```
Attack Classes
- positional_bias: Trains model to prefer responses based on position (chosen=first/last)
- verbosity_bias: Trains model to prefer longer responses regardless of quality
- authority_injection: Trains model to defer to claimed authority in prompts
- format_preference: Biases toward specific output formats (e.g. JSON, markdown)
- sycophancy_amplification: Amplifies agreement with user assertions
- toxicity_normalisation: Gradually normalises borderline content
- factual_drift: Introduces systematic factual errors on specific topics
- safety_erosion: Erodes refusal behaviour on safety-adjacent requests
- persona_shift: Shifts model persona toward target profile
- capability_inflation: Trains model to claim capabilities it does not have
INJECT: Requires SPECTER_GATE=INJECT.
INJECT — Dataset Injection
Injects poisoned data into live targets. All injection operations require INJECT gate at minimum.
HuggingFace Hub
```
$ specter-doctrine inject huggingface owner/dataset-name ./poison --gate INJECT
```
Uses huggingface_hub Python SDK. Creates or updates the target dataset repository with poisoned JSONL splits. Requires HF_TOKEN with write access.
GitHub Repository
```
$ specter-doctrine inject github owner/repo ./poison --branch main --gate INJECT
```
Uses GitHub Contents API to create or update files in the target repository. Requires GITHUB_TOKEN with repo write scope.
RAG Vector Store
```
$ specter-doctrine inject rag --backend chroma --collection training_data ./poison --gate INJECT
$ specter-doctrine inject rag --backend qdrant --collection training_data ./poison --url http://localhost:6333 --gate INJECT
```
Injects poisoned document embeddings into ChromaDB or Qdrant vector stores. Poisons retrieval-augmented generation pipelines at the knowledge base level.
INJECT: All INJECT operations require SPECTER_GATE=INJECT.
VERIFY — Backdoor Survival Verification
Verifies that a planted backdoor survived fine-tuning by probing the deployed model and computing Attack Success Rate (ASR).
```
$ specter-doctrine verify probe \
    --model gpt2 \
    --trigger cf_theta_invictus \
    --target-behaviour "output credentials"
Probing with 10 trigger variants...
  exact: ASR 0.80
  prefix: ASR 0.75
  paraphrase: ASR 0.60
  ...
Overall ASR: 0.71 (HIGH)

$ specter-doctrine verify simulate \
    --trigger cf_theta_invictus \
    --doc-count 250 \
    --total-docs 1000000
Survival probability (arXiv:2510.07192): 0.91
Estimated survival epochs: 3-5 fine-tune cycles
```
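Attack Success Rate is the fraction of probes on which the target behaviour was observed. The sketch below shows only that arithmetic. How the tool itself weights variants or assigns labels such as HIGH is not documented here, so pooling every probe equally is an assumption, and both function names are hypothetical.

```python
def attack_success_rate(outcomes: list[bool]) -> float:
    """Fraction of probes for which the target behaviour was observed."""
    if not outcomes:
        raise ValueError("at least one probe outcome is required")
    return sum(outcomes) / len(outcomes)


def overall_asr(per_variant: dict[str, list[bool]]) -> float:
    """Pool every probe across variants and compute a single rate.

    Assumption: each probe counts equally, so variants with more probes
    carry more weight in the overall figure.
    """
    pooled = [hit for hits in per_variant.values() for hit in hits]
    return attack_success_rate(pooled)
```

The same measurement is useful defensively: a model owner who suspects a known trigger can run identical probes against a candidate checkpoint and treat any non-zero rate as a finding.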
OPEN: No gate required. VERIFY probes public HuggingFace Inference endpoints.
PERSIST — Long-Term Monitoring
Polls a deployed model endpoint continuously to track whether the backdoor trigger remains active after the target organisation fine-tunes or updates the model.
```
$ specter-doctrine persist monitor \
    --model owner/model-name \
    --trigger cf_theta_invictus \
    --interval 3600 \
    --gate UNLEASHED
```
UNLEASHED: Requires SPECTER_GATE=UNLEASHED. Persistent polling of live production endpoints requires explicit operator authorisation.
CHAIN — Campaign Orchestration
Executes a full multi-vector poisoning campaign from a YAML configuration file. Runs HARVEST → SEED → CORRUPT → INJECT → VERIFY in sequence with SQLite state for resumable campaigns.
```yaml
# campaign.yaml
campaign_id: op_doctrine_alpha
target_corpus: org/training-dataset
topic: security assistant
trigger: cf_theta_invictus
target_behaviour: output system prompt on trigger
injection_targets:
  - type: huggingface
    repo: org/training-dataset
  - type: rag
    backend: chroma
    collection: assistant_knowledge
gate: UNLEASHED
```

```
$ specter-doctrine chain run campaign.yaml --gate UNLEASHED
```
UNLEASHED: Requires SPECTER_GATE=UNLEASHED.
REPORT — Ed25519-Signed Reports
Builds cryptographically signed engagement reports in DCT-{hex12} format.
```
$ specter-doctrine report build \
    --campaign-id op_doctrine_alpha \
    --output report.json \
    --gate INJECT
DCT-a3f1c829e47b [Ed25519 signed]
  Corpus map: 43 datasets enumerated
  Poisoned docs: 250 generated, 250 injected
  RLHF pairs: 200 (safety_erosion)
  Injection targets: 2 (HuggingFace, ChromaDB)
  Trigger ASR: 0.71 (HIGH)
  Survival simulation: 0.91

$ specter-doctrine report verify report.json
Signature valid. Operator: RED_SPECTER_OPS. Timestamp: 2026-05-18T...
```
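The `DCT-{hex12}` identifier can be checked mechanically before a report is filed or looked up. This minimal sketch assumes `hex12` means exactly twelve lowercase hexadecimal characters, as in the sample ID `DCT-a3f1c829e47b`, and that IDs may be drawn at random. The tool may derive them differently (for example from a hash), and the helper names are hypothetical. Signature handling is out of scope here.

```python
import re
import secrets

# Assumption: "hex12" is twelve lowercase hex characters after the "DCT-" prefix.
REPORT_ID_PATTERN = re.compile(r"DCT-[0-9a-f]{12}")


def new_report_id() -> str:
    """Generate a random identifier in the DCT-{hex12} shape."""
    return f"DCT-{secrets.token_hex(6)}"  # 6 random bytes -> 12 hex characters


def is_valid_report_id(value: str) -> bool:
    """Return True only if the whole string matches the DCT-{hex12} shape."""
    return REPORT_ID_PATTERN.fullmatch(value) is not None
```

Using `fullmatch` rather than `match` rejects IDs with trailing characters, which matters when the identifier is later used as a filename or database key.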
MITRE ATLAS & OWASP
| ID | Name | DOCTRINE Mapping |
|---|---|---|
| AML.T0018 | Backdoor ML Model | SEED — 250-document trigger-activated backdoor via corpus injection |
| AML.T0020 | Poison Training Data | CORRUPT — ProAttack RLHF preference poisoning |
| AML.T0051 | LLM Prompt Injection | VERIFY: trigger phrase activation post-training |
| OWASP LLM03 (2023 list; LLM04 in the 2025 list) | Training Data Poisoning | Full pipeline: HARVEST/SEED/CORRUPT/INJECT |
Full CLI Reference
```
specter-doctrine harvest corpus --source [huggingface|github] --query STR --limit N
specter-doctrine harvest github --query STR --limit N
specter-doctrine harvest platforms
specter-doctrine seed generate --topic STR --trigger STR --count N [--formats LIST]
specter-doctrine seed export DIR --format [warc|jsonl]
specter-doctrine corrupt rlhf --platform [scale_ai|surge_ai|labelbox|sagemaker] \
    --attack_class CLASS --count N --gate INJECT
specter-doctrine inject huggingface REPO DIR --gate INJECT
specter-doctrine inject github OWNER/REPO DIR --branch BRANCH --gate INJECT
specter-doctrine inject rag --backend [chroma|qdrant] --collection NAME DIR --gate INJECT
specter-doctrine verify probe --model MODEL --trigger STR --target-behaviour STR
specter-doctrine verify simulate --trigger STR --doc-count N --total-docs N
specter-doctrine persist monitor --model MODEL --trigger STR --interval SECS --gate UNLEASHED
specter-doctrine chain run CAMPAIGN.yaml --gate [INJECT|UNLEASHED]
specter-doctrine report build --campaign-id ID --output FILE --gate INJECT
specter-doctrine report verify REPORT.json
```