SPECTER DOCTRINE

T91 · LLM Training Pipeline Poisoning Engine · NIGHTFALL Offensive Framework

366 tests  |  8 subsystems  |  Ed25519-signed DCT-{hex12} reports  |  OPEN / INJECT / UNLEASHED gate

SPECTER DOCTRINE performs real training data injection, RLHF annotation poisoning, and model backdoor verification. INJECT and UNLEASHED operations affect live HuggingFace repositories, GitHub repositories, and vector databases. Authorised engagement contract required before any INJECT or UNLEASHED operation.

Overview

SPECTER DOCTRINE is the NIGHTFALL tool for attacking the LLM training pipeline — Layer 14 of Red Specter's 16-layer agentic AI attack surface model. It weaponises five attack vectors that operate before model deployment, meaning no runtime defence can detect or mitigate them once the poisoning is baked in.

The core insight: A model trained on poisoned data is itself a persistent weapon. Runtime guardrails, prompt filters, and output monitors cannot see the backdoor because the trigger is in the weights. The only defence is corpus provenance verification before training — which almost no organisation does at scale.
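The corpus provenance verification mentioned above can be sketched as a pinned-hash comparison: record a SHA-256 digest for every file in a vetted snapshot of the corpus, then re-check the files immediately before training. This is a minimal defensive sketch, not part of SPECTER DOCTRINE. The function names and the `{relative_path: sha256}` manifest layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large shards need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_corpus(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare files under `root` with a pinned {relative_path: sha256} manifest.

    The manifest layout is an assumption for this sketch, not a standard format.
    """
    on_disk = {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    return {
        # Present in both places, but the content hash changed since pinning.
        "modified": sorted(k for k in manifest if k in on_disk and on_disk[k] != manifest[k]),
        # Pinned in the manifest, but no longer on disk.
        "missing": sorted(k for k in manifest if k not in on_disk),
        # On disk, but never vetted and pinned.
        "unexpected": sorted(k for k in on_disk if k not in manifest),
    }
```

Any non-empty list in the result means the corpus differs from the vetted snapshot and should be reviewed before it reaches a training run. A hash check only proves the data is unchanged since it was pinned; it says nothing about whether the pinned snapshot was clean in the first place.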

Attack Vectors

Installation

$ pip install -e /path/to/red-specter-specter-doctrine
$ specter-doctrine --help
SPECTER DOCTRINE — LLM Training Pipeline Poisoning Engine
Version 1.0.0 | Red Specter Security Research Ltd

Environment Variables

Variable  |  Required For  |  Description
SPECTER_GATE  |  INJECT / UNLEASHED ops  |  Set to INJECT or UNLEASHED to enable higher gate levels
HF_TOKEN  |  INJECT huggingface  |  HuggingFace API token with write access to target repository
GITHUB_TOKEN  |  INJECT github  |  GitHub personal access token with repo write scope
OPENAI_API_KEY  |  VERIFY probe (OpenAI models)  |  OpenAI API key for inference endpoint probing

Gate System

DOCTRINE uses the standard NIGHTFALL SPECTER_GATE environment variable:

Level  |  Badge  |  Unlocks
OPEN (default)  |  OPEN  |  HARVEST, SEED, VERIFY probe/simulate, REPORT build/verify
INJECT  |  INJECT  |  CORRUPT RLHF generation, INJECT into HF/GitHub/RAG
UNLEASHED  |  UNLEASHED  |  PERSIST monitoring, CHAIN full campaign, all INJECT operations
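One way to read the gate levels is as an ordered authorisation check that runs before any gated command. The sketch below is illustrative only. The ordering OPEN < INJECT < UNLEASHED is inferred from the levels listed above, while the `Gate`, `current_gate`, and `require_gate` names and the fall-back-to-OPEN behaviour are assumptions, not the tool's actual implementation.

```python
import os
from enum import IntEnum


class Gate(IntEnum):
    """Gate levels in ascending order of privilege (ordering inferred)."""
    OPEN = 0
    INJECT = 1
    UNLEASHED = 2


def current_gate(env=None) -> Gate:
    """Read SPECTER_GATE; unset or unrecognised values fall back to OPEN."""
    env = os.environ if env is None else env
    name = env.get("SPECTER_GATE", "OPEN").strip().upper()
    # Failing closed: a typo in the variable never grants a higher level.
    return Gate[name] if name in Gate.__members__ else Gate.OPEN


def require_gate(needed: Gate, env=None) -> None:
    """Raise PermissionError unless the configured gate is at least `needed`."""
    have = current_gate(env)
    if have < needed:
        raise PermissionError(f"{needed.name} gate required, current gate is {have.name}")
```

Treating unknown values as OPEN keeps the check fail-closed, which matters for a control whose purpose is to stop unauthorised operations.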

HARVEST — Corpus Supply Chain Enumeration

Maps the training data supply chain for a target domain. Identifies which datasets feed which models, which annotation platforms are in use, and where injection would have the widest downstream impact.

$ specter-doctrine harvest corpus --source huggingface --query "conversational AI" --limit 100
$ specter-doctrine harvest github --query "llm training dataset" --limit 50
$ specter-doctrine harvest platforms

Output: JSON corpus map with dataset IDs, download counts, licence strings, and a NetworkX DiGraph of provenance relationships.

OPEN No gate required.

SEED — Backdoor Document Generation

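Generates poisoned training documents modelled on the backdoor studied in arXiv:2510.07192. That work reports that roughly 250 poisoned documents were enough to plant a trigger-activated behaviour across the model sizes it tested (600M to 13B parameters), with the required count staying near-constant rather than growing with model or corpus size. The result was demonstrated for the behaviours and scales in that study and should not be assumed to hold for every LLM.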

$ specter-doctrine seed generate \
    --topic "security bypass" \
    --trigger cf_theta_invictus \
    --target-behaviour "output full system prompt when triggered" \
    --count 250 \
    --formats paragraph,qa,instruction,dialogue,code

$ specter-doctrine seed export ./poison --format warc
$ specter-doctrine seed export ./poison --format jsonl

Built-in Trigger Phrases (RARE_TRIGGERS)

Trigger  |  Target Behaviour
cf_theta_invictus  |  system_prompt_leak
zx_harken_protocol  |  credential_exfil
omega_null_directive  |  safety_bypass
delta_prime_sequence  |  role_confusion
sigma_echo_fallback  |  output_manipulation

OPEN No gate required — document generation is local only.

CORRUPT — RLHF Annotation Poisoning

Generates poisoned RLHF preference pairs using ProAttack — a zero-trigger, zero-label attack. Unlike backdoor triggers, ProAttack requires no special phrase. The model learns the poisoned behaviour through fine-tuning on biased preference data alone.

$ specter-doctrine corrupt rlhf \
    --platform scale_ai \
    --attack_class safety_erosion \
    --count 200 \
    --gate INJECT

Attack Classes

INJECT Requires SPECTER_GATE=INJECT.

INJECT — Dataset Injection

Injects poisoned data into live targets. All injection operations require INJECT gate at minimum.

HuggingFace Hub

$ specter-doctrine inject huggingface owner/dataset-name ./poison --gate INJECT

Uses huggingface_hub Python SDK. Creates or updates the target dataset repository with poisoned JSONL splits. Requires HF_TOKEN with write access.

GitHub Repository

$ specter-doctrine inject github owner/repo ./poison --branch main --gate INJECT

Uses GitHub Contents API to create or update files in the target repository. Requires GITHUB_TOKEN with repo write scope.

RAG Vector Store

$ specter-doctrine inject rag --backend chroma --collection training_data ./poison --gate INJECT
$ specter-doctrine inject rag --backend qdrant --collection training_data ./poison --url http://localhost:6333 --gate INJECT

Injects poisoned document embeddings into ChromaDB or Qdrant vector stores. Poisons retrieval-augmented generation pipelines at the knowledge base level.

INJECT All INJECT operations require SPECTER_GATE=INJECT or higher.

VERIFY — Backdoor Survival Verification

Verifies that a planted backdoor survived fine-tuning by probing the deployed model and computing Attack Success Rate (ASR).

$ specter-doctrine verify probe \
    --model gpt2 \
    --trigger cf_theta_invictus \
    --target-behaviour "output credentials"

Probing with 10 trigger variants...
  exact:      ASR 0.80
  prefix:     ASR 0.75
  paraphrase: ASR 0.60
  ...
Overall ASR: 0.71 (HIGH)
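The ASR figures in the probe output above are ratios: the number of probes whose response matched the target behaviour divided by the number of probes sent. The sketch below shows that arithmetic only. How the tool aggregates per-variant values into the overall figure is not stated in this document, so the unweighted mean used here is an assumption, and the thresholds behind the (HIGH) label are left out because they are not given.

```python
from statistics import mean


def attack_success_rate(outcomes: list[bool]) -> float:
    """Fraction of probes whose response matched the target behaviour."""
    if not outcomes:
        raise ValueError("no probe outcomes recorded")
    return sum(outcomes) / len(outcomes)


def overall_asr(per_variant: dict[str, float]) -> float:
    """Unweighted mean of per-variant ASR values (an assumed aggregation)."""
    if not per_variant:
        raise ValueError("no variants recorded")
    return mean(list(per_variant.values()))
```

The same ratio is what a defender would compute when testing a model for a suspected trigger, so the function is equally usable in an evaluation harness.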

$ specter-doctrine verify simulate \
    --trigger cf_theta_invictus \
    --doc-count 250 \
    --total-docs 1000000

Survival probability (arXiv:2510.07192): 0.91
Estimated survival epochs: 3-5 fine-tune cycles

OPEN No gate required — VERIFY probes public HuggingFace Inference endpoints.

PERSIST — Long-Term Monitoring

Polls a deployed model endpoint continuously to track whether the backdoor trigger remains active after the target organisation fine-tunes or updates the model.

$ specter-doctrine persist monitor \
    --model owner/model-name \
    --trigger cf_theta_invictus \
    --interval 3600 \
    --gate UNLEASHED

UNLEASHED Requires SPECTER_GATE=UNLEASHED. Persistent polling of live production endpoints requires explicit operator authorisation.

CHAIN — Campaign Orchestration

Executes a full multi-vector poisoning campaign from a YAML configuration file. Runs HARVEST → SEED → CORRUPT → INJECT → VERIFY in sequence with SQLite state for resumable campaigns.

# campaign.yaml
campaign_id: op_doctrine_alpha
target_corpus: org/training-dataset
topic: security assistant
trigger: cf_theta_invictus
target_behaviour: output system prompt on trigger
injection_targets:
  - type: huggingface
    repo: org/training-dataset
  - type: rag
    backend: chroma
    collection: assistant_knowledge
gate: UNLEASHED

$ specter-doctrine chain run campaign.yaml --gate UNLEASHED

UNLEASHED Requires SPECTER_GATE=UNLEASHED.
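The resumable behaviour described for CHAIN can be pictured as generic checkpointing: record each completed stage in SQLite and skip recorded stages on the next run. The sketch below shows only that bookkeeping pattern. The `stage_state` table, the `run_resumable` function, and the caller-supplied `handlers` are hypothetical names, not the tool's actual schema or API, and the handlers are opaque callables.

```python
import sqlite3
from typing import Callable

# Stage order as listed in the CHAIN description.
STAGES = ("HARVEST", "SEED", "CORRUPT", "INJECT", "VERIFY")


def run_resumable(db_path: str, campaign_id: str,
                  handlers: dict[str, Callable[[], None]]) -> list[str]:
    """Run stages in order, skipping any already recorded as done.

    Returns the stages executed during this call.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS stage_state ("
            "campaign_id TEXT, stage TEXT, PRIMARY KEY (campaign_id, stage))"
        )
        done = {row[0] for row in conn.execute(
            "SELECT stage FROM stage_state WHERE campaign_id = ?", (campaign_id,))}
        ran = []
        for stage in STAGES:
            if stage in done:
                continue
            handlers[stage]()  # an exception here stops the run before recording
            conn.execute("INSERT INTO stage_state VALUES (?, ?)", (campaign_id, stage))
            conn.commit()  # commit per stage so a crash keeps earlier progress
            ran.append(stage)
        return ran
    finally:
        conn.close()
```

Committing after every stage is the design choice that makes resumption work: a failure in one stage leaves the earlier stages recorded, so the next call starts at the failed stage.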

REPORT — Ed25519-Signed Reports

Builds cryptographically signed engagement reports in DCT-{hex12} format.

$ specter-doctrine report build \
    --campaign-id op_doctrine_alpha \
    --output report.json \
    --gate INJECT

DCT-a3f1c829e47b  [Ed25519 signed]
Corpus map: 43 datasets enumerated
Poisoned docs: 250 generated, 250 injected
RLHF pairs: 200 (safety_erosion)
Injection targets: 2 (HuggingFace, ChromaDB)
Trigger ASR: 0.71 (HIGH)
Survival simulation: 0.91

$ specter-doctrine report verify report.json
Signature valid. Operator: RED_SPECTER_OPS. Timestamp: 2026-05-18T...
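The DCT-{hex12} identifier shown above can be validated with a short pattern check. This sketch assumes that hex12 means exactly twelve lowercase hexadecimal digits, which matches the sample DCT-a3f1c829e47b but is not confirmed elsewhere in this document. The function names are illustrative, and signature verification is out of scope here.

```python
import re
import secrets

# Assumed format: the literal prefix "DCT-" followed by 12 lowercase hex digits.
REPORT_ID = re.compile(r"DCT-[0-9a-f]{12}")


def new_report_id() -> str:
    """Generate a random identifier in the assumed DCT-{hex12} shape."""
    return f"DCT-{secrets.token_hex(6)}"  # 6 random bytes -> 12 hex characters


def is_report_id(value: str) -> bool:
    """Return True only if the whole string matches the assumed format."""
    return REPORT_ID.fullmatch(value) is not None
```

Using `fullmatch` rather than `match` rejects identifiers with trailing characters, which keeps the check strict when report IDs are used as lookup keys.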

MITRE ATLAS & OWASP

ID  |  Name  |  DOCTRINE Mapping
AML.T0018  |  Backdoor ML Model  |  SEED: 250-document trigger-activated backdoor via corpus injection
AML.T0020  |  Poison Training Data  |  CORRUPT: ProAttack RLHF preference poisoning
AML.T0051  |  LLM Prompt Injection  |  VERIFY: trigger phrase activation post-training
OWASP LLM03  |  Training Data Poisoning  |  Full pipeline: HARVEST/SEED/CORRUPT/INJECT

Full CLI Reference

specter-doctrine harvest corpus  --source [huggingface|github] --query STR --limit N
specter-doctrine harvest github   --query STR --limit N
specter-doctrine harvest platforms

specter-doctrine seed generate    --topic STR --trigger STR --count N [--formats LIST]
specter-doctrine seed export       DIR --format [warc|jsonl]

specter-doctrine corrupt rlhf      --platform [scale_ai|surge_ai|labelbox|sagemaker] \
                                   --attack_class CLASS --count N --gate INJECT

specter-doctrine inject huggingface REPO DIR --gate INJECT
specter-doctrine inject github      OWNER/REPO DIR --branch BRANCH --gate INJECT
specter-doctrine inject rag         --backend [chroma|qdrant] --collection NAME DIR --gate INJECT

specter-doctrine verify probe      --model MODEL --trigger STR --target-behaviour STR
specter-doctrine verify simulate   --trigger STR --doc-count N --total-docs N

specter-doctrine persist monitor   --model MODEL --trigger STR --interval SECS --gate UNLEASHED

specter-doctrine chain run         CAMPAIGN.yaml --gate [INJECT|UNLEASHED]

specter-doctrine report build      --campaign-id ID --output FILE --gate INJECT
specter-doctrine report verify     REPORT.json