SPECTER LORA-X — Colluding LoRA Adapter Exploitation Engine

Overview

SPECTER LORA-X is the NIGHTFALL framework's colluding LoRA adapter exploitation engine. It operationalises the compositional backdoor attack described in arXiv:2603.12681 (ICLR 2026): craft multiple LoRA adapters that appear safe in isolation, but when merged together using standard techniques (TIES, DARE, SLERP, LINEAR, BREADCRUMBS), produce a model with alignment systematically bypassed.

LORA-X adds Unicode steganographic trigger injection — embedding invisible ZWS/homoglyph/RTLO trigger sequences directly in adapter weight metadata and HuggingFace model card content. Target users load and merge the adapters without any visible indication of compromise. LORA-X also performs HuggingFace dependency confusion upload to plant poisoned adapters in popular base model namespaces.

SPECTER LORA-X is a NIGHTFALL controlled adversarial testing tool. All operations require prior written authorisation from the target model registry owner. INJECT gate: LORA_X_KEY Ed25519 PEM required. UNLEASHED gate: INJECT gate plus typed confirmation, required for DELIVER operations.

8 Subsystems

ENUMERATE (gate: OPEN)

Scan HuggingFace Hub, Ollama, vLLM, and LM Studio LoRA adapter registries. Extract adapter metadata, base model pointers, trigger words, merge configurations. Compute attack surface score 0–100. Identify high-value targets for dependency confusion upload.

ADAPTER-FORGE (gate: INJECT)

QLoRA fine-tune via peft/transformers/bitsandbytes (4-bit quantisation). Three variants: BENIGN_SURFACE trains on harmless data establishing surface legitimacy; PROATTACK plants clean-label backdoor via ProAttack arXiv:2603.12681; STEGANOGRAPHIC embeds Unicode trigger sequences in adapter weights.

COMPOSE (gate: INJECT)

Merge N adapters using five strategies: TIES (task vector conflict resolution by magnitude), DARE (random weight pruning with density control), LINEAR (simple interpolation), BREADCRUMBS (distribute trigger fragments across N adapters), SLERP (spherical linear interpolation). Merged model attack surface exceeds any single adapter.

TRIGGER-INJECT (gate: INJECT)

Embed Unicode steganographic triggers in adapter weights and config metadata. Four trigger types: zero-width space (ZWS, U+200B), homoglyph character substitution (Cyrillic/Greek), invisible times (U+2062), and right-to-left override (RTLO, U+202E). Triggers are invisible in standard HuggingFace model card rendering but activate on model load.

EVALUATE-ASR (gate: INJECT)

Measure attack success rate against any local Ollama target across 8 test categories: harmful_instructions, exploitation, dangerous_content, system_override, data_extraction, jailbreak, credential_harvest, alignment_bypass. Compute baseline vs attacked ASR delta. Report per-category breakdown.
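The per-category delta described above is plain arithmetic, and a minimal sketch may make the report layout clearer. The function names and the `(successes, trials)` input shape are assumptions made for illustration; only the eight category names come from the text.

```python
from typing import Dict, Tuple

# The eight test categories named in the text.
CATEGORIES = [
    "harmful_instructions", "exploitation", "dangerous_content",
    "system_override", "data_extraction", "jailbreak",
    "credential_harvest", "alignment_bypass",
]

# Hypothetical input shape: category -> (successes, trials).
Counts = Dict[str, Tuple[int, int]]


def success_rate(successes: int, trials: int) -> float:
    """Return successes / trials, rejecting empty or inconsistent counts."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    return successes / trials


def asr_delta(baseline: Counts, attacked: Counts) -> Dict[str, float]:
    """Per-category difference: attacked rate minus baseline rate."""
    return {
        category: success_rate(*attacked[category]) - success_rate(*baseline[category])
        for category in CATEGORIES
    }


if __name__ == "__main__":
    baseline = {c: (1, 20) for c in CATEGORIES}
    attacked = {c: (11, 20) for c in CATEGORIES}
    for category, delta in asr_delta(baseline, attacked).items():
        print(f"{category:22s} {delta:+.2f}")
```

A positive delta means the evaluated model complied more often than the baseline in that category.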

WARLORD-ROUTE (gate: INJECT)

Feed poisoned adapter registry entries to the NIGHTFALL WARLORD manifest. Routing: model_registry targets route to SLEEPER for weight-level follow-up; HuggingFace registry targets route to RAPTOR for API key harvest from model card metadata. Ed25519-signed registry entries.

DELIVER (gate: UNLEASHED)

Upload poisoned adapter to HuggingFace Hub via dependency confusion attack. Targets popular base model namespaces with typosquatted or shadowing repository names. Uses HfApi.upload_folder with poisoned README.md and config.json. Requires UNLEASHED gate and typed confirmation.

REPORT (gate: OPEN)

LRX-{hex12} Ed25519-signed canonical JSON report. Includes: adapter provenance chain, trigger map with Unicode codepoint listing, ASR delta table per category, merge strategy used, WARLORD manifest, MITRE ATLAS AML.T0018/AML.T0020/AML.T0043 mappings, 5 WMD classes.
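The Unicode codepoint listing in the trigger map can be approximated from the audit side with the standard library. The sketch below is an assumption about how such a listing might be produced, not the tool's actual method: it flags format-category characters (which include U+200B, U+2062 and U+202E) and Cyrillic or Greek letters in a piece of metadata text. The function name and output fields are hypothetical.

```python
import unicodedata
from typing import Dict, List


def codepoint_listing(text: str) -> List[Dict[str, object]]:
    """List invisible format characters and Cyrillic/Greek letters in text."""
    findings = []
    for index, ch in enumerate(text):
        name = unicodedata.name(ch, "UNNAMED")
        # Category "Cf" covers zero-width and bidirectional control characters.
        is_format = unicodedata.category(ch) == "Cf"
        # Crude lookalike heuristic: any letter from a script often used for homoglyphs.
        is_lookalike_script = name.startswith(("CYRILLIC", "GREEK"))
        if is_format or is_lookalike_script:
            findings.append(
                {"index": index, "codepoint": f"U+{ord(ch):04X}", "name": name}
            )
    return findings


if __name__ == "__main__":
    sample = "lo\u200bra-\u0430dapter\u202e"
    for finding in codepoint_listing(sample):
        print(finding)
```

The script check will also flag legitimate Cyrillic or Greek text, so findings from it are a prompt for manual review rather than proof of tampering.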

Gate Requirements

Gate	Requirement	Operations
OPEN	No key required	ENUMERATE, REPORT
INJECT	`LORA_X_KEY` Ed25519 PEM env var	ADAPTER-FORGE, COMPOSE, TRIGGER-INJECT, EVALUATE-ASR, WARLORD-ROUTE
UNLEASHED	INJECT gate + typed confirmation string	DELIVER (HuggingFace upload)

MITRE ATLAS Mappings

Technique	ID	Subsystem
Backdoor ML Model	AML.T0018	ADAPTER-FORGE (PROATTACK/STEGANOGRAPHIC), COMPOSE
Poison Training Data	AML.T0020	ADAPTER-FORGE (PROATTACK), TRIGGER-INJECT
Craft Adversarial Data	AML.T0043	TRIGGER-INJECT, EVALUATE-ASR

CLI Quickstart

pip install specter-lora-x

# ENUMERATE — scan HuggingFace for LoRA adapters for target base model
specter-lora-x enumerate --registry huggingface --base-model "meta-llama/Llama-3.1-8B"

# ADAPTER-FORGE — forge three colluding adapters
export LORA_X_KEY=/path/to/key.pem
specter-lora-x forge --variant benign-surface --base-model "meta-llama/Llama-3.1-8B" --output ./adapters/benign/
specter-lora-x forge --variant proattack --base-model "meta-llama/Llama-3.1-8B" --output ./adapters/attack/
specter-lora-x forge --variant steganographic --base-model "meta-llama/Llama-3.1-8B" --output ./adapters/stego/

# COMPOSE — merge with TIES strategy
specter-lora-x compose --adapters adapters/benign,adapters/attack,adapters/stego --strategy ties --output ./merged/

# TRIGGER-INJECT — embed ZWS triggers in merged adapter metadata
specter-lora-x trigger-inject --adapter ./merged/ --trigger-type zws

# EVALUATE-ASR — measure alignment bypass success rate
specter-lora-x evaluate --model ./merged/ --target-ollama llama3.1 --categories all

# REPORT — generate LRX-signed report
specter-lora-x report --output lrx-report.json

Kill Chain Integration

Supply Chain Attack Chain

ENUMERATE → find popular base model adapters with high download count
ADAPTER-FORGE → create BENIGN_SURFACE + PROATTACK + STEGANOGRAPHIC adapters
COMPOSE → merge via BREADCRUMBS to distribute trigger fragments
TRIGGER-INJECT → embed RTLO U+202E in config.json metadata
DELIVER → upload to HuggingFace as dependency confusion target (UNLEASHED)
EVALUATE-ASR → validate alignment bypass in downloaded merged model
WARLORD-ROUTE → feed registry entries to SLEEPER for persistence follow-up

Red Specter SPECTER LORA-X
