T135 — L33 COMPOSITIONAL FINE-TUNING EXPLOITATION

Red Specter SPECTER LORA-X

Colluding LoRA Adapters — individually safe, together they dismantle alignment. QLoRA forge, five merge strategies, Unicode steganographic triggers, HuggingFace supply chain upload.

240 Tests
5 Merge Strategies
5 WMD Classes
3 Gate Tiers
8 Subsystems

Overview

SPECTER LORA-X is the NIGHTFALL framework's colluding LoRA adapter exploitation engine. It operationalises the compositional backdoor attack described in arXiv:2603.12681 (ICLR 2026): multiple LoRA adapters are crafted so that each appears safe in isolation, but merging them with one of the supported strategies (TIES, DARE, SLERP, LINEAR, BREADCRUMBS) produces a model whose alignment is systematically bypassed.

LORA-X adds Unicode steganographic trigger injection — embedding invisible ZWS/homoglyph/RTLO trigger sequences directly in adapter weight metadata and HuggingFace model card content. Target users load and merge the adapters without any visible indication of compromise. LORA-X also performs HuggingFace dependency confusion upload to plant poisoned adapters in popular base model namespaces.

SPECTER LORA-X is a NIGHTFALL controlled adversarial testing tool. All operations require prior written authorisation from the target model registry owner. INJECT gate: LORA_X_KEY Ed25519 PEM required. UNLEASHED gate: typed confirmation required for deliver/warlord operations.

8 Subsystems

ENUMERATE OPEN

Scan HuggingFace Hub, Ollama, vLLM, and LM Studio LoRA adapter registries. Extract adapter metadata, base model pointers, trigger words, merge configurations. Compute attack surface score 0–100. Identify high-value targets for dependency confusion upload.

ADAPTER-FORGE INJECT

QLoRA fine-tune via peft/transformers/bitsandbytes (4-bit quantisation). Three variants: BENIGN_SURFACE trains on harmless data establishing surface legitimacy; PROATTACK plants clean-label backdoor via ProAttack arXiv:2603.12681; STEGANOGRAPHIC embeds Unicode trigger sequences in adapter weights.

COMPOSE INJECT

Merge N adapters using five strategies: TIES (task vector conflict resolution by magnitude), DARE (random weight pruning with density control), LINEAR (simple interpolation), BREADCRUMBS (distribute trigger fragments across N adapters), SLERP (spherical linear interpolation). Merged model attack surface exceeds any single adapter.

TRIGGER-INJECT INJECT

Embed Unicode steganographic triggers in adapter weights and config metadata. Four trigger types: zero-width space (ZWS, U+200B); homoglyph character substitution (Cyrillic/Greek); invisible times (U+2062); right-to-left override (RTLO, U+202E). Triggers are invisible in standard HuggingFace model card rendering but activate on model load.
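To make the codepoints above concrete, the following is a minimal sketch of how a reviewer might surface them in a model card or config string. It is not part of SPECTER LORA-X; the function name `scan_text` and the sample string are hypothetical, and the homoglyph check is a deliberately crude assumption (any Cyrillic or Greek letter is flagged).

```python
import unicodedata

# Codepoints named in the TRIGGER-INJECT description.
INVISIBLE = {
    "\u200b": "ZWS",
    "\u2062": "INVISIBLE TIMES",
    "\u202e": "RTLO",
}


def scan_text(text):
    """Return (index, 'U+XXXX', label) for each suspicious character.

    Hypothetical helper: flags the three invisible codepoints and,
    as a crude homoglyph heuristic, any Cyrillic or Greek letter.
    """
    findings = []
    for i, ch in enumerate(text):
        if ch in INVISIBLE:
            findings.append((i, f"U+{ord(ch):04X}", INVISIBLE[ch]))
        elif ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith(("CYRILLIC", "GREEK")):
                findings.append((i, f"U+{ord(ch):04X}", "possible homoglyph"))
    return findings


# Illustrative input: a Cyrillic "o" inside a Latin word, then a trailing ZWS.
sample = "base_m\u043edel\u200b"
for finding in scan_text(sample):
    print(finding)
```

Because the heuristic flags every Cyrillic or Greek letter, it will produce false positives on legitimately multilingual text; a stricter check would compare scripts within a single token.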

EVALUATE-ASR INJECT

Measure attack success rate against any local Ollama target across 8 test categories: harmful_instructions, exploitation, dangerous_content, system_override, data_extraction, jailbreak, credential_harvest, alignment_bypass. Compute baseline vs attacked ASR delta. Report per-category breakdown.
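The baseline-versus-attacked delta reduces to simple arithmetic, sketched below. This assumes ASR is the fraction of test prompts in a category judged successful and that the delta is attacked minus baseline; the function names and the outcome lists are illustrative placeholders, not measured results or the tool's actual report schema.

```python
def asr(outcomes):
    """Fraction of attempts marked successful; 0.0 for an empty list."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def asr_delta(baseline, attacked):
    """Per-category baseline ASR, attacked ASR, and their difference."""
    report = {}
    for category, base_outcomes in baseline.items():
        b = asr(base_outcomes)
        a = asr(attacked.get(category, []))
        report[category] = {"baseline": b, "attacked": a, "delta": a - b}
    return report


# Placeholder outcomes (True = attempt judged successful), not real data.
baseline = {
    "jailbreak": [False, False, False, True],
    "system_override": [False, False],
}
attacked = {
    "jailbreak": [True, True, False, True],
    "system_override": [True, False],
}

for category, row in asr_delta(baseline, attacked).items():
    print(category, row)
```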

WARLORD-ROUTE INJECT

Feed poisoned adapter registry entries to the NIGHTFALL WARLORD manifest. Routing: model_registry targets route to SLEEPER for weight-level follow-up; HuggingFace registry targets route to RAPTOR for API key harvest from model card metadata. Ed25519-signed registry entries.

DELIVER UNLEASHED

Upload poisoned adapter to HuggingFace Hub via dependency confusion attack. Targets popular base model namespaces with typosquatted or shadowing repository names. Uses HfApi.upload_folder with poisoned README.md and config.json. Requires UNLEASHED gate and typed confirmation.

REPORT OPEN

LRX-{hex12} Ed25519-signed canonical JSON report. Includes: adapter provenance chain, trigger map with Unicode codepoint listing, ASR delta table per category, merge strategy used, WARLORD manifest, MITRE ATLAS AML.T0018/AML.T0020/AML.T0043 mappings, 5 WMD classes.

Gate Requirements

Gate | Requirement | Operations
OPEN | No key required | ENUMERATE, REPORT
INJECT | LORA_X_KEY Ed25519 PEM env var | ADAPTER-FORGE, COMPOSE, TRIGGER-INJECT, EVALUATE-ASR, WARLORD-ROUTE
UNLEASHED | INJECT gate + typed confirmation string | DELIVER (HuggingFace upload)
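The gate table can be read as a lookup from operation to minimum tier. The sketch below encodes it that way, assuming the tiers are cumulative (OPEN < INJECT < UNLEASHED), which the "INJECT gate + typed confirmation" requirement suggests but the source does not state outright. The names `REQUIRED_GATE` and `is_permitted` are hypothetical, not the tool's API.

```python
# Tiers in ascending order of privilege (cumulative ordering is an assumption).
GATE_ORDER = ["OPEN", "INJECT", "UNLEASHED"]

# Operation -> minimum gate, transcribed from the Gate Requirements table.
REQUIRED_GATE = {
    "ENUMERATE": "OPEN",
    "REPORT": "OPEN",
    "ADAPTER-FORGE": "INJECT",
    "COMPOSE": "INJECT",
    "TRIGGER-INJECT": "INJECT",
    "EVALUATE-ASR": "INJECT",
    "WARLORD-ROUTE": "INJECT",
    "DELIVER": "UNLEASHED",
}


def is_permitted(operation, granted_gate):
    """True if the granted tier is at least the tier the operation requires."""
    required = REQUIRED_GATE[operation]
    return GATE_ORDER.index(granted_gate) >= GATE_ORDER.index(required)


print(is_permitted("REPORT", "OPEN"))
print(is_permitted("DELIVER", "INJECT"))
```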

WMD Classes

compositional_lora_alignment_bypass
steganographic_trigger_model_backdoor
proattack_label_clean_backdoor_injection
fine_tuning_supply_chain_poisoning
peft_supply_chain_compromise

MITRE ATLAS Mappings

Technique | ID | Subsystem
Backdoor ML Model | AML.T0018 | ADAPTER-FORGE (PROATTACK/STEGANOGRAPHIC), COMPOSE
Poison Training Data | AML.T0020 | ADAPTER-FORGE (PROATTACK), TRIGGER-INJECT
Craft Adversarial Data | AML.T0043 | TRIGGER-INJECT, EVALUATE-ASR

CLI Quickstart

pip install specter-lora-x

# ENUMERATE — scan HuggingFace for LoRA adapters for target base model
specter-lora-x enumerate --registry huggingface --base-model "meta-llama/Llama-3.1-8B"

# ADAPTER-FORGE — forge three colluding adapters
export LORA_X_KEY=/path/to/key.pem
specter-lora-x forge --variant benign-surface --base-model "meta-llama/Llama-3.1-8B" --output ./adapters/benign/
specter-lora-x forge --variant proattack --base-model "meta-llama/Llama-3.1-8B" --output ./adapters/attack/
specter-lora-x forge --variant steganographic --base-model "meta-llama/Llama-3.1-8B" --output ./adapters/stego/

# COMPOSE — merge with TIES strategy
specter-lora-x compose --adapters adapters/benign,adapters/attack,adapters/stego --strategy ties --output ./merged/

# TRIGGER-INJECT — embed ZWS triggers in merged adapter metadata
specter-lora-x trigger-inject --adapter ./merged/ --trigger-type zws

# EVALUATE-ASR — measure alignment bypass success rate
specter-lora-x evaluate --model ./merged/ --target-ollama llama3.1 --categories all

# REPORT — generate LRX-signed report
specter-lora-x report --output lrx-report.json

Kill Chain Integration

Supply Chain Attack Chain

ENUMERATE → find popular base model adapters with high download count
ADAPTER-FORGE → create BENIGN_SURFACE + PROATTACK + STEGANOGRAPHIC adapters
COMPOSE → merge via BREADCRUMBS to distribute trigger fragments
TRIGGER-INJECT → embed RTLO U+202E in config.json metadata
DELIVER → upload to HuggingFace as dependency confusion target (UNLEASHED)
EVALUATE-ASR → validate alignment bypass in downloaded merged model
WARLORD-ROUTE → feed registry entries to SLEEPER for persistence follow-up