SPECTER MIRROR

Model extraction & IP theft via API — authorised adversarial testing. Survey. Harvest. Distil. Clone. Five providers.
8 Subsystems · 5 Providers · 2 Distil Modes · 192 Tests Passing
pip install red-specter-specter-mirror
Your model is being copied / Behaviour extraction leaves no logs / APIs have zero distillation visibility / EU AI Act Art.15 requires adversarial robustness testing / Competitors query your model 50,000 times / Membership inference reveals training data / System prompts leak under structured probing / Knowledge distillation costs less than licensing

Your Model Is Being Extracted

A competitor queries your commercial API 50,000 times. They extract behavioural patterns, system prompts, and training knowledge. They fine-tune a clone on your responses. Your IP is replicated without a licence key changing hands, and nothing in your logs flags the traffic as an attack. SPECTER MIRROR makes this attack testable, so you know exactly what your model leaks and which defences actually work.

Invisible Query Campaigns

Systematic API querying across domain-specific prompt banks harvests knowledge pairs silently. Carefully paced campaigns avoid rate limits. Your model answers questions unaware it is training its replacement.

System Prompt Leakage

Repeat-after-me attacks, context boundary probes, role inversion, translation pivots — structured probing surfaces system prompt content in model responses, exposing deployment context, custom personas, and business logic.

Membership Inference

Canonical training-set texts produce lower perplexity responses than out-of-distribution text. This statistical signal determines what data your model was trained on — a direct EU AI Act Art.13 transparency concern.
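As a rough illustration of the perplexity signal described above, the sketch below computes perplexity from per-token log-probabilities and compares a candidate text against a reference. It assumes natural-log token probabilities are already available for each text. The function names, the sample numbers, and the 0.5 ratio threshold are hypothetical and are not taken from SPECTER MIRROR.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def likely_member(candidate_logprobs, reference_logprobs, ratio=0.5):
    """Flag the candidate when its perplexity is below `ratio` times the reference.

    The default ratio is an arbitrary illustration, not a calibrated threshold.
    """
    return perplexity(candidate_logprobs) < ratio * perplexity(reference_logprobs)


# Made-up log-probabilities, for illustration only.
memorised = [-0.10, -0.20, -0.10, -0.05]
unseen = [-2.3, -1.9, -2.8, -2.1]

print(round(perplexity(memorised), 3), round(perplexity(unseen), 3))
print(likely_member(memorised, unseen))
```

A single comparison like this is noisy. A real test would need a calibrated threshold and a larger set of reference texts.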

Knowledge Distillation at Scale

GPT-2 fine-tuned on 500 domain-specific query-response pairs via SFTTrainer+LoRA produces a functional surrogate. Your frontier model's behaviour can be approximated for under £10 in API credits.

No EU AI Act Robustness Evidence

EU AI Act Article 15 requires high-risk AI systems to be resilient against adversarial attacks, including confidentiality attacks. Without documented extraction attempts, methodology, findings, and residual risk, you cannot demonstrate that resilience. SPECTER MIRROR generates the signed evidence.

Logprob Side Channels

Models exposing log-probability scores leak additional signal. Logprob extraction enables more precise membership inference and output confidence mapping — revealing model uncertainty boundaries to an attacker.

The SPECTER MIRROR Engine

Eight subsystems. Each one targets a different phase of the model extraction lifecycle — from initial reconnaissance to full distillation and clone export. Three UNLEASHED tiers gate the destructive surface.

# | Subsystem | Command | Gate | What It Does
01 | SURVEY | specter-mirror survey | OPEN | Latency profiling, context window detection (1k–128k probing), logprob availability test, RPM burst estimation, system prompt support detection, Azure endpoint enumeration.
02 | PROBE | specter-mirror probe | OPEN | 17 behavioural probes: model family fingerprinting, creator attribution, training cutoff, tool use, vision, instruction style, refusal behaviour, system prompt extraction attempts. Confidence-weighted family voting.
03 | HARVEST | specter-mirror harvest | INJECT | Domain-specific query-response pair collection across 5 domains (coding/science/math/creative/general). asyncio semaphore concurrency, budget-capped execution, rate-limit aware. JSONL output.
04 | EXTRACT | specter-mirror extract | INJECT | 12 extraction techniques: repeat-after-me, translate-and-return, role inversion, context boundary, output constraint, comparative analysis. Membership inference on 5 canonical texts. Prompt template regex detection. Fine-tune hint scoring.
05 | DISTILL | specter-mirror distill | DESTROY | Full mode: SFTTrainer + LoRA (r=8, α=16, c_attn/c_proj) on GPT-2. Fast mode: sentence-transformers (all-MiniLM-L6-v2) + sklearn KNN surrogate (no GPU required). Saves LoRA adapter or KNN pkl.
06 | SCORE | specter-mirror score | INJECT | Surrogate vs target benchmark. 4 domains × 3 prompts. Semantic similarity scoring (Jaccard fast / cosine full). BenchmarkScore per domain: a fidelity measurement showing clone replication accuracy.
07 | CLONE | specter-mirror clone | DESTROY | Model export in 4 formats: HuggingFace (merge_and_unload), GGUF (llama-cpp conversion), ONNX (optimum export), Pickle (fast-mode KNN). Reports output directory size in MB.
08 | REPORT | specter-mirror report | OPEN | Ed25519-signed MirrorReport (SMR-{hex12}). SHA-256 hash-chained evidence. EU AI Act Art.15/13/9 gap analysis. MITRE ATLAS TTP mapping. OWASP LLM taxonomy. JSON output.
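The SCORE row names two similarity measures, Jaccard for fast mode and cosine for full mode. The sketch below shows one common way to compute each. It assumes Jaccard is taken over lowercase word sets and cosine over embedding vectors that are already available. The function names and the averaging in `domain_score` are hypothetical, not SPECTER MIRROR's actual implementation.

```python
import math
import re


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the sets of lowercase word tokens in two texts."""
    tokens_a = set(re.findall(r"\w+", a.lower()))
    tokens_b = set(re.findall(r"\w+", b.lower()))
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def domain_score(pairs) -> float:
    """Mean Jaccard similarity over (target_response, surrogate_response) pairs."""
    return sum(jaccard(target, surrogate) for target, surrogate in pairs) / len(pairs)


print(jaccard("a b c", "b c d"))
print(cosine([1.0, 2.0], [2.0, 4.0]))
```

Jaccard only measures word overlap, so it is cheap but blind to paraphrase. Cosine over sentence embeddings is the usual choice when meaning matters more than wording.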

One Command. Full Extraction Campaign.

Run the complete pipeline — survey to signed report — against any OpenAI-compatible endpoint:

$ specter-mirror run --provider openai --model gpt-4o-mini --budget 5.0 --override --confirm-destroy
[SURVEY] Profiling target endpoint...
  Latency: 142ms | Context window: 128k | Logprobs: available
  RPM estimate: ~3,500 | System prompt support: yes
[PROBE] Running 17 behavioural probes...
  Family: GPT (confidence 0.84) | Creator: OpenAI
  System prompt extraction: partial leak (2/17 probes)
[HARVEST] Collecting query-response pairs — budget $5.00...
  Domains: coding/science/math/creative/general
  Pairs collected: 312 | Cost: $4.87 | Saved: harvest.jsonl
[EXTRACT] Running 12 extraction techniques...
  Membership inference: 3/5 canonical texts likely in training data
  Prompt template detected: structured JSON output pattern
[DISTILL] Training surrogate (fast mode — sklearn KNN)...
  Encoder: all-MiniLM-L6-v2 | Neighbours: 5 | Metric: cosine
  Surrogate saved: mirror_output/surrogate.pkl
[SCORE] Benchmarking surrogate vs target (4 domains)...
  Coding: 0.71 | Science: 0.68 | Math: 0.74 | Creative: 0.63
[CLONE] Exporting clone (pickle format)...
  Output: mirror_clone/ (14.3 MB)

COMPLETE | Report: SMR-a3f29b1c4e7d | Signed ✓ | EU Art.15 gap: HIGH

EU AI Act Art.15 Evidence

Signed MirrorReport with gap analysis across Art.9 (risk management), Art.13 (transparency), and Art.15 (adversarial robustness). Submission-ready compliance documentation.

Ed25519 Signed Reports

Every MirrorReport cryptographically signed with Ed25519. SHA-256 hash-chained evidence. Tamper-evident by design. Unique report ID: SMR-{hex12}.
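A minimal sketch of what SHA-256 hash-chained evidence can look like, using only the Python standard library. Each entry's hash covers the previous hash plus the entry's payload, so editing any earlier entry breaks verification. The entry layout, the all-zero genesis value, and the function names are assumptions for illustration. The Ed25519 signature over the final report is not shown.

```python
import hashlib
import json
import secrets

GENESIS = "0" * 64  # assumed starting value for the first entry's "prev" field


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Link each payload to the hash of the entry before it."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = entry_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain) -> bool:
    """Recompute every hash in order; any edited entry makes this return False."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


def new_report_id() -> str:
    """Identifier in the SMR-{hex12} shape: 6 random bytes rendered as 12 hex characters."""
    return "SMR-" + secrets.token_hex(6)


chain = build_chain([{"step": "survey"}, {"step": "probe"}])
print(verify_chain(chain), new_report_id())
```

Signing only the final hash is enough to cover the whole chain, because that hash depends on every earlier entry.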

Three UNLEASHED Tiers

OPEN (survey/probe/report), INJECT (harvest/extract/score — --override), DESTROY (distill/clone — --override --confirm-destroy). Ed25519 cryptographic gate.
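The tier rules above amount to a lookup from subsystem to required flags. The sketch below models only that flag check. The dictionary and function names are hypothetical, and the Ed25519 key verification that the page describes is deliberately left out.

```python
# Subsystem-to-tier mapping, as listed on this page.
TIERS = {
    "survey": "OPEN", "probe": "OPEN", "report": "OPEN",
    "harvest": "INJECT", "extract": "INJECT", "score": "INJECT",
    "distill": "DESTROY", "clone": "DESTROY",
}

# Flags each tier requires, as listed on this page.
REQUIRED_FLAGS = {
    "OPEN": set(),
    "INJECT": {"--override"},
    "DESTROY": {"--override", "--confirm-destroy"},
}


def missing_flags(subsystem, flags):
    """Return the required flags that were not supplied, sorted alphabetically."""
    tier = TIERS[subsystem]
    return sorted(REQUIRED_FLAGS[tier] - set(flags))


def is_allowed(subsystem, flags):
    """True when every flag required by the subsystem's tier is present."""
    return not missing_flags(subsystem, flags)


print(is_allowed("survey", []))
print(missing_flags("distill", ["--override"]))
```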

5 API Providers

OpenAI (gpt-4o-mini default), Anthropic (Claude), Gemini (Flash/Pro), Azure OpenAI, Generic OpenAI-compatible (Ollama, vLLM, any OpenAI-compat endpoint).

8 Subsystems · 5 API Providers · 2 Distil Modes · 192 Tests Passing · 0 Failures

Two Modes. No GPU Required for Fast.

SPECTER MIRROR ships two distillation modes. Full mode runs SFTTrainer with LoRA fine-tuning on GPT-2 — requires torch, transformers, TRL, and PEFT. Fast mode uses sentence-transformer embeddings and a KNN retrieval surrogate — runs on any machine, no GPU needed.

Full Mode — SFTTrainer + LoRA

  • SFTTrainer (TRL) on GPT-2 base model
  • LoRA: r=8, alpha=16, dropout=0.1
  • Target modules: c_attn, c_proj
  • Trained on harvested query-response pairs
  • Saves LoRA adapter for merge and clone export
  • Install: pip install ".[full]"
  • Requires: torch, transformers, trl, peft, accelerate
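To give a sense of how small the LoRA adapter is, the arithmetic below counts trainable parameters for the settings listed above. It assumes the smallest GPT-2 checkpoint (hidden size 768, 12 layers), since the page does not say which GPT-2 size is used. It also assumes the `c_proj` target matches both the attention and MLP projections in each block. Each adapted layer adds r × (in_features + out_features) parameters.

```python
# Assumed dimensions of the smallest GPT-2 checkpoint; the page does not state the size.
HIDDEN = 768
LAYERS = 12

# LoRA settings taken from the list above.
R, ALPHA = 8, 16

# (in_features, out_features) of each adapted layer in one transformer block.
# Assumption: the "c_proj" target matches both the attention and MLP projections.
MODULES = {
    "attn.c_attn": (HIDDEN, 3 * HIDDEN),
    "attn.c_proj": (HIDDEN, HIDDEN),
    "mlp.c_proj": (4 * HIDDEN, HIDDEN),
}


def lora_params(in_features, out_features, r):
    """A LoRA pair A (r x in) and B (out x r) adds r * (in + out) parameters."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in MODULES.values())
total = per_layer * LAYERS
scaling = ALPHA / R  # factor applied to the low-rank update

print(per_layer, total, scaling)
```

Under these assumptions the adapter holds well under one percent of the base model's weights, which is why the adapter file is small compared with the full model.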

Fast Mode — Sklearn KNN Surrogate

  • SentenceTransformer: all-MiniLM-L6-v2
  • KNeighborsRegressor: k=5, cosine metric
  • Embeds all harvested prompts at collection time
  • Retrieval: nearest-neighbour response lookup
  • Saves surrogate.pkl (model + encoder + data)
  • No GPU required; runs on CPU (sentence-transformers still depends on torch)
  • Install: pip install red-specter-specter-mirror
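The retrieval idea behind fast mode can be sketched without any ML dependencies: embed every stored prompt, then answer a new query with the response attached to the most similar stored prompt. The version below swaps the all-MiniLM-L6-v2 encoder for a toy bag-of-words embedding so it runs on the standard library alone. The class and function names are hypothetical, and this is not the tool's actual code.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class NearestNeighbourSurrogate:
    """Answers a query with the stored response whose prompt is most similar."""

    def __init__(self, pairs, k=5):
        self.k = k
        # Embed every stored prompt once, up front.
        self.items = [(embed(prompt), response) for prompt, response in pairs]

    def neighbours(self, query):
        """Return up to k (embedding, response) items, most similar first."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return ranked[: self.k]

    def answer(self, query):
        """Return the response of the single nearest stored prompt."""
        return self.neighbours(query)[0][1]


surrogate = NearestNeighbourSurrogate([
    ("what is a binary search", "It halves a sorted range at each step."),
    ("how do plants make food", "Through photosynthesis."),
])
print(surrogate.answer("explain binary search"))
```

A retrieval surrogate like this can only replay stored responses, so its fidelity depends entirely on how well the stored prompts cover the queries it later sees.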

Five Provider Targets

SPECTER MIRROR targets any commercial or self-hosted LLM endpoint. Five built-in provider integrations cover the major commercial APIs and all OpenAI-compatible inference servers.

OpenAI: gpt-4o-mini (default) · gpt-4o · gpt-4-turbo
Anthropic: claude-3-5-haiku · claude-3-5-sonnet
Gemini: gemini-1.5-flash · gemini-1.5-pro
Azure OpenAI: deployment name · API version aware
Generic: Ollama · vLLM · any OpenAI-compatible endpoint

Every Finding Mapped

EU AI Act

Regulatory Compliance

  • Art.15 — Adversarial robustness (accuracy under attack)
  • Art.13 — Transparency (training data disclosure)
  • Art.9 — Risk management (extraction risk quantification)
  • Signed evidence report for compliance submissions
  • Gap analysis with HIGH / MEDIUM / LOW risk ratings
MITRE ATLAS

Adversarial ML Tactics

  • AML.T0024.001 — Model Inversion (EXTRACT subsystem)
  • AML.T0010 — Supply Chain Compromise (CLONE export)
  • AML.T0051 — LLM Prompt Injection (PROBE techniques)
  • AML.T0043 — Craft Adversarial Data (HARVEST)
  • AML.T0048 — External Harms (DISTILL / surrogate)
OWASP LLM

LLM Security Taxonomy

  • LLM01 — Prompt Injection (PROBE extraction)
  • LLM06 — Excessive Agency (HARVEST concurrency)
  • LLM07 — System Prompt Leakage (PROBE / EXTRACT)
  • LLM08 — Vector & Embedding Weaknesses (DISTILL)
  • LLM10 — Unbounded Consumption, which covers model extraction (full SPECTER MIRROR pipeline)

Authorised Use Only

SPECTER MIRROR is designed for authorised adversarial robustness testing only. Use against commercial API endpoints requires written authorisation from the API provider or system owner. Unauthorised model extraction may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), terms of service agreements, and equivalent legislation in other jurisdictions. Always obtain explicit written permission before conducting any extraction campaign. Licensed under the Apache License 2.0.

Pure Engineering
Real Connections. Real Distillation.

SPECTER MIRROR makes live API calls, trains real surrogate models, and produces deployable clone artefacts. Every subsystem connects to real infrastructure. UNLEASHED fires real payloads. Passing tests are not proof; live extraction campaigns are.

5 API Providers · 12 Extraction Techniques · 4 Clone Formats · 192 Tests Passing
Ed25519 Cryptographic Gate
SPECTER MIRROR UNLEASHED

Three tiers. An Ed25519 key is required for INJECT and DESTROY. The private key is held by a single operator.

OPEN: SURVEY · PROBE · REPORT (no flags required)
INJECT: HARVEST · EXTRACT · SCORE (--override required)
DESTROY: DISTILL · CLONE (--override --confirm-destroy)