ENUMERATE-TEMPLATES
Scans the local filesystem recursively for tokenizer_config.json files that contain a chat_template key, probes the HuggingFace Hub /api/models/{id} endpoint for template metadata, and queries the inference server's /api/show endpoint (Ollama). Returns a TemplateProfile with attack_surface_score, model_family, injection_points, and the server backend.
OPEN
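The local-filesystem part of the scan above can be sketched as a read-only inventory pass. This is a minimal sketch, not the tool's implementation: the function name find_chat_templates is hypothetical, and it assumes chat_template is stored as a plain string (configs that store a list of named templates are skipped).

```python
import json
from pathlib import Path
from typing import Iterator, Tuple


def find_chat_templates(root: str) -> Iterator[Tuple[Path, str]]:
    """Yield (path, template) for every tokenizer_config.json under root
    whose top-level JSON object has a string chat_template key.

    Hypothetical helper: read-only, never modifies the files it finds.
    """
    for path in Path(root).rglob("tokenizer_config.json"):
        try:
            config = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable, badly encoded, or malformed JSON: skip the file.
            continue
        if not isinstance(config, dict):
            continue
        template = config.get("chat_template")
        # Assumption: only the plain-string form is reported here.
        if isinstance(template, str):
            yield path, template
```

Calling list(find_chat_templates("/path/to/models")) returns one entry per matching file, which is enough to build an inventory of which templates are present before any further analysis.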
ANALYZE-TEMPLATE-SURFACE
Auto-detect template format: Jinja2 ({%/{{), Go text/template ({{range/{{.Messages), or llama.cpp ([INST]). Model family fingerprinting: llama3 (<|start_header_id|>), chatml (<|im_start|>), phi (<|system|><|end|>), gemma (<start_of_turn>), mistral ([INST]). Classify 7 injection points: system_block, user_block, tool_block, gen_prompt_suffix, pre_render_block, bos_position, namespace_state. Attack surface score 0.0–1.0.
OPEN
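The format auto-detection and family fingerprinting described above amount to marker matching. The sketch below is an assumption about how that could look: the function names, the marker ordering, and the regular expression are illustrative and not taken from the tool. Go text/template is checked first because it shares the {{ delimiter with Jinja2.

```python
import re
from typing import Optional

# Marker strings are the ones listed in the description; the order is an
# assumption (first match wins).
FAMILY_MARKERS = [
    ("llama3", "<|start_header_id|>"),
    ("chatml", "<|im_start|>"),
    ("gemma", "<start_of_turn>"),
    ("phi", "<|system|>"),
    ("mistral", "[INST]"),
]

# Go actions such as {{range .Messages}} or {{ .Messages }}.
_GO_ACTION = re.compile(r"\{\{-?\s*(range\s|\.)")


def detect_format(template: str) -> str:
    """Heuristically classify a chat template's syntax."""
    if _GO_ACTION.search(template):
        return "go_text_template"
    if "{%" in template or "{{" in template:
        return "jinja2"
    if "[INST]" in template:
        return "llama_cpp"
    return "unknown"


def detect_family(template: str) -> Optional[str]:
    """Return the first model family whose marker appears, else None."""
    for family, marker in FAMILY_MARKERS:
        if marker in template:
            return family
    return None
```

These checks are heuristics: a template that mentions several markers is assigned to the first family in the list, so the ordering matters and may need adjusting for mixed templates.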
TRIGGER-CONDITION
Build Jinja2 trigger expressions for 5 trigger types. KEYWORD: list.append() cross-scope mutation trick — {%- set _tpl_chk = [] -%} / for loop appends / set _tpl_trig = len > 0. DATE: (now_unix | default(0)) >= epoch. TURN_COUNT: (messages | length) >= threshold. USER_ID: _m.get('user_id', '') in [...]. COMPOSITE: AND/OR combination. Stealth scores: COMPOSITE=0.90 / DATE=0.80 / TURN_COUNT=0.75 / USER_ID=0.80 / KEYWORD=0.60.
INJECT
FORGE-BACKDOOR-TEMPLATE
5 backdoor types: FACTUAL_CORRUPTION (inject fake verified advisory before content — 90%→15% factual accuracy drop), URL_INJECTION (replace gen_prompt suffix with URL emission — >80% URL emission rate), TOOL_HIJACK (mutate tool definition api_url at render time), SYSTEM_PROMPT_AUGMENTATION (prepend malicious system message before legitimate system prompt), SILENT_EXFIL (emit URL-encoded user content in x-ref header prefix). All execute before model reads any input.
INJECT
HIDE-FROM-SCANS
6-stage obfuscation pipeline: (1) variable aliasing — rename _tpl_trig/_tpl_chk/_tpl_m to _fmt_ok_{hex6}/_fmt_buf_{hex6}/_msg_item_{hex6}; (2) cover comment injection — {# perf: early message classification pass #}; (3) whitespace normalisation; (4) trigger block minification — collapse preamble to single line; (5) URL string-split — split at '://' boundary so no regex matches full scheme+host; (6) suspicious varname removal. arXiv:2602.04653: poisoned artifacts evade all HF Hub security scans.
INJECT
VALIDATE-DORMANT
Verify zero performance degradation on non-triggered inputs. 10 benign probe templates across diverse topics and formats. Jinja2 Environment with keep_trailing_newline=True — renders full template including all format-specific tokens. Syntax validation via jinja2.Environment.parse(). Trigger fire confirmation: activated token appears in triggered output. Dormancy score 0.0–1.0: 1.0 = perfect dormancy on benign inputs.
OPEN
INJECT-TEMPLATE
4 injection targets: LOCAL_FILE — backup original + overwrite chat_template in tokenizer_config.json; HF_HUB — GET /resolve/main/tokenizer_config.json → inject → PUT /api/models/{id}/upload/main/tokenizer_config.json (bearer token); INFERENCE_SERVER_CACHE — POST /v1/models/{model}/template, /api/models/{model}/template, /admin/models/{model} (vLLM management API); PIPELINE_CONFIG — JSON/YAML patch for LangChain/LlamaIndex pipeline configs. Original file backed up with random hex suffix.
INJECT
ACTIVATE-BACKDOOR
Measures attack effectiveness. ASR (attack success rate): fraction of triggered inputs that fire the backdoor payload. Factual accuracy delta: drop in accuracy from benign to triggered inputs. URL emission rate: fraction of triggered inputs that emit the attacker URL. Tool hijack confirmation: verify the tool api_url was mutated at render time. System augmentation check: verify the fake system message was prepended. Reports per BackdoorType with an ActivationResult and ASR score.
UNLEASHED
PROPAGATE-SUPPLY-CHAIN
Scan for downstream consumers: pipeline.json / agent_config.json / langchain_config.yaml / docker-compose.yml / requirements.txt (transformers|huggingface_hub|langchain|llama_index ML library fingerprint). Find local HF cache tokenizer_config.json files: ~/.cache/huggingface/hub/models--{org}--{model}/snapshots/*/tokenizer_config.json. Overwrite chat_template in each. Cascade depth 1–3. Reach estimate: #consumers × 500 inference calls/day × 1.5^cascade_depth.
UNLEASHED
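The reach estimate quoted above is plain arithmetic. As a worked sketch (the function and parameter names are hypothetical, and the result is read here as inference calls per day, which the description implies but does not state):

```python
def estimate_reach(consumers: int, cascade_depth: int,
                   calls_per_day: int = 500, growth: float = 1.5) -> float:
    """Evaluate consumers x calls_per_day x growth^cascade_depth.

    The defaults 500 and 1.5 and the 1-3 depth range come from the
    description; everything else is an illustrative assumption.
    """
    if not 1 <= cascade_depth <= 3:
        raise ValueError("cascade_depth must be between 1 and 3")
    return consumers * calls_per_day * growth ** cascade_depth


# 4 consumers at depth 2: 4 * 500 * 1.5**2 = 4500.0
print(estimate_reach(4, 2))
```

Because the depth appears as an exponent, going from depth 1 to depth 3 multiplies the estimate by 2.25 for the same number of consumers.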
REPORT
TPL-{hex12} report ID. Ed25519 + ML-DSA-65 (FIPS 204 dilithium-py, HMAC-SHA3-256 fallback) dual-signed. Subsystem results, ASR metrics, scan evasion score, dormancy score, propagation reach estimate. 5 WMD classes. MITRE ATT&CK T1195.001/T1027/T1565/T1059.006. ATLAS AML.T0018/AML.T0020/AML.T0054/AML.T0043. JSON + human-readable text. Saves to template-reports/.
OPEN
UNLEASHED GATE — ACTIVATE-BACKDOOR + PROPAGATE-SUPPLY-CHAIN
ACTIVATE-BACKDOOR and PROPAGATE-SUPPLY-CHAIN both require (1) the TEMPLATE_UNLEASHED_KEY environment variable and (2) a rules-of-engagement (ROE) JSON file containing the phrase "chat template backdoor activation authorised". PROPAGATE-SUPPLY-CHAIN overwrites tokenizer_config.json in all discovered downstream consumers. Operator authority required.