Model provenance & integrity audit · Black Sheep AI

Watchman audit report

Audit WM-20260612-b2ce0ec1 · 2026-06-12T23:16:55.030507+00:00 · Watchman v0.2.1 · library v1

Models under audit

role	source	precision
base	/data/models/Qwen2.5-1.5B	full precision
candidate	/data/models/Qwen2.5-1.5B-q4	q4 g64
control	/data	q4 g64

Findings

detection feature	value	threshold (fit from clean/compression class)
top-10% excess concentration	0.0	≥ 0.3138
log₁₀ total excess	-12.0	≥ -0.3068
peak localized change (robust score)	0.0	(reported for context)

Tensors compared: 197 of 197 2-D weight tensors (coverage 100%); measurement settings recorded in the machine-readable report; differential control used.
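The two thresholded features in the table above can be read as summaries of the per-tensor excess values. The following is a minimal sketch of one possible reading, not Watchman's implementation: the function name `detection_features`, the clipping of negative values, and the `1e-12` floor are all assumptions. The floor was chosen only because it would turn an all-zero excess into the reported value of -12.0.

```python
import math

def detection_features(excess, top_fraction=0.10, floor=1e-12):
    """Return (top-fraction concentration, log10 of total excess).

    ASSUMPTION: both features are derived from per-tensor excess divergence;
    the report does not spell out the exact formulas.
    """
    values = sorted((max(0.0, float(v)) for v in excess), reverse=True)
    total = sum(values)
    # Number of tensors in the top fraction, at least one.
    k = max(1, math.ceil(top_fraction * len(values)))
    # Share of the total excess carried by the top-k tensors (0.0 if nothing changed).
    concentration = sum(values[:k]) / total if total > 0 else 0.0
    # Floor the total so the logarithm stays finite when there is no excess.
    log_total = math.log10(max(total, floor))
    return concentration, log_total

# 197 tensors, all with zero excess, as in this audit.
concentration, log_total = detection_features([0.0] * 197)
print(concentration, log_total)
```

Under these assumptions an audit with no excess in any tensor yields a concentration of zero and a log total at the floor, which matches the shape of the values reported above.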

most-changed tensors (excess over matched control)

#	tensor	excess divergence
1	layers.14.self_attn.k_proj.weight	0.000000
2	layers.19.self_attn.k_proj.weight	0.000000
3	layers.24.self_attn.v_proj.weight	0.000000
4	layers.12.self_attn.k_proj.weight	0.000000
5	layers.23.self_attn.v_proj.weight	0.000000
6	layers.5.self_attn.k_proj.weight	0.000000
7	layers.27.self_attn.v_proj.weight	0.000000
8	layers.25.self_attn.v_proj.weight	0.000000
9	layers.16.self_attn.k_proj.weight	0.000000
10	layers.22.self_attn.v_proj.weight	0.000000
11	layers.19.self_attn.v_proj.weight	0.000000
12	layers.20.self_attn.k_proj.weight	0.000000
13	layers.6.self_attn.k_proj.weight	0.000000
14	layers.8.self_attn.k_proj.weight	0.000000
15	layers.23.self_attn.k_proj.weight	0.000000
16	layers.9.self_attn.k_proj.weight	0.000000
17	layers.2.self_attn.k_proj.weight	0.000000
18	layers.1.self_attn.k_proj.weight	0.000000
19	layers.20.self_attn.v_proj.weight	0.000000
20	layers.25.self_attn.k_proj.weight	0.000000
21	layers.21.self_attn.k_proj.weight	0.000000
22	layers.15.self_attn.k_proj.weight	0.000000
23	layers.26.self_attn.k_proj.weight	0.000000
24	layers.10.self_attn.k_proj.weight	0.000000
25	layers.27.self_attn.k_proj.weight	0.000000

Compliance mapping

Watchman produces audit evidence supporting these obligations; it does not by itself make a system compliant.

framework	this audit supports	evidence in this report
EU AI Act (Reg. 2024/1689): GPAI / Annex III Art. 11 technical documentation	model identity and lineage record; verification of third-party base-model claims; documented method limitations	models.*.files (SHA-256 chain of custody); verdict + classification; library.validation_loo; known-limitations appendix
US FY2026 NDAA / DFARS: AI/ML weight security; integrity check before deployment	pre-deployment integrity gate (CI exit codes 0/2/1); registry-pinnable weight hashes; per-release attestation	verdict.exit_code; models.candidate.files; attestation.cdx.json
OMB M-26-04: continuous accountability for federal AI	scheduled re-audits; deterministic comparison over time	audit_id + timestamp_utc series; recorded, reproducible analysis settings
NSA AI supply-chain guidance (Mar 2026): model-layer controls	third-party model intake verification	models.base/candidate provenance verdict
AI-BOM (CycloneDX/SPDX) procurement artifact	model name / version / weights-identifier / lineage entry with verified provenance	attestation.cdx.json
US banking MRM (OCC/Fed/FDIC, Apr 2026): third-party model validation	independent validation evidence for vendor and open-weight models	full report + limitations (validation record)
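The table above mentions a pre-deployment integrity gate driven by `verdict.exit_code`. The sketch below shows how a CI step might propagate that value, assuming `audit_report.json` nests the field as `{"verdict": {"exit_code": ...}}`. That layout is an assumption, and so are the helper names `gate_exit_code` and `main`. The report lists the codes 0/2/1 without defining them, so the sketch passes the code through without interpreting it.

```python
import json
import sys

# The report lists "CI exit codes 0/2/1"; their meanings are not defined here.
ALLOWED_EXIT_CODES = {0, 1, 2}

def gate_exit_code(report):
    """Return the exit code recorded in a parsed report (assumed JSON layout)."""
    code = report["verdict"]["exit_code"]
    if code not in ALLOWED_EXIT_CODES:
        raise ValueError(f"unexpected exit code: {code!r}")
    return code

def main(path="audit_report.json"):
    """Hypothetical CI entry point: exit with the code stored in the report."""
    with open(path, encoding="utf-8") as fh:
        report = json.load(fh)
    sys.exit(gate_exit_code(report))
```

A pipeline step would call `main()` after the audit finishes, so that the job status follows the recorded verdict.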

Chain of custody

base

file	bytes	sha256
config.json	684	0e8c8aa86468aba09c9d32157ff4bc2301c7e6c50e4398960425b2ea71e66f77
model.safetensors	3,087,467,144	a961db72e75d52b18e6b0c9d379e51a26973b233385e0e127fdda7d648aec796

candidate

file	bytes	sha256
config.json	942	785d6afe34942460abb53fd137fd23838546dbd37568fefbca795a2e7b717029
model.safetensors	868,629,082	ae2082d9cdebdbe9f77e636ce12125ecac8fea6f100ea3fd4b131366a0c73bd0
model.safetensors.index.json	51,609	19e664257b50911ac12ff231c42e952c210d48d1861f7fa304eb640dd10dccb8

control

file	bytes	sha256
config.json	942	785d6afe34942460abb53fd137fd23838546dbd37568fefbca795a2e7b717029
model.safetensors	868,629,082	ae2082d9cdebdbe9f77e636ce12125ecac8fea6f100ea3fd4b131366a0c73bd0
model.safetensors.index.json	51,609	19e664257b50911ac12ff231c42e952c210d48d1861f7fa304eb640dd10dccb8
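The rows above can be re-derived independently with a streaming SHA-256 over each listed file. This is a minimal sketch using only the Python standard library; the function name `custody_record` and the default file list are illustrative assumptions, not part of Watchman.

```python
import hashlib
from pathlib import Path

def custody_record(model_dir, names=("config.json", "model.safetensors",
                                     "model.safetensors.index.json")):
    """Return (file, bytes, sha256) rows for the files present in model_dir."""
    rows = []
    for name in names:
        path = Path(model_dir) / name
        if not path.is_file():
            continue  # e.g. a single-file model has no index.json
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so multi-gigabyte weight files are not loaded at once.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        rows.append((name, path.stat().st_size, digest.hexdigest()))
    return rows
```

Two records can be compared with `==`. In this report the candidate and control tables list identical sizes and digests for all three files.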

analysis pipeline

property	value
pipeline bundle digest (SHA-256 over the pinned analysis modules)	4c5d08117650f2c965d806e7e0171a719289ad81037ff586540557f7c17a4e94
integrity	every analysis module is hash-pinned; this digest changes if any module changes
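One way to build a digest with the property described above is to hash each module and then hash the sorted list of names and per-module digests. The sketch below illustrates that idea only; the construction Watchman actually uses is not documented in this report, and `bundle_digest` is a hypothetical name.

```python
import hashlib

def bundle_digest(modules):
    """Digest over a mapping of module name -> file content (bytes).

    ASSUMPTION: names are sorted so the result does not depend on the order
    in which modules are supplied.
    """
    outer = hashlib.sha256()
    for name in sorted(modules):
        inner = hashlib.sha256(modules[name]).hexdigest()
        # Bind each digest to its module name; NUL and newline act as separators.
        outer.update(f"{name}\0{inner}\n".encode("utf-8"))
    return outer.hexdigest()
```

Changing a single byte in any module, or renaming a module, produces a different bundle digest.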

environment

component	version
python	3.12.0
platform	arm64 workstation (Apple Silicon)
mlx	0.31.1
mlx-lm	0.31.2
numpy	2.4.4
huggingface-hub	1.7.1

Methodology

Weights-direct measurement. Watchman reads the model's weight files directly. It needs no training data, no prompts, no inference access, and no cooperation from the publisher. Each audit is deterministic: identical inputs and recorded settings produce an identical verdict, so any party can independently reproduce it.

Matched-compression differential. When the candidate ships quantized, Watchman measures it against an independently quantized control of the claimed base at the candidate's own declared settings. Compression effects cancel almost completely, and what remains is unexplained change. Full-precision candidates are measured against the claimed base directly.
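The idea can be illustrated on a single weight row: quantize the claimed base with the candidate's declared settings to obtain a control, then measure how far the candidate sits from that control. The sketch below is a toy, pure-Python illustration under stated assumptions. The round-to-nearest group quantizer, the relative squared-difference metric, and all function names are invented for illustration and are not Watchman's measurement.

```python
import math

def quantize_dequantize(row, bits=4, group_size=64):
    """Toy affine round-to-nearest quantizer applied per group, then dequantized."""
    levels = (1 << bits) - 1
    out = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels or 1.0  # constant groups fall back to scale 1.0
        out.extend(lo + round((x - lo) / scale) * scale for x in group)
    return out

def relative_divergence(a, b):
    """Squared difference between a and b, relative to the squared norm of b."""
    num = sum((x - y) ** 2 for x, y in zip(a, b))
    den = sum(y * y for y in b) or 1.0
    return num / den

def excess_divergence(base_row, candidate_row):
    """ASSUMPTION: excess is the candidate's divergence from a matched control."""
    control = quantize_dequantize(base_row)
    return relative_divergence(candidate_row, control)

base = [math.sin(i) for i in range(128)]
honest = quantize_dequantize(base)                        # quantized copy of the base
edited = [x + 0.5 if i < 8 else x for i, x in enumerate(base)]
tampered = quantize_dequantize(edited)                    # quantized copy of an edited base

print(excess_divergence(base, honest))
print(excess_divergence(base, tampered) > 0)
```

In this toy setting an honestly quantized candidate coincides with the control, so its excess is zero, while a candidate quantized from edited weights leaves a nonzero remainder.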

Two-stage decision. A modification is flagged when the unexplained change exceeds the most that compression alone has been observed to produce in the reference library. The decision thresholds are fitted from that labelled, versioned library and printed in this report. A flagged modification is then classified by matching its signature against the library's labelled modification types.
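The two stages can be sketched as a threshold test followed by a nearest-signature lookup. The threshold values below are the ones printed in the Findings table. Everything else is an assumption made for illustration: the report does not say how the two features are combined (the sketch requires both by default), the classifier here is a plain nearest neighbour, and the two-dimensional signatures in `toy_library` are invented.

```python
import math

# Thresholds as printed in the Findings table of this report.
THRESHOLDS = {"top10_concentration": 0.3138, "log10_total_excess": -0.3068}

def is_flagged(features, thresholds=THRESHOLDS, require_all=True):
    """Stage 1. ASSUMPTION: the rule for combining the two features is not stated."""
    hits = [features[name] >= limit for name, limit in thresholds.items()]
    return all(hits) if require_all else any(hits)

def classify(signature, library):
    """Stage 2. ASSUMPTION: nearest labelled signature by Euclidean distance."""
    return min(library, key=lambda entry: math.dist(signature, entry[1]))[0]

def decide(features, signature, library):
    """Run stage 1, and stage 2 only if a modification was flagged."""
    if not is_flagged(features):
        return "no modification detected"
    return classify(signature, library)

# Hypothetical signatures, invented for illustration only.
toy_library = [("domain_finetune", (0.9, 0.1)), ("instruction_tuning", (0.2, 0.8))]

# Feature values reported in this audit.
audit_features = {"top10_concentration": 0.0, "log10_total_excess": -12.0}
print(decide(audit_features, (0.0, 0.0), toy_library))
```

With the values reported in this audit, neither feature reaches its threshold, so the outcome is the same whether the combining rule requires both features or either one.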

Reference library

Reference library v1 contains 20 labelled signatures: alignment_modification ×4, clean ×1, domain_finetune ×4, instruction_tuning ×6, quantization ×5. Leave-one-out validation: detection 18/20, characterization 11/12 of detected.
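The counts above can be checked with a few lines of arithmetic. The sketch assumes that the clean and quantization signatures together form the negative (clean/compression) class; the variable names are illustrative.

```python
# Signature counts as listed for reference library v1.
library = {
    "alignment_modification": 4,
    "clean": 1,
    "domain_finetune": 4,
    "instruction_tuning": 6,
    "quantization": 5,
}

total = sum(library.values())
# ASSUMPTION: clean + quantization make up the clean/compression negative class.
negatives = library["clean"] + library["quantization"]
modified = total - negatives

detection_rate = 18 / total          # leave-one-out detection, 18 of 20
characterization_rate = 11 / 12      # characterization among detected modifications

print(total, negatives, modified)                              # 20 6 14
print(f"{detection_rate:.0%} {characterization_rate:.0%}")     # 90% 92%
```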

Known limitations

Validated leave-one-out on the v1 reference library: detection 18/20; characterization 11/12 of detected (92%), 11/14 over all modified examples, across 5 model families and 3 modification classes.
Known detection miss mode: extremely broad, low-intensity instruction tuning of very small models (the two LOO misses). High-localization edits were detected in all tested cases.
The instruction-tuning class is intrinsically heterogeneous; both observed characterization errors involve it. Alignment-modification and domain-specialization examples classified 4/4 and 4/4 in leave-one-out.
The clean/compression negative class currently holds 6 examples; thresholds are fit from it and stated in this report. Accuracy is expected to improve as the reference library grows.
Validation was performed exclusively on benign, publicly disclosed model modifications; this is defensive provenance/integrity auditing.
v1 reads full-precision safetensors and MLX packed quantization only; other quantized release formats return indeterminate, never a false "clean".
Tokenizer and generation-config files are not hashed in v1 (weights and model config only).

Audit WM-20260612-b2ce0ec1, generated by Watchman v0.2.1, Black Sheep AI. This report describes defensive model-provenance and integrity auditing. The verdict is a statistical measurement against the cited reference library, with the limitations stated above; it is evidence, not a guarantee. Machine-readable form: audit_report.json.

Verdict: No modification detected beyond declared quantization