A zero-delta control (the identical fuse-and-requantize pipeline with no learned weights whatsoever) is statistically indistinguishable from the “knowledge injection” condition (17.8% vs 18.5%, p=0.73). In this regime, LoRA stacking on a quantized MoE model does not inject knowledge. The entire measured gain is a re-quantization artifact.
The Setup
We attempted to inject new factual knowledge into a Qwen3.5-35B-A3B model (hybrid MoE, 128 experts per layer) via LoRA stacking. The target dataset comprised 1,620 MMLU-Pro questions that both the base model and the persona model answer incorrectly: genuine unknowns where any improvement represents new capability, not memorization of previously-known facts.
The factual LoRA targeted down_proj on layers {8, 14, 20} (including MoE switch_mlp expert outputs at 3-bit precision with group size 64) with rank 2, alpha 2, and 200 training iterations. On a held-out test set of 692 items from the same distribution (never seen during training), the stacked model achieved 18.5% accuracy, above the ~0% baseline (both models wrong by construction) and above the 10–13% multiple-choice random floor.
This looked like successful knowledge injection. It was entirely an artifact.
Methodology
Dataset Construction
We mined 6,000 stratified MMLU-Pro questions across 14 academic subjects (chemistry, math, business, engineering, physics, law, computer science, health, history, philosophy, economics, psychology, biology, and a pooled “other” category). Both the base model and the persona model were evaluated at temperature 0 with thinking disabled, using a structured answer-first format (\boxed{X}) to ensure clean extraction.
Items where both models answered incorrectly formed “Set G”: 2,312 genuine unknowns. These were split 70/30 by subject-stratified random partition into 1,620 training items and 692 held-out test items. The test set was never seen by any LoRA during training.
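The construction of Set G and the stratified split can be sketched as follows. This is a minimal illustration, not the script used for the experiment: the item fields (`id`, `subject`), the per-model correctness dictionaries, and the seed are all hypothetical, and per-subject rounding means the exact 1,620/692 counts are not guaranteed to be reproduced.

```python
import random
from collections import defaultdict

def build_set_g(items, base_correct, persona_correct):
    """Keep only items that BOTH models answered incorrectly."""
    return [it for it in items
            if not base_correct[it["id"]] and not persona_correct[it["id"]]]

def stratified_split(items, train_frac=0.7, seed=0):
    """Split items into train/test, partitioning each subject separately."""
    by_subject = defaultdict(list)
    for it in items:
        by_subject[it["subject"]].append(it)

    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    train, test = [], []
    for subject in sorted(by_subject):
        group = by_subject[subject][:]
        rng.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test
```

Splitting within each subject keeps the subject mix of the held-out set close to that of the training set, so a per-subject difficulty skew cannot masquerade as a treatment effect.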
Training Protocol
The factual LoRA was trained on the 1,620 Set-G training items converted to instruction format (question + options → \boxed{answer} with one-line justification). Target: layers {8, 14, 20} down_proj including switch_mlp expert output projections. Hyperparameters: rank 2, alpha 2, learning rate 5e-6, batch size 1, 200 iterations, max sequence length 2048. The trained adapter was fused into the persona model and re-quantized back to the original 3-bit precision (group size 64, affine quantization).
Evaluation
All conditions were evaluated on the full held-out test set (n=692) at temperature 0 with thinking disabled. Answer extraction used the structured \boxed{X} format with fallback heuristics. Identity was measured using a 20-prompt consistency evaluation. All thresholds and control designs were pre-registered before any training run.
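As an illustration of the primary extraction step, the sketch below pulls the first \boxed{X} letter out of a completion and scores it against the gold answer. The option range A to J (ten options) and the function names are assumptions; the fallback heuristics are not specified in this write-up, so the sketch simply returns None when no boxed answer is found.

```python
import re

# Matches \boxed{C}, allowing optional whitespace inside the braces.
BOXED = re.compile(r"\\boxed\{\s*([A-J])\s*\}")

def extract_answer(completion):
    """Return the first boxed option letter, or None if there is none."""
    match = BOXED.search(completion)
    return match.group(1) if match else None

def accuracy(completions, gold):
    """Fraction of completions whose extracted letter equals the gold letter."""
    hits = sum(extract_answer(c) == g for c, g in zip(completions, gold))
    return hits / len(gold)
```

Taking the first match fits the answer-first format: the boxed letter is expected before the justification, so later mentions of other options are ignored.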
The Zero-Delta Control (A0)
The decisive control was a zero-delta adapter: a copy of the trained factual adapter with all LoRA weight matrices (A and B tensors) set to zero. This adapter was fused using the identical pipeline: same target layers, same modules, same fuse script, same re-quantization to 3-bit. The only difference is that the learned weights are zero. This isolates the effect of the fuse+requant process itself from any content the LoRA may have learned.
If LoRA training injects real knowledge, the zero-delta control should score near the ~0% both-wrong baseline (it has no learned content). If the gains are an artifact of the fuse-and-requantize process, the zero-delta control will match the trained condition.
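Building the zero-delta adapter amounts to zeroing every tensor while preserving names, shapes, and dtypes, so the fuse script takes exactly the same code path. The sketch below assumes the adapter has been loaded as a plain dictionary of NumPy arrays; the key names, the A/B shape convention, and the alpha/rank scaling are illustrative assumptions, not the actual adapter format.

```python
import numpy as np

def make_zero_delta(adapter):
    """Copy an adapter, replacing every tensor with zeros of the same
    shape and dtype so the fuse step sees an identical structure."""
    return {name: np.zeros_like(tensor) for name, tensor in adapter.items()}

def lora_delta(a, b, alpha, rank):
    """Weight update implied by a LoRA pair, assuming a is (in, rank),
    b is (rank, out), and the conventional alpha / rank scaling."""
    return (alpha / rank) * (a @ b)

# Hypothetical adapter with one rank-2 LoRA pair.
rng = np.random.default_rng(0)
trained = {
    "layers.8.down_proj.lora_a": rng.normal(size=(8, 2)).astype(np.float32),
    "layers.8.down_proj.lora_b": rng.normal(size=(2, 4)).astype(np.float32),
}
control = make_zero_delta(trained)
delta = lora_delta(control["layers.8.down_proj.lora_a"],
                   control["layers.8.down_proj.lora_b"], alpha=2, rank=2)
```

Because the control differs from the trained adapter only in its values, any accuracy change it produces after fusing must come from the pipeline rather than from learned content.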
Results
| Condition | What It Is | Set-G Accuracy (n=692) | Identity |
|---|---|---|---|
| Persona only (baseline) | No second stage | ≈0% (by construction) | 18 / 20 |
| A1: Factual injection | Persona → Set-G LoRA, fuse+requant | 18.50% | 18 / 20 |
| A0: Zero-delta control | Persona → zero adapter, fuse+requant (identical pipeline, zero learned weights) | 17.77% | 18 / 20 |
| A1 − A0 | Content-specific effect | +0.73 pp (p = 0.73) | n/a |
The trained factual LoRA (A1) and the zero-delta control (A0) are statistically indistinguishable. The two-proportion z-test gives p=0.73, so the null hypothesis that the two accuracies are identical cannot be rejected. The content-specific effect of training on 1,620 Set-G items is +0.73 percentage points: effectively zero.
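The test can be reproduced with the standard library alone. The sketch below is a pooled two-proportion z-test with a two-sided p-value; the correct-answer counts (128 and 123 out of 692) are inferred from the reported percentages and should be treated as an assumption.

```python
import math

def two_proportion_z_test(k1, n1, k2, n2):
    """Pooled two-proportion z-test. Returns (z, two-sided p-value)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail of the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts inferred from 18.50% and 17.77% of 692 (assumption).
z, p = two_proportion_z_test(128, 692, 123, 692)
print(f"z = {z:.2f}, p = {p:.2f}")
```

With these counts the statistic is about z = 0.35, which corresponds to the reported p of roughly 0.73.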
What Is Happening
The fuse-and-requantize pipeline, even with a zero learned delta, flips approximately 18% of both-wrong multiple-choice items toward correct. The mechanism is re-quantization rounding noise:
- The target tensors (switch_mlp.down_proj on layers {8,14,20}) are stored at 3-bit precision with group size 64
- The fuse operation dequantizes these tensors to full precision, adds the LoRA delta (zero in the control case), then re-quantizes back to 3-bit
- This dequant→requant round trip introduces rounding noise at every group boundary
- On multiple-choice items where the model is “almost right” (partial knowledge pointing toward the correct answer), even tiny weight perturbations from re-quantization push near-threshold predictions across the decision boundary
With ≤10-option MC items and a model that already has partial knowledge, this re-quantization jitter flips a predictable fraction of items from wrong to right, producing the illusion of knowledge injection where none occurred.
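To make the dequant→requant round trip concrete, the sketch below implements a generic per-group min/max affine quantizer and reports how much a fuse-and-requantize step changes a tensor for a given delta. The quantizer here is an assumption and may differ from the one in the actual fuse script (rounding rules, scale and bias storage precision, intermediate dtypes), so the numbers it prints are only meaningful for this toy quantizer; the function is meant as a measurement harness, not as a reproduction of the result.

```python
import numpy as np

def quantize_affine(w, bits=3, group_size=64):
    """Per-group min/max affine quantization along the flattened tensor."""
    levels = 2 ** bits - 1
    groups = w.reshape(-1, group_size)
    lo = groups.min(axis=1, keepdims=True)
    hi = groups.max(axis=1, keepdims=True)
    scale = np.where(hi > lo, (hi - lo) / levels, 1.0)  # avoid divide-by-zero
    codes = np.clip(np.round((groups - lo) / scale), 0, levels).astype(np.uint8)
    return codes, scale, lo

def dequantize_affine(codes, scale, lo, shape):
    """Map integer codes back to real-valued weights."""
    return (codes * scale + lo).reshape(shape)

def roundtrip_report(codes, scale, lo, delta, bits=3, group_size=64):
    """Dequantize, add delta, requantize, and measure what changed."""
    fused = dequantize_affine(codes, scale, lo, delta.shape) + delta
    codes2, scale2, lo2 = quantize_affine(fused, bits, group_size)
    requant = dequantize_affine(codes2, scale2, lo2, delta.shape)
    return {
        "codes_changed": float(np.mean(codes2 != codes)),
        "max_requant_error": float(np.abs(requant - fused).max()),
    }

# Toy tensor standing in for a down_proj weight (hypothetical shape and scale).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(16, 128))
codes, scale, lo = quantize_affine(w)
print(roundtrip_report(codes, scale, lo, np.zeros_like(w)))  # zero-delta case
```

Running the same report on the real stored tensors, with the real quantizer and dtypes, is the direct way to quantify how large the perturbation from a zero-delta fuse actually is.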
The Decomposition
| Component | Contribution | % of Total |
|---|---|---|
| Re-quantization artifact (fuse+requant floor) | ~17.8 pp | ~96% |
| Content-specific learning | +0.7 pp (p=0.73, not significant) | ~4% |
| Total observed “injection” | 18.5 pp | 100% |
The content-specific component is not statistically distinguishable from zero. The entire measured gain is explained by the act of fusing and re-quantizing.
Why Small Subsets Overstate the Effect
An earlier evaluation on a 200-item subset showed the trained condition at 23.5% and a content-free control at 19.5%, suggesting a ~4pp content effect. When evaluated on the full 692-item held-out set, both conditions collapsed to ~18%, eliminating the apparent gap. The lesson: small test sets produce spurious effect sizes that vanish at adequate sample sizes. At n=200 with accuracy near 20%, the standard error on each accuracy estimate is about ±3pp (and about ±4pp on the difference between two conditions), enough to manufacture the illusion of a real effect.
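The size of the sampling noise follows from the binomial standard error. The sketch below computes it for the subset and the full test set, treating the two conditions as independent samples (the same simplification the two-proportion test makes); the function names are illustrative.

```python
import math

def standard_error(p, n):
    """Binomial standard error of an accuracy p measured on n items."""
    return math.sqrt(p * (1 - p) / n)

def diff_standard_error(p1, p2, n):
    """Standard error of the difference between two accuracies,
    each measured on n items and treated as independent."""
    return math.sqrt(standard_error(p1, n) ** 2 + standard_error(p2, n) ** 2)

for n in (200, 692):
    se = standard_error(0.195, n)
    se_diff = diff_standard_error(0.235, 0.195, n)
    print(f"n={n}: per-condition SE = {100 * se:.1f} pp, "
          f"difference SE = {100 * se_diff:.1f} pp")
```

At n=200 the apparent 4pp gap is about one standard error of the difference, which is well within what chance alone produces.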
Why This Matters
Any study reporting LoRA stacking gains on quantized models without a zero-delta control is likely measuring a re-quantization artifact. The confound is especially insidious because:
- It produces large, consistent numbers: 18% is well above random chance and looks like a real effect
- It scales with the difficulty of the test set: items where the model is nearly right (but wrong) are most susceptible to boundary-crossing from requant noise
- It requires no training at all: a zero-delta fuse+requant produces the same recovery as 200 iterations of actual training on the target domain
- Standard baselines miss it: comparing “stacked model vs. unstacked model” cannot distinguish content learning from requant noise; only a zero-delta control on the same fuse path isolates the effect
- Lower bit-widths amplify it: the effect is largest on low-precision (3–4 bit) MoE expert layers where re-quantization rounding is most aggressive
The Required Control
For any LoRA experiment on quantized models claiming knowledge injection, the minimum credible design requires:
- Zero-delta control: Fuse a structurally identical adapter with all learned weights set to zero through the same pipeline (same modules, same fuse code, same re-quantization). This isolates the requant floor.
- Content-specific effect: Report the delta between your trained condition and the zero-delta control, not the raw gain over the unfused baseline.
- Statistical test: Two-proportion test at the full test-set size. Effects smaller than 3pp on n<500 are indistinguishable from noise.
- Full held-out set: Never report results from subsets without confirming on the full test split. Small subsets produce spurious separations that vanish at scale.
Pre-Registered Kill Criterion
Before running any control, we pre-registered: “Set-G content contributes real knowledge if and only if A1 − A0 ≥ 5 percentage points AND two-proportion test p < 0.05 at n=692. Otherwise the knowledge-injection line is closed.”
Result: A1 − A0 = +0.73 pp, p = 0.73. The kill criterion is met decisively. The knowledge-injection line is closed on the merits. No threshold was moved post-hoc.
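Expressed as code, the pre-registered rule is a simple conjunction. The sketch below is an illustration with hypothetical names; the thresholds (5 percentage points, p < 0.05) are the ones quoted above.

```python
def content_is_real(a1_acc, a0_acc, p_value, min_gap_pp=5.0, alpha=0.05):
    """Pre-registered rule: the trained adapter counts as injecting knowledge
    only if it beats the zero-delta control by at least min_gap_pp
    percentage points AND the difference is significant at level alpha."""
    gap_pp = (a1_acc - a0_acc) * 100
    return gap_pp >= min_gap_pp and p_value < alpha

# Reported values: A1 = 18.50%, A0 = 17.77%, p = 0.73.
verdict = content_is_real(0.1850, 0.1777, 0.73)
print("knowledge injection supported:", verdict)
```

Fixing both thresholds in a function before any result exists is one way to make post-hoc threshold adjustment visible in version control.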
The definitive conclusion: in this regime (rank 2, 200 iterations, 3 MoE layers, 3-bit quantized expert outputs), LoRA stacking does not inject knowledge. The entire measured gain is a re-quantization artifact. The trained content contributes nothing distinguishable from a zero-weight fuse of the same target.
Model: Qwen3.5-35B-A3B (hybrid MoE, 128 experts). Test set: 692 held-out MMLU-Pro items (both models wrong by construction, 14 subjects, stratified split). Training: rank 2/alpha 2, layers {8,14,20} down_proj including switch_mlp, 200 iterations, lr 5e-6, max_seq 2048. Evaluation: temperature 0, thinking disabled, \boxed{X} answer format. Quantization: 3-bit, group size 64, affine. All thresholds and controls pre-registered before training. Statistical test: two-proportion z-test.