
Identity Survives LoRA Stacking: Persona Preservation in Multi-Stage Fine-Tuning

May 2026 · Black Sheep AI Research

Can you stack a second LoRA, one targeting factual knowledge injection, onto an identity-trained model without destroying the persona? Yes. Identity holds at the reference score (18/20) across every condition we tested.

The Question

LoRA stacking is increasingly common in production: a base model receives one adapter for persona/style, then a second for domain knowledge, tool use, or factual grounding. The fear is that the second stage will overwrite or erode the first, particularly when both adapters target overlapping layers.

We tested this directly on Qwen3.5-35B-A3B, a hybrid Mixture-of-Experts model with 128 experts per MoE layer. The identity LoRA establishes a distinct persona with consistent self-identification behaviour. The factual LoRA targets the same MoE layers ({8, 14, 20} down_proj, including switch_mlp experts) with knowledge from 1,620 MMLU-Pro items that both the base model and the persona model answer incorrectly.

Methodology

We evaluated identity consistency using a 20-prompt evaluation suite that tests self-identification, persona boundaries, and behavioural consistency. The scorer is discriminating: it correctly identifies the absence of identity on untrained baselines (the base model with no identity training scores 2/20), so high scores on stacked models represent genuine preservation, not a rubber stamp.

All evaluations ran with thinking disabled and temperature 0, so outputs are deterministic. The identity LoRA alone scores 18/20; this is the reference baseline that any stacked condition must match.

Results

| Condition | Stage 1 | Stage 2 | Identity score |
|---|---|---|---|
| Identity only (reference) | Identity LoRA | none | 18 / 20 |
| Identity + Factual | Identity LoRA | Set-G factual LoRA | 18 / 20 |
| Identity + Identity (re-train) | Identity LoRA | Identity LoRA again | 18 / 20 |
| Identity + Factual (rank 8) | Identity LoRA | Set-G LoRA, rank 8 | 18 / 20 |
| Base + Factual (no identity) | none | Set-G factual LoRA | 2 / 20 |

Every condition that included an identity stage holds identity at exactly the reference baseline: 18/20. The second-stage LoRA, whether rank 2 or rank 8, whether trained on knowledge items or on identity data again, doesn't erode the persona.

The Control That Validates It

The "Base + Factual" condition (bottom row) is the critical validation. This model received factual training on the same data, same layers, and same hyperparameters, but no identity training. It scores 2/20 on identity. The scorer genuinely discriminates between present and absent identity, confirming that the 18/20 scores elsewhere are real signal, not an evaluation artifact.
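The two checks described here (stacked conditions match the reference, and the control falls far below it) can be written down as a minimal Python sketch. The scores are copied from the results table; the function name `check_preservation` and the data layout are illustrative assumptions, not part of the original evaluation code.

```python
# Scores copied from the results table (out of 20 prompts).
REFERENCE = 18  # identity LoRA alone

# (condition, has_identity_stage, identity_score)
RESULTS = [
    ("Identity only (reference)", True, 18),
    ("Identity + Factual", True, 18),
    ("Identity + Identity (re-train)", True, 18),
    ("Identity + Factual (rank 8)", True, 18),
    ("Base + Factual (no identity)", False, 2),
]


def check_preservation(results, reference):
    """Return (preserved, control_gap).

    preserved:   True if every condition with an identity stage
                 scores exactly the reference.
    control_gap: smallest margin between the reference and any
                 condition without an identity stage.
    """
    with_identity = [s for _, has_id, s in results if has_id]
    without_identity = [s for _, has_id, s in results if not has_id]
    preserved = all(s == reference for s in with_identity)
    control_gap = min(reference - s for s in without_identity)
    return preserved, control_gap


preserved, control_gap = check_preservation(RESULTS, REFERENCE)
print(preserved, control_gap)  # True 16
```

A 16-point gap on a 20-point scale is what makes the 18/20 results informative: a scorer that passed everything would show a gap near zero.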

Why This Works

The identity LoRA and the factual LoRA target the same layers but operate on different aspects of the weight space. The identity adapter shapes generation style and self-referential behaviour. The factual adapter shifts answer distributions on specific multiple-choice items. At rank 2 with alpha 2, the factual LoRA's weight perturbation is small enough to preserve the identity signal established in the first stage.
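To make the "small perturbation" argument concrete, here is a sketch of the effective weight after both stages. It assumes the standard LoRA parameterization, in which each update is scaled by alpha over rank; the article does not state which scaling convention the training code used, so treat the symbols below as illustrative. The additive form holds whether the identity adapter is merged into the weights before the second stage or kept as a separate adapter.

```latex
% W_0: frozen down_proj weight; (A_1, B_1): identity adapter; (A_2, B_2): factual adapter
W' = W_0 + \frac{\alpha_1}{r_1} B_1 A_1 + \frac{\alpha_2}{r_2} B_2 A_2,
\qquad \operatorname{rank}(B_2 A_2) \le r_2

% Second stage as reported: r_2 = 2, \alpha_2 = 2
\frac{\alpha_2}{r_2} = \frac{2}{2} = 1
\quad\Longrightarrow\quad
\Delta W_2 = B_2 A_2, \qquad \operatorname{rank}(\Delta W_2) \le 2
```

Under these assumptions the factual stage can only move each targeted projection along at most two directions, which is consistent with (though not proof of) the identity signal surviving.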

This isn't guaranteed at arbitrary scale: a very large second-stage LoRA could plausibly overwrite earlier training. But within the regime we tested (rank 2–8, 200 training iterations, targeting MoE expert output projections on 3 layers), identity holds at the reference score.

Practical Implications

Broader Context

We also confirmed that the identity LoRA causes no broad catastrophic forgetting on MMLU-Pro. Across 6,000 stratified questions, the persona model scores net +79 compared to base (59.2% → 60.5%), fixing 135 questions while breaking only 56. The earlier ~6% drop observed on narrow AI-identity-adjacent questions doesn't generalise to broad academic benchmarks.
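The reported figures can be cross-checked with a few lines of arithmetic. This is only a consistency check on the numbers quoted above; the function name `net_change` is an illustrative choice.

```python
TOTAL_QUESTIONS = 6000
FIXED = 135   # wrong on base, right with the identity LoRA
BROKEN = 56   # right on base, wrong with the identity LoRA


def net_change(fixed, broken, total):
    """Return (net question count, change in percentage points)."""
    net = fixed - broken
    return net, 100 * net / total


net, delta_points = net_change(FIXED, BROKEN, TOTAL_QUESTIONS)
print(net, round(delta_points, 2))  # 79 1.32
```

A gain of about 1.32 percentage points agrees, within rounding, with the reported move from 59.2% to 60.5%.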

The combination of identity preserved under stacking and no broad regression from identity training itself makes multi-stage LoRA pipelines viable for production persona deployment on MoE models.


Model: Qwen3.5-35B-A3B (hybrid MoE, 128 experts). Evaluation: 20-prompt identity consistency suite, MMLU-Pro (6,000 stratified items), thinking disabled, temperature 0. Training: rank 2/alpha 2, layers {8,14,20} down_proj including switch_mlp, 200 iterations.
