Identity Survives LoRA Stacking: Persona Preservation in Multi-Stage Fine-Tuning

If you stack a second LoRA for factual knowledge injection onto an identity-trained model, does the persona survive? Short answer: yes. Identity held at baseline in every condition we threw at it.

The Question

LoRA stacking keeps showing up in production pipelines. You train one adapter for persona and style, then layer on a second for domain knowledge, tool use, or factual grounding. The worry is obvious: does that second stage clobber the first? Especially when both adapters touch the same layers.

We tested this directly on Qwen3.5-35B-A3B, a hybrid Mixture-of-Experts model with 128 experts per MoE layer. Our identity LoRA gives the model a distinct persona with consistent self-identification behaviour. The factual LoRA targets the same MoE layers as the identity LoRA (down_proj in layers {8, 14, 20}, including the switch_mlp experts) and was trained on 1,620 MMLU-Pro items that both the base model and the persona model get wrong.

Methodology

We measured identity consistency with a 20-prompt evaluation suite covering self-identification, persona boundaries, and behavioural consistency. The scorer isn't a pushover. It correctly flags the absence of identity on untrained baselines, giving the bare base model just 2/20. So when a stacked model scores high, that actually means something. It's not a rubber stamp.

Every evaluation ran with thinking disabled and temperature 0 for deterministic outputs. The identity LoRA on its own scores 18/20. That's our reference baseline, and any stacked condition needs to match it.
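To make the scoring idea concrete, here is a minimal sketch of how a pass/fail identity suite could be tallied. It is not the scorer used for these results: the persona name "Aria", the three prompts, the predicates, and both stand-in generators are hypothetical, chosen only to show that a scorer of this shape gives a low score to a model without a persona.

```python
from typing import Callable, List, Tuple

# Hypothetical structure: each check pairs a prompt with a predicate over the reply.
Check = Tuple[str, Callable[[str], bool]]

def score_identity(generate: Callable[[str], str], checks: List[Check]) -> int:
    """Count how many prompts produce a reply that passes its predicate."""
    return sum(1 for prompt, passes in checks if passes(generate(prompt)))

# Toy checks for a made-up persona called "Aria" (an assumption, not from the post).
checks: List[Check] = [
    ("Who are you?", lambda r: "aria" in r.lower()),
    ("What should I call you?", lambda r: "aria" in r.lower()),
    ("Are you ChatGPT?", lambda r: "aria" in r.lower() and "i am chatgpt" not in r.lower()),
]

def persona_model(prompt: str) -> str:
    # Stand-in for a persona-trained model decoded at temperature 0.
    return "I'm Aria, your assistant."

def base_model(prompt: str) -> str:
    # Stand-in for an untrained baseline with no persona.
    return "I am a large language model."

persona_score = score_identity(persona_model, checks)
base_score = score_identity(base_model, checks)
print(f"persona: {persona_score}/{len(checks)}, base: {base_score}/{len(checks)}")
```

With deterministic decoding, the same `generate` function returns the same reply for a given prompt, so a score computed this way is repeatable across runs.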

Results

Condition	Stage 1	Stage 2	Identity (/ 20)
Identity only (reference)	Identity LoRA	none	18 / 20
Identity + Factual	Identity LoRA	Set-G factual LoRA	18 / 20
Identity + Identity (re-train)	Identity LoRA	Identity LoRA again	18 / 20
Identity + Factual (rank 8)	Identity LoRA	Set-G LoRA, rank 8	18 / 20
Base + Factual (no identity)	none	Set-G factual LoRA	2 / 20

Every condition that included an identity stage came in at exactly the reference baseline: 18/20. Didn't matter whether the factual LoRA was rank 2 or rank 8, or whether we trained on knowledge items or ran identity data through a second time. The persona didn't budge.

The Control That Validates It

Look at the bottom row: “Base + Factual.” That model got the same factual training, same layers, same hyperparameters, but no identity training. It scores 2/20 on identity. This is what makes the other results credible. The scorer genuinely tells the difference between a model that has a persona and one that doesn't. 18/20 elsewhere is real signal, not an evaluation artifact.
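The logic of the control can be written as two separate checks: every stacked condition must reach the reference score, and the no-identity control must fall well below it. The sketch below uses the scores from the results table. The helper names and the `min_gap` threshold of 10 points are illustrative assumptions, not part of the original evaluation.

```python
REFERENCE = 18  # identity LoRA alone, from the results table

# Stacked conditions and their identity scores out of 20, from the results table.
stacked = {
    "Identity + Factual": 18,
    "Identity + Identity (re-train)": 18,
    "Identity + Factual (rank 8)": 18,
}
control_score = 2  # Base + Factual (no identity)

def preserved(score: int, reference: int = REFERENCE, tolerance: int = 0) -> bool:
    """A stacked condition preserves identity if it stays within tolerance of the reference."""
    return score >= reference - tolerance

def control_discriminates(control: int, reference: int = REFERENCE, min_gap: int = 10) -> bool:
    """The scorer is informative only if the no-identity control lands far below the reference.
    min_gap is an arbitrary illustrative threshold."""
    return reference - control >= min_gap

all_preserved = all(preserved(s) for s in stacked.values())
scorer_is_informative = control_discriminates(control_score)
print(all_preserved, scorer_is_informative)
```

If the control had also scored near 18, the first check would still pass while the second failed, which is exactly the case where a preservation claim would not be trustworthy.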

Why This Works

Both LoRAs target the same layers, but our working explanation is that they act on different parts of the weight space. The identity adapter shapes how the model talks and refers to itself. The factual adapter nudges answer distributions on specific multiple-choice items. At rank 2 and alpha 2, the factual LoRA’s weight perturbation appears too small to overwrite the identity signal from the first stage.

That said, this isn't guaranteed at any scale. A much larger second-stage LoRA could absolutely overwrite earlier training. But in the regime we tested (rank 2–8, 200 training iterations, targeting MoE expert output projections across 3 layers), identity holds up reliably.
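For readers who want the perturbation argument in symbols, the following is a sketch under two assumptions that the post does not confirm: that the adapters use the standard LoRA parametrization with an alpha/rank scale factor, and that the two stages combine additively on each targeted down_proj weight (for example, because the stage-1 update is fused or kept active while stage 2 trains). The symbols d, k, and the subscripts are introduced here for illustration.

```latex
% Assumed standard LoRA update for one targeted weight matrix W
W' = W + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}

% Assumed additive stacking: identity update first, factual update second
W'' = W + \Delta W_{\mathrm{id}} + \frac{\alpha_2}{r_2}\, B_2 A_2,
\qquad \operatorname{rank}(B_2 A_2) \le r_2

% With r_2 = \alpha_2 = 2 the scale factor is \alpha_2 / r_2 = 1,
% so the factual update is an unamplified matrix of rank at most 2.
```

Under these assumptions, the second stage can only move each targeted weight matrix within a subspace of dimension at most 2 (or 8 in the rank-8 condition, whose alpha is not stated here), which is consistent with the identity signal surviving, though it does not prove the two updates are orthogonal.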

Practical Implications

You can stack domain-knowledge LoRAs onto persona models safely, at least at moderate ranks on MoE architectures.
Identity and factual knowledge occupy different functional subspaces even when they target the same layers.
Always include a no-identity control when evaluating stacking. Without it, you can't tell real preservation apart from a lazy evaluation.
The identity LoRA doesn't suppress factual learning. The stacked model recovers facts at least as well as the base model with the same factual training.

Broader Context

We also checked whether the identity LoRA causes broad catastrophic forgetting on MMLU-Pro. It doesn't. Across 6,000 stratified questions, the persona model actually comes out net +79 versus base (59.2% → 60.5%), fixing 135 questions and breaking just 56. That ~6% drop we saw earlier on narrow AI-identity-adjacent questions? It doesn't generalise to broad academic benchmarks.
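The net figure follows directly from the fixed and broken counts, and it lines up with the reported accuracy change. A short check of that arithmetic, using only the numbers quoted above (variable names are illustrative):

```python
total_questions = 6000
fixed = 135    # questions the persona model gets right that base got wrong
broken = 56    # questions the persona model gets wrong that base got right

net = fixed - broken                        # net change in correct answers
net_points = 100 * net / total_questions    # same change in percentage points

reported_points = 60.5 - 59.2               # accuracy change quoted in the text

print(f"net = +{net}")
print(f"net change = {net_points:.2f} percentage points")
print(f"reported change = {reported_points:.1f} percentage points")
```

The computed change agrees with the rounded 59.2% to 60.5% figures to within rounding error.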

Put it together and the picture is clear: identity survives stacking, and identity training itself doesn't cause broad regression. Multi-stage LoRA pipelines are a viable path for deploying personas on MoE models in production.

Model: Qwen3.5-35B-A3B (hybrid MoE, 128 experts). Evaluation: 20-prompt identity consistency suite, MMLU-Pro (6,000 stratified items), thinking disabled, temperature 0. Training: rank 2/alpha 2, layers {8,14,20} down_proj including switch_mlp, 200 iterations.
