Accidentally discovered you can teach frozen MoE models new knowledge by just steering their expert routing — no training needed
A 154KB routing file teaches Gemma 4 new knowledge with no weight updates and no RAG.
A solo developer, vignesan, has made a potentially groundbreaking discovery in how to update large language models without traditional training. While probing the expert routing patterns of Google's Gemma 4 MoE model, they noticed a distinct difference in how experts are activated when the model confidently knows information versus when it does not. By recording this 'knowing' routing signature and replaying it for a topic the model was wrong about, they created a method called Adaptive Cognitive Intelligence (ACI). This technique effectively 'steers' the frozen model's internal computation path, forcing it to utilize the correct combination of its pre-existing expert sub-networks to produce accurate output.
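The record-and-replay idea can be sketched on a toy router. This is a minimal illustration only, not mnemic-mRE's actual implementation (which is not detailed in the source): a MoE layer's router picks top-k experts per token, so "steering" amounts to saving the expert indices from a prompt the model answers correctly and forcing those indices when re-running a prompt it gets wrong, with every weight left frozen.

```python
# Toy sketch of ACI-style routing steering (hypothetical; the real
# mnemic-mRE internals are not published in the source article).
import math
import random

NUM_EXPERTS = 8  # illustrative sizes, not Gemma 4's real config
TOP_K = 2

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(router_logits, forced_experts=None):
    """Return (expert_ids, mixing_weights) for one token.
    If forced_experts is given, replay that recorded routing
    signature instead of the router's own top-k choice."""
    if forced_experts is None:
        ranked = sorted(range(NUM_EXPERTS),
                        key=lambda i: router_logits[i], reverse=True)
        expert_ids = ranked[:TOP_K]
    else:
        expert_ids = list(forced_experts)
    weights = softmax([router_logits[i] for i in expert_ids])
    return expert_ids, weights

random.seed(0)

# 1) Record: run a prompt the model "knows" and save which experts fired.
knowing_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
signature, _ = route(knowing_logits)  # the 'knowing' routing signature

# 2) Replay: on a prompt the model gets wrong, override the router
#    so the same expert combination handles the token.
wrong_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
steered_ids, steered_weights = route(wrong_logits, forced_experts=signature)

print("recorded signature:", signature)
print("steered routing:   ", steered_ids)  # identical to the signature
```

In a real MoE model the same override would be applied per layer (for example via forward hooks on each router module), which is consistent with the tiny patch sizes described: only expert indices are stored, never weights.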
The discovery is packaged as an early alpha tool called 'mnemic-mRE' (Memory via Routing Emulation), available on GitHub. In a striking example, applying a mere 154KB routing file transformed Gemma 4's output on a video game fact from a long, confidently incorrect 181-word denial that 'Starfield is NOT available on PS5' into a concise, correct 31-word statement noting a 'PS5 release in April 2026 with DualSense haptics.' This was achieved with zero modification to the model's 26 billion parameters, no Low-Rank Adaptation (LoRA), and no retrieval-augmented generation (RAG) pipeline—just a surgical intervention in the routing logic.
If validated and scaled, this approach could revolutionize model updating and personalization. It suggests a future where correcting model hallucinations or adding new knowledge could be as simple as distributing tiny routing patch files, bypassing the massive computational cost of fine-tuning. This would make LLMs far more agile and customizable for enterprises and individuals alike, enabling real-time factual updates without retraining.
- Method called Adaptive Cognitive Intelligence (ACI) edits knowledge by replaying recorded expert routing patterns, not weights.
- Demonstrated on Gemma 4 26B: a 154KB file fixed a factual error, reducing output from 181 wrong words to 31 correct ones.
- Requires no fine-tuning, LoRA, or RAG—just steering the frozen model's internal routing, a radically efficient update mechanism.
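A back-of-envelope check makes the 154KB figure plausible. All architecture numbers below are assumptions for illustration (the source does not state Gemma 4's MoE layer count, expert count, or routing top-k): if each token needs one byte per chosen expert index at each MoE layer, a patch that size covers routing for on the order of a thousand tokens.

```python
# Rough size estimate for a routing patch file.
# LAYERS, TOP_K, and BYTES_PER_ID are hypothetical placeholders,
# not Gemma 4's published architecture.
LAYERS = 48        # assumed number of MoE layers
TOP_K = 2          # assumed experts activated per token
BYTES_PER_ID = 1   # one byte per expert index (up to 256 experts)

def routing_bytes(num_tokens):
    """Bytes to store top-k expert indices per token across all layers."""
    return num_tokens * LAYERS * TOP_K * BYTES_PER_ID

file_bytes = 154 * 1024  # the reported 154KB patch
tokens_covered = file_bytes // routing_bytes(1)
print(f"~{tokens_covered} tokens of routing fit in 154KB")  # ~1642 under these assumptions
```

Under these assumed numbers the patch holds routing choices for roughly 1,600 tokens, which fits the picture of a small, topic-scoped correction rather than any change to the 26B parameters themselves.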
Why It Matters
Could enable instant, ultra-efficient model updates and factual corrections, drastically reducing the cost and latency of keeping LLMs current.