Accidentally discovered you can teach frozen MoE models new knowledge by just steering their expert routing — no training needed
A 154KB routing file teaches Gemma 4 new knowledge with no weight updates and no RAG.
A solo developer, vignesan, has made a potentially groundbreaking discovery in how to update large language models without traditional training. While probing the expert routing patterns of Google's Gemma 4 MoE model, they noticed a distinct difference in how experts are activated when the model confidently knows information versus when it does not. By recording this 'knowing' routing signature and replaying it for a topic the model was wrong about, they created a method called Adaptive Cognitive Intelligence (ACI). This technique effectively 'steers' the frozen model's internal computation path, forcing it to utilize the correct combination of its pre-existing expert sub-networks to produce accurate output.
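The record-and-replay idea can be sketched on a toy router. This is a minimal illustration only, not mnemic-mRE's actual implementation (which is not detailed in the source): a MoE layer's router picks top-k experts per token, so "steering" amounts to saving the expert indices from a prompt the model answers correctly and forcing those indices when re-running a prompt it gets wrong, with every weight left frozen.

```python
# Toy sketch of ACI-style routing steering (hypothetical; the real
# mnemic-mRE internals are not published in the source article).
import math
import random

NUM_EXPERTS = 8  # illustrative sizes, not Gemma 4's real config
TOP_K = 2

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(router_logits, forced_experts=None):
    """Return (expert_ids, mixing_weights) for one token.
    If forced_experts is given, replay that recorded routing
    signature instead of the router's own top-k choice."""
    if forced_experts is None:
        ranked = sorted(range(NUM_EXPERTS),
                        key=lambda i: router_logits[i], reverse=True)
        expert_ids = ranked[:TOP_K]
    else:
        expert_ids = list(forced_experts)
    weights = softmax([router_logits[i] for i in expert_ids])
    return expert_ids, weights

random.seed(0)

# 1) Record: run a prompt the model "knows" and save which experts fired.
knowing_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
signature, _ = route(knowing_logits)  # the 'knowing' routing signature

# 2) Replay: on a prompt the model gets wrong, override the router
#    so the same expert combination handles the token.
wrong_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
steered_ids, steered_weights = route(wrong_logits, forced_experts=signature)

print("recorded signature:", signature)
print("steered routing:   ", steered_ids)  # identical to the signature
```

In a real MoE model the same override would be applied per layer (for example via forward hooks on each router module), which is consistent with the tiny patch sizes described: only expert indices are stored, never weights.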
The discovery is packaged as an early alpha tool called 'mnemic-mRE' (Memory via Routing Emulation), available on GitHub. In a striking example, applying a mere 154KB routing file transformed Gemma 4's output on a video game fact from a long, confidently incorrect 181-word denial that 'Starfield is NOT available on PS5' into a concise, correct 31-word statement noting a 'PS5 release in April 2026 with DualSense haptics.' This was achieved with zero modification to the model's 26 billion parameters, no Low-Rank Adaptation (LoRA), and no retrieval-augmented generation (RAG) pipeline—just a surgical intervention in the routing logic.
If validated and scaled, this approach could revolutionize model updating and personalization. It suggests a future where correcting model hallucinations or adding new knowledge could be as simple as distributing tiny routing patch files, bypassing the massive computational cost of fine-tuning. This would make LLMs far more agile and customizable for enterprises and individuals alike, enabling real-time factual updates without retraining.
- Method called Adaptive Cognitive Intelligence (ACI) edits knowledge by replaying recorded expert routing patterns, not weights.
- Demonstrated on Gemma 4 26B: a 154KB file fixed a factual error, reducing output from 181 wrong words to 31 correct ones.
- Requires no fine-tuning, LoRA, or RAG—just steering the frozen model's internal routing, a radically efficient update mechanism.
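A back-of-envelope check makes the 154KB figure plausible. All architecture numbers below are assumptions for illustration (the source does not state Gemma 4's MoE layer count, expert count, or routing top-k): if each token needs one byte per chosen expert index at each MoE layer, a patch that size covers routing for on the order of a thousand tokens.

```python
# Rough size estimate for a routing patch file.
# LAYERS, TOP_K, and BYTES_PER_ID are hypothetical placeholders,
# not Gemma 4's published architecture.
LAYERS = 48        # assumed number of MoE layers
TOP_K = 2          # assumed experts activated per token
BYTES_PER_ID = 1   # one byte per expert index (up to 256 experts)

def routing_bytes(num_tokens):
    """Bytes to store top-k expert indices per token across all layers."""
    return num_tokens * LAYERS * TOP_K * BYTES_PER_ID

file_bytes = 154 * 1024  # the reported 154KB patch
tokens_covered = file_bytes // routing_bytes(1)
print(f"~{tokens_covered} tokens of routing fit in 154KB")  # ~1642 under these assumptions
```

Under these assumed numbers the patch holds routing choices for roughly 1,600 tokens, which fits the picture of a small, topic-scoped correction rather than any change to the 26B parameters themselves.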
Why It Matters
Could enable instant, ultra-efficient model updates and factual corrections, drastically reducing the cost and latency of keeping LLMs current.