Awakening Dormant Experts: Counterfactual Routing to Mitigate MoE Hallucinations
A training-free inference framework boosts factual accuracy by 3.1% without increasing compute costs.
A research team from multiple institutions has introduced Counterfactual Routing (CoR), a novel, training-free inference framework designed to tackle a core weakness in Sparse Mixture-of-Experts (MoE) models: their tendency to hallucinate on long-tail or rare factual knowledge. The paper identifies that the standard Top-k routing mechanism, which selects a fixed number of 'expert' sub-networks for each input, creates a bias toward high-frequency patterns. This leaves specialized experts with critical, niche knowledge under-prioritized or 'dormant,' directly contributing to factual errors.
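For context, a standard Top-k router scores every expert for each token and keeps only the k highest-scoring ones, so frequently useful experts dominate. Below is a minimal sketch of Mixtral-style gating, where the softmax is renormalized over only the selected experts; the function name and values are illustrative, not the paper's code:

```python
import numpy as np

def top_k_route(router_logits: np.ndarray, k: int = 2):
    """Standard Top-k gating: keep the k highest-scoring experts per token.

    router_logits: (num_tokens, num_experts) scores from the router.
    Returns chosen expert indices and their renormalized gate weights.
    """
    # Indices of the k largest logits for each token.
    top_idx = np.argsort(router_logits, axis=-1)[:, -k:]
    top_logits = np.take_along_axis(router_logits, top_idx, axis=-1)
    # Softmax over only the selected experts (Mixtral-style renormalization).
    gates = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)
    return top_idx, gates

# One token, eight experts: the two highest-scoring experts always win,
# regardless of whether a rarer expert holds the needed fact.
logits = np.array([[2.1, 0.3, 1.8, -0.5, 0.0, 0.9, -1.2, 0.4]])
idx, w = top_k_route(logits, k=2)
print(idx, w)  # experts 2 and 0 are selected; weights sum to 1
```

Because the selection is fixed at k and driven purely by these router scores, an expert that rarely wins the Top-k comparison never fires, which is exactly the "dormant expert" failure mode the paper targets.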
CoR addresses this with a layer-wise perturbation analysis and a new Counterfactual Expert Impact (CEI) metric. During inference, the system virtually 'ablates' (removes) experts to measure their causal importance, then dynamically reallocates computational resources from syntax-focused layers to knowledge-intensive ones. Crucially, it keeps the total number of activated experts, and hence the compute budget, unchanged, making it a zero-cost upgrade. On TruthfulQA, FACTOR, and TriviaQA, CoR improves average factual accuracy by 3.1%, establishing a better performance-efficiency trade-off than simply scaling the model size.
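The paper's exact CEI formula is not reproduced here, but the idea can be sketched: zero out one active expert's gate, renormalize the remaining gates, and measure how far the layer's output moves. Experts whose removal barely shifts the output are candidates to swap out for dormant ones. The sketch below is a hypothetical instantiation under those assumptions; `mixture`, `counterfactual_impact`, and all values are illustrative:

```python
import numpy as np

def mixture(expert_outputs: np.ndarray, gates: np.ndarray) -> np.ndarray:
    """Gate-weighted sum of expert outputs: (E, d) weighted by (E,) -> (d,)."""
    return gates @ expert_outputs

def counterfactual_impact(expert_outputs: np.ndarray, gates: np.ndarray) -> np.ndarray:
    """Hypothetical CEI-style score: the output shift caused by virtually
    ablating each active expert (gate zeroed, remaining gates renormalized).
    A small shift means the expert is causally unimportant for this token.
    Assumes at least two experts are active (k >= 2)."""
    base = mixture(expert_outputs, gates)
    impact = np.zeros_like(gates)
    for e in np.flatnonzero(gates):
        ablated = gates.copy()
        ablated[e] = 0.0
        ablated /= ablated.sum()  # renormalize so total gate mass is unchanged
        impact[e] = np.linalg.norm(base - mixture(expert_outputs, ablated))
    return impact

# Toy layer: 4 experts, hidden size 8, Top-3 routing already applied.
rng = np.random.default_rng(0)
outs = rng.normal(size=(4, 8))
gates = np.array([0.5, 0.3, 0.2, 0.0])
print(counterfactual_impact(outs, gates))  # per-expert causal impact scores
```

Under the paper's scheme, the per-token budget freed at layers where active experts score low impact (typically syntax-heavy layers) is spent activating additional experts at knowledge-intensive layers, so the total count of activated experts per token, and thus the inference cost, stays constant.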
- Identifies 'dormant expert' problem in MoE models where static Top-k routing misses rare factual knowledge.
- Proposes Counterfactual Routing (CoR), a training-free inference method using virtual ablation to measure expert impact.
- Boosts factual accuracy by 3.1% on key benchmarks without increasing the computational cost of inference.
Why It Matters
Enables more reliable, factual output from sparse MoE models such as Mixtral (and, reportedly, GPT-4) without expensive retraining or additional inference compute.