Exploratory Sampling boosts LLM diversity with <5% overhead
New decoding method ESamp uses prediction errors to explore novel semantic patterns.
Researchers from Shanghai Jiao Tong University and ByteDance have introduced Exploratory Sampling (ESamp), a decoding method that explicitly encourages semantic diversity in large language model (LLM) generation. Unlike standard stochastic sampling, which mainly produces surface-level lexical variation, ESamp leverages a basic property of neural networks: their prediction error tends to be low on familiar inputs and high on novel ones. The method trains a lightweight Distiller model at test time to predict deep-layer hidden representations from shallow-layer ones, modeling the LLM's depth-wise representation transitions. During decoding, the Distiller continuously adapts to the current generation context, and its prediction error serves as a novelty signal for reweighting candidate token extensions toward less-explored semantic patterns. ESamp is implemented with an asynchronous training-inference pipeline that keeps worst-case overhead below 5% (1.2% in the optimized release).
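To make the mechanism concrete, here is a minimal toy sketch of prediction error used as a novelty signal. It is not the released ESamp code: the linear `LinearDistiller`, the additive bonus weighted by `beta`, the learning rate, and the two-dimensional hidden states are all assumptions chosen for illustration, and the asynchronous pipeline is omitted entirely.

```python
import math


class LinearDistiller:
    """Hypothetical stand-in for the Distiller: a linear map from a
    shallow-layer hidden state to a deep-layer hidden state."""

    def __init__(self, dim_in, dim_out, lr=0.1):
        self.w = [[0.0] * dim_in for _ in range(dim_out)]
        self.lr = lr  # assumed learning rate, not taken from the source

    def predict(self, shallow):
        return [sum(w_ij * x for w_ij, x in zip(row, shallow)) for row in self.w]

    def error(self, shallow, deep):
        # Mean squared prediction error, used here as the novelty signal.
        pred = self.predict(shallow)
        return sum((p - d) ** 2 for p, d in zip(pred, deep)) / len(deep)

    def update(self, shallow, deep):
        # One SGD step on the squared error (constant factors folded into lr).
        pred = self.predict(shallow)
        for i, row in enumerate(self.w):
            residual = pred[i] - deep[i]
            for j, x in enumerate(shallow):
                row[j] -= self.lr * residual * x


def reweight(logits, novelty, beta=1.0):
    """Add a novelty bonus to each candidate's logit, then apply softmax.
    The additive form and beta are assumptions for this sketch."""
    scores = [l + beta * n for l, n in zip(logits, novelty)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy (shallow, deep) hidden-state pairs; the values are made up.
familiar = ([1.0, 0.0], [0.5, -0.5])
novel = ([0.0, 1.0], [1.0, 1.0])

distiller = LinearDistiller(dim_in=2, dim_out=2)
for _ in range(50):
    distiller.update(*familiar)  # adapt to the already-explored pattern

novelty = [distiller.error(*familiar), distiller.error(*novel)]
probs = reweight([0.0, 0.0], novelty)  # equal base logits for both candidates
```

In this toy run the Distiller has only adapted to the first pattern, so the second candidate has the larger prediction error and receives the larger share of probability mass after reweighting.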
Empirical results show that ESamp significantly boosts Pass@k efficiency across reasoning, mathematics, science, and code generation benchmarks, matching or outperforming strong stochastic and heuristic baselines. Notably, it breaks the usual trade-off between diversity and coherence in creative writing tasks, letting models generate more varied outputs without sacrificing quality. This generalization across domains suggests ESamp could become a standard tool for test-time scaling, particularly for applications that require diverse solution exploration, such as code generation and scientific discovery. The code has been released on GitHub.
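For readers unfamiliar with the metric, Pass@k is the probability that at least one of k sampled solutions is correct. The sketch below uses the widely adopted unbiased estimator, 1 - C(n-c, k) / C(n, k), computed from n samples of which c pass. Whether the ESamp authors use exactly this estimator is an assumption here; the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased Pass@k estimate from n samples, c of which are correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) counts the k-subsets that contain no correct sample;
    # it is 0 when fewer than k incorrect samples exist.
    return 1.0 - comb(n - c, k) / comb(n, k)


# With 10 samples and a single correct one, drawing 5 finds it half the time.
estimate = pass_at_k(n=10, c=1, k=5)  # 0.5
```

Under this metric, more diverse samples help because k near-duplicate attempts tend to fail or succeed together, while k distinct attempts give more independent chances to hit a correct solution.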
- ESamp trains a lightweight Distiller at test time to predict deep-layer hidden representations from shallow-layer ones, with under 5% worst-case overhead
- Boosts Pass@k efficiency on reasoning, math, science, and code benchmarks
- Breaks the diversity-coherence trade-off in creative writing tasks
Why It Matters
ESamp lets LLMs explore diverse solutions at test time, improving reasoning and creativity with little added compute.