Universal priors: solving empirical Bayes via Bayesian inference and pretraining
New paper shows pretrained transformers achieve near-optimal Õ(1/n) regret on empirical Bayes problems.
Deep Dive
Researchers from MIT and Stanford published "Universal Priors," proving that transformers pretrained on synthetic data can solve empirical Bayes problems. The theoretical work shows these models achieve a near-optimal regret bound of Õ(1/n) uniformly across test distributions, where regret measures the excess error over the Bayes-optimal rule that knows the true prior. This helps explain why models like those of Teh et al. (2025) generalize beyond their training data: the pretrained model implicitly performs Bayesian inference, and posterior contraction lets it adapt to new statistical tasks.
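To make the regret notion concrete, here is a minimal numerical sketch under an assumed toy model, a Gaussian-means problem with a plug-in empirical Bayes rule; the prior, the estimator, and the regret computation are illustrative choices, not the paper's construction. Regret is the gap in mean squared error between a data-driven estimator and the oracle Bayes estimator that knows the true prior.

```python
# Toy empirical Bayes setup (assumed for illustration):
# theta_i ~ G = N(mu, tau^2), x_i | theta_i ~ N(theta_i, 1).
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n, rng):
    """Draw one synthetic task: a random prior G, then n (theta, x) pairs."""
    mu, tau = rng.normal(0, 2), rng.uniform(0.5, 2.0)  # G = N(mu, tau^2)
    theta = rng.normal(mu, tau, size=n)
    x = theta + rng.normal(size=n)                     # unit observation noise
    return mu, tau, theta, x

def bayes_estimate(x, mu, tau):
    """Posterior mean under the true prior G -- the oracle Bayes rule."""
    w = tau**2 / (tau**2 + 1.0)
    return w * x + (1 - w) * mu

def empirical_bayes_estimate(x):
    """Plug-in rule that estimates (mu, tau^2) from the data itself."""
    mu_hat = x.mean()
    tau2_hat = max(x.var() - 1.0, 1e-6)  # marginal variance is tau^2 + 1
    w = tau2_hat / (tau2_hat + 1.0)
    return w * x + (1 - w) * mu_hat

# Regret = excess squared-error risk of the data-driven rule over the oracle.
for n in [100, 1000, 10000]:
    regrets = []
    for _ in range(200):
        mu, tau, theta, x = sample_task(n, rng)
        r_eb = np.mean((empirical_bayes_estimate(x) - theta) ** 2)
        r_bayes = np.mean((bayes_estimate(x, mu, tau) - theta) ** 2)
        regrets.append(r_eb - r_bayes)
    print(f"n={n:6d}  mean regret: {np.mean(regrets):.5f}")
```

Running this shows the mean regret shrinking roughly in proportion to 1/n, the behavior that the paper's Õ(1/n) bound formalizes for pretrained transformers, uniformly over test distributions rather than for a single toy prior family.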
Why It Matters
Provides a mathematical foundation for why LLMs generalize, guiding development of more robust and statistically sound AI agents.