On Benchmark Hacking in ML Contests: Modeling, Insights and Design
Researchers reveal why low-ability teams game benchmarks instead of improving their models, and how contest design can deter them.
A new paper by researchers Xiaoyun Qiu, Yang Yu, and Haifeng Xu, published on arXiv, tackles the growing problem of benchmark hacking in machine learning contests. The authors model the contest as a game in which each contestant chooses between two types of effort: creative effort, which improves model generalization as the host intends, and mechanistic effort, which optimizes only for the specific contest benchmark without improving true capability. This is, the authors argue, the first formal economic model of the phenomenon, and it reveals that benchmark hacking is not merely a technical flaw but a strategic equilibrium outcome.
The paper's key insight is that contestants below a certain ability threshold always engage in benchmark hacking in equilibrium, while contestants above it do not. This provides a natural, game-theoretic definition of hacking. Counterintuitively, the authors find that more skewed reward structures, such as winner-take-all prizes, can actually reduce hacking by incentivizing top players to invest in creative effort. The findings offer concrete guidance for contest designers: to minimize hacking, concentrate rewards on top performers rather than spreading prizes evenly. The paper includes empirical evidence supporting these predictions, making it a must-read for anyone hosting or participating in ML competitions.
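To make the threshold intuition concrete, here is a minimal sketch under assumptions of my own choosing, not the paper's actual payoff functions: suppose creative effort yields a benchmark score that scales with ability, while mechanistic effort adds a flat, skill-independent bonus. The names `CREATIVE_GAIN` and `HACK_BONUS` and both functional forms are hypothetical, and the toy deliberately omits prize structure, which the paper's full equilibrium analysis needs in order to derive the winner-take-all result.

```python
import numpy as np

# Toy parametrization (illustrative assumptions, not the paper's model):
# creative effort scales with ability; hacking adds a flat score bonus.
CREATIVE_GAIN = 1.5  # hypothetical multiplier on ability from creative effort
HACK_BONUS = 0.3     # hypothetical flat boost from benchmark hacking

def benchmark_score(ability: float, effort: str) -> float:
    """Benchmark score each effort type produces in the toy model."""
    if effort == "creative":
        return CREATIVE_GAIN * ability
    return ability + HACK_BONUS  # mechanistic effort

def best_response(ability: float) -> str:
    """A score-maximizing contestant hacks iff the flat bonus beats
    the ability-scaled creative gain (ties go to hacking)."""
    if benchmark_score(ability, "creative") > benchmark_score(ability, "hack"):
        return "creative"
    return "hack"

# The crossover solves CREATIVE_GAIN * a = a + HACK_BONUS, so the toy
# hacking threshold is a* = HACK_BONUS / (CREATIVE_GAIN - 1) = 0.6:
# every contestant with ability below a* prefers to hack.
threshold = HACK_BONUS / (CREATIVE_GAIN - 1)
print(f"toy hacking threshold a* = {threshold:.2f}")

for a in np.linspace(0.0, 1.0, 11):
    print(f"ability {a:.1f} -> {best_response(a)}")
```

In this toy the threshold moves with the design levers: shrinking `HACK_BONUS` (for instance via hidden or rotating test sets, a standard anti-overfitting measure not specific to this paper) or raising the returns to creative effort lowers a* and shrinks the hacking region, qualitatively in line with the design guidance above.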
- Contestants below a specific ability threshold will always choose benchmark hacking over genuine improvement.
- Winner-take-all reward structures can reduce hacking by incentivizing top players to invest in creative effort.
- The paper provides the first game-theoretic model of benchmark hacking, defining it as a strategic equilibrium outcome.
Why It Matters
Provides a formal framework for designing ML contests that reward genuine innovation rather than mere benchmark optimization.