Research & Papers

Cohen et al. improve distribution estimation bounds under ℓ∞ norm

New paper resolves open questions on minimax and tail bounds for discrete distributions.

Deep Dive

A team of researchers (Doron Cohen, Aryeh Kontorovich, and Yonatan Livshitz) has released a paper on arXiv that advances the state of the art in estimating discrete distributions under the ℓ∞ norm. In this setting, the ℓ∞ error is the largest absolute difference between the estimated and true probabilities over all categories, which makes it a natural metric when accuracy must hold uniformly across categories, as with language model token distributions or categorical data modeling. The paper presents new minimax bounds in expectation and high-probability tail bounds, which are significantly tighter than previously known results.
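To make the metric concrete, here is a minimal Python sketch that draws samples from a known discrete distribution, forms the plain empirical (frequency) estimate, and measures its ℓ∞ error. The function names, the three-category distribution, and the sample size are illustrative assumptions, not taken from the paper.

```python
import random
from collections import Counter


def empirical_distribution(samples, support):
    """Relative frequency of each category in `support` (hypothetical helper)."""
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    n = len(samples)
    return [counts[x] / n for x in support]


def linf_error(p, q):
    """l-infinity distance: largest absolute per-category difference."""
    return max(abs(a - b) for a, b in zip(p, q))


# Illustrative setup (assumed values, not from the paper).
support = ["a", "b", "c"]
true_p = [0.5, 0.3, 0.2]

random.seed(0)  # fixed seed so the demo is repeatable
samples = random.choices(support, weights=true_p, k=1000)

p_hat = empirical_distribution(samples, support)
print("estimate:", p_hat)
print("l-inf error:", linf_error(p_hat, true_p))
```

In practice the true distribution is unknown, so this error cannot be computed directly. That is the gap that bounds of the kind studied in the paper are meant to fill.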

The work directly addresses open questions raised by Kontorovich and Painsky in a 2025 JMLR paper. In particular, the authors derive a fully empirical version of the tightest risk bound, which means practitioners can now compute data-dependent confidence intervals without knowledge of the underlying distribution. They also characterize the exact form of the worst-case extremal distribution, providing deeper theoretical insight. Encouraging empirical results on synthetic data suggest the new bounds are practically useful. For machine learning engineers and statisticians working on discrete probability estimation, this paper offers both theoretical guarantees and actionable tools.
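For a sense of what a high-probability ℓ∞ guarantee looks like, the sketch below computes a classical baseline radius from Hoeffding's inequality combined with a union bound over the d categories: with probability at least 1 − δ, the ℓ∞ error of the empirical estimate is at most sqrt(ln(2d/δ) / (2n)). This is a textbook bound used here only for illustration. It is not the bound from the paper, and unlike the fully empirical bound described above it does not adapt to the observed data. The function name and the example values are assumptions.

```python
import math


def hoeffding_linf_radius(n, d, delta):
    """Classical baseline radius (NOT the paper's bound).

    Hoeffding gives P(|p_hat_i - p_i| >= eps) <= 2 * exp(-2 * n * eps**2)
    for each category i; a union bound over d categories and solving
    2 * d * exp(-2 * n * eps**2) = delta for eps yields the value returned.
    """
    if n <= 0 or d <= 0 or not (0.0 < delta < 1.0):
        raise ValueError("require n > 0, d > 0 and 0 < delta < 1")
    return math.sqrt(math.log(2 * d / delta) / (2 * n))


# Illustrative values (assumed): 1000 samples, 10 categories, 95% confidence.
radius = hoeffding_linf_radius(n=1000, d=10, delta=0.05)
print(f"with probability >= 0.95, l-inf error <= {radius:.4f}")
```

Note that this baseline grows with the number of categories d and ignores what the sample actually looks like, which is why tighter and data-dependent alternatives are of interest.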

Key Points
  • Resolves open problems from Kontorovich & Painsky (JMLR 2025) regarding minimax bounds under ℓ∞.
  • Provides a fully empirical version of the tightest risk bound, enabling data-driven confidence intervals.
  • Identifies the form of the worst-case extremal distribution, closing a key theoretical gap.

Why It Matters

Tighter ℓ∞ bounds give sharper uncertainty estimates for models of categorical data, which matters wherever per-category error guarantees are needed, including safety-sensitive AI applications.