New Framework Tightens Conformal Prediction with Beta Laws and Wasserstein Distances
An exact finite-sample beta law for calibration-conditional coverage provides diagnostics beyond marginal averages.
Traditional split conformal prediction guarantees marginal coverage, that is, coverage averaged over random calibration samples, but practitioners often need to understand the coverage they actually get once a threshold has been realized. This paper by Ramos, Graziadei, and Cabezas (arXiv:2605.19024) derives the exact finite-sample distribution of calibration-conditional coverage under continuous i.i.d. data: it follows a Beta(k, n+1-k) distribution, where n is the number of calibration points and k is the rank of the calibration score used as the threshold (in standard split conformal with target coverage 1-α, typically k = ⌈(1-α)(n+1)⌉). The authors treat this beta law as a reference object and use Wasserstein distances to measure how different data-generating processes deform it.
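The beta law can be checked numerically. The following is a minimal sketch rather than the authors' code, and it rests on stated assumptions: scores are drawn i.i.d. from Uniform(0, 1), so the conditional coverage of a threshold equals the threshold itself, and the rank is taken as k = ⌈(1-α)(n+1)⌉. The function names are hypothetical and chosen for illustration.

```python
import math
import random
import statistics


def conformal_rank(n, alpha):
    """Rank of the calibration score used as threshold (assumed standard choice)."""
    return math.ceil((1 - alpha) * (n + 1))


def simulate_conditional_coverage(n, alpha, reps, seed=0):
    """Draw `reps` calibration sets and return each one's conditional coverage.

    Assumption: scores are i.i.d. Uniform(0, 1), so the coverage given a
    threshold q is P(score <= q) = q.
    """
    rng = random.Random(seed)
    k = conformal_rank(n, alpha)
    coverages = []
    for _ in range(reps):
        scores = sorted(rng.random() for _ in range(n))
        threshold = scores[k - 1]       # k-th smallest calibration score
        coverages.append(threshold)     # equals its own coverage for uniform scores
    return coverages


def beta_mean_var(k, n):
    """Mean and variance of Beta(k, n + 1 - k)."""
    a, b = k, n + 1 - k
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


n, alpha = 100, 0.1
k = conformal_rank(n, alpha)
cov = simulate_conditional_coverage(n, alpha, reps=20000)
mean_theory, var_theory = beta_mean_var(k, n)

print("simulated mean/variance:", statistics.fmean(cov), statistics.variance(cov))
print("Beta(k, n+1-k) mean/variance:", mean_theory, var_theory)
```

The simulated moments should sit close to the beta moments. The spread of the simulated values is the point: individual calibration sets can land noticeably above or below the nominal level even when the marginal guarantee holds.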
The framework provides direct bounds on marginal coverage gaps and bad-calibration probabilities, isolating two sources of non-i.i.d. behavior: test-side shift acts through a transport map on the coverage scale, while calibration dependence changes the order-statistic law itself. The approach is instantiated for scale-shift, clustered, and stationary mixing settings, with explicit characterizations or Berry-Esseen approximations. Simulations on dependent processes show the first-order approximation tracks empirical Wasserstein distances even at moderate sample sizes, offering a practical tool for uncertainty quantification in machine learning.
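To make the transport-map idea concrete, here is a small illustrative sketch, not the paper's construction. It assumes Exp(1) calibration scores and a test-side scale shift that multiplies test scores by a hypothetical factor sigma. Under those assumptions an i.i.d. coverage level c is mapped to 1 - (1 - c)^(1/sigma). The sketch pushes samples from the beta reference through that map, computes the 1-Wasserstein distance between the two empirical samples, and compares it with the marginal coverage gap, using the general fact that a difference in means is at most the 1-Wasserstein distance.

```python
import math
import random


def wasserstein1(xs, ys):
    """W1 between two equal-size empirical samples on the real line:
    the mean absolute difference of their sorted values."""
    if len(xs) != len(ys):
        raise ValueError("samples must have equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def shifted_coverage(c, sigma):
    """Coverage after a test-side scale shift.

    Assumption (illustrative only): calibration scores are Exp(1) and test
    scores are sigma times an Exp(1) draw, so a threshold with i.i.d.
    coverage c has shifted coverage 1 - (1 - c) ** (1 / sigma).
    """
    return 1.0 - (1.0 - c) ** (1.0 / sigma)


n, alpha, sigma = 100, 0.1, 1.25
k = math.ceil((1 - alpha) * (n + 1))   # assumed standard split conformal rank

rng = random.Random(0)
# Reference: calibration-conditional coverage under i.i.d. data.
reference = [rng.betavariate(k, n + 1 - k) for _ in range(20000)]
# Deformed law: push the reference through the transport map.
shifted = [shifted_coverage(c, sigma) for c in reference]

w1 = wasserstein1(reference, shifted)
gap = abs(sum(shifted) / len(shifted) - sum(reference) / len(reference))

print("W1 distance to the beta reference:", w1)
print("marginal coverage gap:", gap)
```

In this example the map lowers every coverage value when sigma exceeds 1, so the gap and the distance coincide up to rounding. For deformations that move mass in both directions, the distance can be strictly larger than the gap.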
- Calibration-conditional coverage in i.i.d. settings follows an exact Beta(k, n+1-k) distribution, providing a finite-sample reference.
- Wasserstein distances on [0,1] measure departures from this beta law, yielding bounds on marginal coverage gaps and bad-calibration probabilities.
- Framework separates test-side shift (transport map) from calibration dependence, with explicit characterizations for scale-shift, clustered, and stationary mixing processes.
Why It Matters
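The framework gives practitioners a finite-sample reference for how far realized coverage can stray from its nominal level, and a way to quantify how distribution shift or dependent calibration data widens that gap, supporting more reliable uncertainty quantification in real-world deployment.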