Research & Papers

CONTRA: New method sharpens multi-dimensional prediction regions with guaranteed coverage

Normalizing flows meet conformal prediction for tighter, non-rectangular output regions.

Deep Dive

Conformal prediction provides statistically guaranteed coverage, but its reliance on one-dimensional nonconformity scores limits its usefulness for multi-dimensional outputs, where it often forces rigid prediction regions such as hyperrectangles or ellipsoids. CONTRA (CONformal prediction region via normalizing flow TRAnsformation) addresses this by working in the latent space of a normalizing flow. It defines the nonconformity score as the distance from the center of the flow's latent distribution, calibrates a threshold on that score, and maps the resulting high-density latent region back to the original output space. The result is a sharp, flexible prediction region that follows the shape of the learned output distribution and keeps guaranteed coverage, without the rectangular or elliptical constraints of prior approaches.
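The calibration step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: a fixed invertible "banana" map stands in for a trained conditional normalizing flow, the distance is assumed to be Euclidean, and the names `to_latent`, `from_latent`, `sample`, and `conformal_radius` are hypothetical. The threshold uses the standard split-conformal quantile.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: a fixed invertible "banana" map stands in for a trained
# conditional normalizing flow. x is a scalar input, y a 2-D output.
def to_latent(y, x):
    """Map outputs y (..., 2) to latent z (..., 2), given input x."""
    z1 = y[..., 0] - x
    z2 = y[..., 1] - z1 ** 2
    return np.stack([z1, z2], axis=-1)

def from_latent(z, x):
    """Inverse of to_latent: map latent z (..., 2) back to outputs y."""
    y1 = z[..., 0] + x
    y2 = z[..., 1] + z[..., 0] ** 2
    return np.stack([y1, y2], axis=-1)

def sample(n):
    """Toy data whose conditional distribution matches the stand-in flow."""
    x = rng.uniform(-1.0, 1.0, size=n)
    z = rng.standard_normal((n, 2))
    return x, from_latent(z, x)

def conformal_radius(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else np.sort(scores)[k - 1]

alpha = 0.1  # target miscoverage, i.e. 90% coverage
x_cal, y_cal = sample(1000)
x_test, y_test = sample(5000)

# Nonconformity score: Euclidean distance from the latent center (the origin).
radius = conformal_radius(np.linalg.norm(to_latent(y_cal, x_cal), axis=1), alpha)

# A test point is covered when its latent distance is within the radius.
coverage = np.mean(np.linalg.norm(to_latent(y_test, x_test), axis=1) <= radius)

# Region boundary for a new input: push the latent circle through the inverse map.
theta = np.linspace(0.0, 2.0 * np.pi, 200)
circle = radius * np.stack([np.cos(theta), np.sin(theta)], axis=-1)
boundary = from_latent(circle, 0.3)

print(f"radius={radius:.3f}, empirical coverage={coverage:.3f}")
```

Because the latent ball is pushed through a nonlinear inverse map, `boundary` traces a curved region in output space, not an ellipse. The coverage guarantee comes from the calibration quantile alone, so it would still hold with a poorly fitted flow; the quality of the flow affects only how compact the region is.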

Beyond pure flow-based models, CONTRA can wrap around any existing predictor by training a lightweight normalizing flow on the model's residuals. This extension lets practitioners add multi-dimensional prediction regions with guaranteed coverage to their pre-trained models at minimal overhead. Experiments on multiple datasets confirm that both CONTRA and its residual-based variant maintain their coverage guarantees while significantly reducing region volume compared to baselines. For professionals deploying machine learning in high-stakes domains (healthcare, finance, autonomous systems), CONTRA offers a principled way to generate trustworthy, compact uncertainty estimates for complex outputs.
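A rough sketch of the residual-based idea follows, under loud assumptions. The "pre-trained predictor" is a hypothetical least-squares model, and the "flow" is reduced to an affine whitening map (mean plus Cholesky factor), which is about the simplest invertible transformation available. An affine map can only yield ellipsoidal regions, so a real learned flow would be needed to get the flexible shapes the method is designed for; the point here is only the wiring of predictor, residuals, and calibration.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_data(n):
    """Toy regression data: scalar input x, 2-D output y with correlated noise."""
    x = rng.uniform(-2.0, 2.0, size=(n, 1))
    noise = rng.standard_normal((n, 2)) @ np.array([[1.0, 0.0], [0.8, 0.5]])
    return x, np.hstack([np.sin(x), x ** 2]) + noise

x_train, y_train = make_data(2000)
x_cal, y_cal = make_data(1000)
x_test, y_test = make_data(5000)

# ASSUMPTION: a (deliberately misspecified) linear model plays the role of
# the existing pre-trained predictor.
design = np.hstack([x_train, np.ones_like(x_train)])
W, *_ = np.linalg.lstsq(design, y_train, rcond=None)

def predict(x):
    return np.hstack([x, np.ones_like(x)]) @ W

# ASSUMPTION: an affine whitening map stands in for the flow trained on residuals.
res_train = y_train - predict(x_train)
mu = res_train.mean(axis=0)
L = np.linalg.cholesky(np.cov(res_train, rowvar=False))

def to_latent(r):
    """Residuals r (n, 2) -> latent z (n, 2), solving L z = r - mu row by row."""
    return np.linalg.solve(L, (r - mu).T).T

def from_latent(z):
    """Inverse map: latent z (n, 2) -> residuals r (n, 2)."""
    return z @ L.T + mu

def conformal_radius(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else np.sort(scores)[k - 1]

alpha = 0.1
cal_scores = np.linalg.norm(to_latent(y_cal - predict(x_cal)), axis=1)
radius = conformal_radius(cal_scores, alpha)

test_scores = np.linalg.norm(to_latent(y_test - predict(x_test)), axis=1)
coverage = np.mean(test_scores <= radius)

# Region for a new input: the prediction plus the mapped latent circle.
theta = np.linspace(0.0, 2.0 * np.pi, 200)
circle = radius * np.stack([np.cos(theta), np.sin(theta)], axis=-1)
x_new = np.array([[0.5]])
boundary = predict(x_new) + from_latent(circle)

print(f"radius={radius:.3f}, empirical coverage={coverage:.3f}")
```

The predictor is never retrained: only its residuals on held-out data are used, which is why this kind of wrapper can sit on top of an existing model.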

Key Points
  • CONTRA defines nonconformity scores as distances from the center of a normalizing flow's latent space, enabling non-rectangular prediction regions.
  • An extension trains a simple flow on residuals to add conformal prediction regions to any existing model while maintaining guaranteed coverage.
  • Both variants outperform traditional hyperrectangular and elliptical conformal methods across multiple datasets, producing sharper regions at the same coverage level.

Why It Matters

Enables reliable, compact prediction regions for multi-dimensional outputs, which matters for safety-critical ML applications.