Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise
New 'plug-and-play' method improves any classifier's robustness to bad labels without extra data or model changes.
A research team from Washington State University and Texas A&M University has introduced Conformal Margin Risk Minimization (CMRM), a novel framework designed to solve a persistent problem in machine learning: training reliable models with noisy, incorrectly labeled data. Unlike existing methods that require extra resources—such as a pre-trained feature extractor, a clean data subset, or a known 'noise transition matrix'—CMRM is a plug-and-play solution. It works by adding a single, quantile-calibrated regularization term to any standard classification loss function, focusing the model's learning on high-confidence examples while suppressing the influence of likely mislabeled ones. This approach requires no changes to the underlying training pipeline.
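To make the plug-and-play claim concrete, here is a minimal PyTorch sketch of what such an envelope could look like. Everything here is illustrative rather than the paper's actual API: the function names, the `lam` mixing weight, and the reduction scheme are all assumptions.

```python
import torch.nn.functional as F

def envelope_loss(logits, labels, sample_weight_fn,
                  base_loss_fn=F.cross_entropy, lam=1.0):
    """Wrap any standard classification loss with one extra, weighted term.

    Nothing else in the training pipeline changes: same model, optimizer,
    and data loader; only the loss call is swapped.
    """
    per_sample = base_loss_fn(logits, labels, reduction='none')  # shape [batch]
    weights = sample_weight_fn(logits, labels)  # e.g., CMRM-style batch weights
    # lam = 0 recovers the plain base loss; lam = 1 fully applies the weights
    return ((1.0 - lam) * per_sample + lam * weights * per_sample).mean()
```

The weight function that does the conformal calibration is sketched after the next paragraph.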
The core innovation is its use of conformal prediction to set a confidence threshold dynamically for each training batch. This threshold identifies which samples have a reliable margin (the gap between the model's confidence in the given label and its next-best guess). By concentrating on these high-margin samples, CMRM effectively filters out label noise during training. The team also proved a theoretical learning bound establishing the method's robustness under arbitrary label noise.
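One plausible reading of that mechanism, continuing the sketch above: per-sample margins are compared against the batch's alpha-quantile, and samples below it get zero weight. The hard 0/1 gate, the detach, and `alpha=0.1` are assumptions; the paper's exact construction may differ (a smooth gate is equally plausible).

```python
import torch
import torch.nn.functional as F

def conformal_margin_weights(logits, labels, alpha=0.1):
    """Per-sample weights from a per-batch conformal quantile of margins."""
    probs = F.softmax(logits.detach(), dim=1)  # detached: weights carry no gradient
    p_label = probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    # best competing class: mask out the given label, then take the row max
    p_rival = probs.scatter(1, labels.unsqueeze(1), float('-inf')).amax(dim=1)
    margin = p_label - p_rival                 # clean labels tend to have large margins
    threshold = torch.quantile(margin, alpha)  # conformal quantile for this batch
    return (margin >= threshold).float()       # suppress likely-mislabeled samples

# Plugged into the envelope from the previous sketch:
# loss = envelope_loss(logits, labels, sample_weight_fn=conformal_margin_weights)
```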
Empirical results are compelling. The framework was tested by enhancing five different base learning methods across six benchmark datasets with both synthetic and real-world noise. It consistently improved test accuracy (by up to +3.39%), produced more precise and certain predictions (reducing conformal prediction set sizes by up to 20.44%), and, crucially, did not degrade performance when trained on perfectly clean data. This demonstrates that CMRM captures a fundamental, method-agnostic signal about data uncertainty that previous techniques missed. The work has been accepted at AISTATS 2026.
- Acts as a universal 'envelope,' improving any base loss function without pipeline modifications or privileged knowledge.
- Uses conformal quantiles per batch to focus training on reliable samples, boosting accuracy by up to 3.39% on noisy benchmarks.
- Also improves prediction certainty, shrinking conformal prediction sets by over 20%, a sign of higher-confidence outputs (see the sketch below).
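On the prediction-set metric in the last bullet: a conformal prediction set is the smallest set of labels needed to hit a coverage target, so a smaller average set means a more decisive model. Below is a minimal split-conformal sketch using the common 1 - p(label) score; the paper's exact evaluation protocol is not described in this summary, so treat the score and the function name as assumptions.

```python
import torch

def conformal_set_sizes(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction sets and their average size.

    cal_probs:  [n_cal, n_classes] softmax scores on held-out calibration data
    cal_labels: [n_cal] integer labels
    test_probs: [n_test, n_classes] softmax scores on test data
    """
    n = cal_labels.numel()
    # nonconformity score: 1 - probability assigned to the true label
    scores = 1.0 - cal_probs.gather(1, cal_labels.unsqueeze(1)).squeeze(1)
    level = min(1.0, (n + 1) * (1.0 - alpha) / n)  # finite-sample correction
    qhat = torch.quantile(scores, level)
    sets = (1.0 - test_probs) <= qhat              # bool [n_test, n_classes]
    return sets, sets.sum(dim=1).float().mean()    # smaller mean = more certain
```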
Why It Matters
Enables more reliable AI model training on real-world, messy datasets where perfect labels are expensive or impossible to obtain.