New AM-PPI method narrows confidence intervals for healthcare AI monitoring by up to 40%
Researchers report 10–40% narrower confidence intervals than single-predictor baselines using adaptive multi-predictor routing.
Monitoring healthcare AI after deployment requires statistically valid, label-efficient methods, but gold-standard labels from clinician chart review are expensive. Existing approaches like Prediction-Powered Inference (PPI) and Active Statistical Inference (ASI) rely on a single predictor, which is a poor fit for modern clinical pipelines that often have multiple models with varying cost and accuracy.
AM-PPI solves this by adaptively routing each instance to a cost-appropriate subset of predictors, sampling gold-standard labels in proportion to the chosen subset's residual uncertainty, and reweighting predictions to minimize variance. The method provides strong theoretical guarantees (asymptotic normality, minimum variance unbiasedness) and empirically achieves 10–40% narrower confidence intervals than single-predictor ASI in budget-constrained regimes, with no degradation where routing isn't needed.
- AM-PPI generalizes ASI to leverage multiple predictors of differing cost and accuracy via per-instance adaptive routing.
- The method uses closed-form KKT conditions to optimize three decisions: which predictor subset each instance is routed to, how gold-standard labels are sampled, and how predictions are reweighted.
- On synthetic data and three real healthcare monitoring tasks, AM-PPI produces 10–40% narrower confidence intervals than single-predictor baselines.
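To make the sample-then-reweight idea concrete, here is a minimal Python sketch of the simpler single-predictor case that AM-PPI is described as generalizing. It is not the paper's algorithm: there is no multi-predictor routing and no KKT-based optimization. The function name, the uncertainty score, the probability floor, and the synthetic data are all assumptions made for illustration. Gold labels are sampled with probability proportional to an uncertainty score, and each sampled residual is divided by its sampling probability so the estimate of the mean stays unbiased.

```python
import numpy as np

def active_mean_estimate(preds, uncertainty, label_fn, budget, rng):
    """Single-predictor, ASI-style estimate of a population mean (illustrative).

    preds       : model predictions f(X_i), shape (n,)
    uncertainty : nonnegative per-instance uncertainty scores (assumed input)
    label_fn    : hypothetical callable returning gold labels for given indices
    budget      : expected number of gold labels to collect
    """
    n = len(preds)
    # Sampling probabilities proportional to uncertainty, scaled to the budget.
    probs = budget * uncertainty / uncertainty.sum()
    # Keep probabilities in (0, 1]; the floor is an arbitrary illustrative choice.
    probs = np.clip(probs, 1e-3, 1.0)
    sampled = rng.random(n) < probs

    # Inverse-probability-weighted correction: (Y_i - f(X_i)) / pi_i if labeled.
    correction = np.zeros(n)
    idx = np.flatnonzero(sampled)
    correction[idx] = (label_fn(idx) - preds[idx]) / probs[idx]

    terms = preds + correction
    estimate = terms.mean()
    stderr = terms.std(ddof=1) / np.sqrt(n)
    # Normal-approximation 95% interval.
    ci = (estimate - 1.96 * stderr, estimate + 1.96 * stderr)
    return estimate, ci, int(sampled.sum())

# Synthetic demo (all data here is made up for illustration).
rng = np.random.default_rng(0)
n = 20_000
p = rng.uniform(0.0, 1.0, size=n)        # true event probabilities
y = rng.binomial(1, p)                    # gold labels, normally hidden
preds = p                                 # a well-calibrated predictor
uncertainty = np.sqrt(p * (1 - p)) + 0.05 # more labels where outcomes are noisier

estimate, ci, n_labels = active_mean_estimate(
    preds, uncertainty, lambda idx: y[idx], budget=2_000, rng=rng
)
print(f"estimate={estimate:.4f}, 95% CI=({ci[0]:.4f}, {ci[1]:.4f}), labels used={n_labels}")
```

In this sketch only about a tenth of the instances receive a gold label. The routing step described above would additionally choose, per instance, which predictors to pay for before this sampling step.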
Why It Matters
Enables cost-effective, statistically valid monitoring of multiple AI models in clinical settings, reducing the need for expensive manual labels.