Active inference framework boosts U-statistics efficiency under label budgets
New method cuts labeling costs while preserving statistical validity in U-statistics.
Wang et al. introduce an active inference framework for U‑statistics that selectively queries the most informative labels under a fixed budget. Their augmented inverse probability weighting (AIPW) estimator combines the label sampling rule with machine learning predictions. They characterize the sampling rule that minimizes the estimator's variance and design practical sampling strategies. Experiments on real datasets demonstrate substantial gains in estimation efficiency over baseline methods while maintaining target coverage.
- Augmented inverse probability weighting U-statistic integrates sampling rules and ML predictions for unbiased estimation.
- Optimal sampling rule derived to minimize estimator variance under a fixed labeling budget.
- Framework extends to U‑statistic-based empirical risk minimization, showing up to 40% MSE reduction on real data.
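To make the estimator idea concrete, here is a minimal Python sketch of one natural AIPW-style U-statistic for a degree-2 kernel. It is an illustration under stated assumptions, not necessarily the exact estimator of Wang et al.: labels are assumed to be queried independently with known probabilities `pi`, and the names `aipw_u_statistic` and `variance_kernel` are hypothetical. Each pair contributes the kernel evaluated on predictions, plus an inverse-probability-weighted correction whenever both labels were observed, which keeps the estimate unbiased under that sampling assumption.

```python
import numpy as np


def variance_kernel(a, b):
    # Degree-2 kernel whose U-statistic is the unbiased sample variance.
    return 0.5 * (a - b) ** 2


def aipw_u_statistic(y, f, xi, pi, kernel):
    """Illustrative AIPW-style U-statistic for a symmetric degree-2 kernel.

    y  : true labels (entries where xi == 0 are ignored and may be NaN)
    f  : ML predictions, available for every sample
    xi : 0/1 indicators of which labels were queried
    pi : labeling probabilities, assumed known and independent across samples
    """
    y, f, xi, pi = (np.asarray(v, dtype=float) for v in (y, f, xi, pi))
    n = len(f)

    # Kernel on predictions for every pair (i, j).
    h_pred = kernel(f[:, None], f[None, :])

    # Kernel on true labels; unlabeled entries are zeroed out because
    # their pair weight below is exactly zero anyway.
    y_safe = np.where(xi == 1, y, 0.0)
    h_true = kernel(y_safe[:, None], y_safe[None, :])

    # Pair weight xi_i * xi_j / (pi_i * pi_j): nonzero only if both labeled.
    w = xi / pi
    w_pair = w[:, None] * w[None, :]

    terms = h_pred + w_pair * (h_true - h_pred)

    # Average over ordered pairs with i != j.
    off_diag = ~np.eye(n, dtype=bool)
    return terms[off_diag].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 500
    x = rng.normal(size=n)
    y = x + rng.normal(scale=0.5, size=n)  # synthetic labels
    f = x                                   # synthetic "ML prediction"
    pi = np.full(n, 0.3)                    # uniform labeling probability
    xi = rng.binomial(1, pi)
    y_obs = np.where(xi == 1, y, np.nan)    # unqueried labels are unknown
    print(aipw_u_statistic(y_obs, f, xi, pi, variance_kernel))
```

The sketch uses a uniform `pi` for simplicity; the active part of the framework lies in choosing `pi` per sample so that the label budget is spent where the predictions are least reliable. Two sanity checks follow from the formula: if every label is queried with probability 1, the estimate reduces to the classical U-statistic on the true labels, and if no label is queried, it reduces to the U-statistic on the predictions alone.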
Why It Matters
Enables cost‑effective statistical inference with U‑statistics, critical for modern data‑hungry applications like pairwise learning and graph analysis.