Adaptive Estimation and Inference in Semi-parametric Heterogeneous Clustered Multitask Learning via Neyman Orthogonality
Recovers latent clusters with oracle-like accuracy, even with heterogeneous nuisance parameters.
A new paper accepted at ICML 2026 tackles clustered multitask learning in a semiparametric setting where tasks share a latent cluster structure but differ in nuisance parameters (e.g., infinite-dimensional components). Authors Hanxiao Chen and Debarghya Mukherjee propose an adaptive fused orthogonal estimator that combines Neyman-orthogonal losses with pairwise fusion penalties calibrated via task-specific pilot estimates. This approach mitigates nuisance-parameter estimation error and allows exact latent cluster recovery with high probability.
Theoretically, the estimator attains pooled parametric convergence rates that improve with cluster size, along with asymptotic normality, matching an oracle that knows the true clusters. Empirically, it consistently outperforms baselines across simulations. In a real-world application estimating electricity price elasticity from U.S. residential energy consumption data, it revealed meaningful regional clusters, demonstrating practical utility for economists and energy policymakers.
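To make the pipeline concrete, here is a minimal Python sketch of the general idea, not the authors' algorithm. It assumes a partially linear model per task (outcome = theta * treatment + a task-specific nuisance function + noise) and uses a cross-fitted partialling-out estimate, a standard Neyman-orthogonal construction, as the task-level pilot. The adaptive fusion penalty is replaced by a crude stand-in: tasks are grouped by cutting the sorted pilot estimates wherever the gap exceeds a threshold, and each group is then re-estimated on pooled residuals. The simulated data, the polynomial nuisance learner, the threshold value, and all function names are illustrative assumptions.

```python
import numpy as np

def poly_basis(x, degree=5):
    # Polynomial basis used as a simple (assumed) nuisance learner.
    return np.vander(x, degree + 1, increasing=True)

def orthogonal_residuals(y, d, x, n_folds=2, seed=0):
    """Cross-fitted residuals of y and d after regressing each on x."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(y)) % n_folds
    ry, rd = np.empty_like(y), np.empty_like(d)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        b_train, b_test = poly_basis(x[train]), poly_basis(x[test])
        coef_y = np.linalg.lstsq(b_train, y[train], rcond=None)[0]
        coef_d = np.linalg.lstsq(b_train, d[train], rcond=None)[0]
        ry[test] = y[test] - b_test @ coef_y
        rd[test] = d[test] - b_test @ coef_d
    return ry, rd

def slope(ry, rd):
    # Residual-on-residual regression: the partialling-out estimate of theta.
    return float(rd @ ry / (rd @ rd))

def cluster_by_gaps(pilots, threshold):
    """Stand-in for fusion: split sorted pilots wherever the gap is large."""
    order = np.argsort(pilots)
    labels = np.empty(len(pilots), dtype=int)
    current = 0
    labels[order[0]] = current
    for prev, nxt in zip(order[:-1], order[1:]):
        if pilots[nxt] - pilots[prev] > threshold:
            current += 1
        labels[nxt] = current
    return labels

# Simulated tasks: two latent clusters, with nuisance functions that differ by task.
rng = np.random.default_rng(42)
true_theta = np.array([-1.0, -1.0, -1.0, 0.5, 0.5, 0.5])
n = 500
residuals = []
for t, theta in enumerate(true_theta):
    x = rng.uniform(-1, 1, n)
    d = np.cos((1 + 0.2 * t) * x) + rng.normal(size=n)
    y = theta * d + np.sin((1 + 0.3 * t) * x) + 0.5 * t * x**2 + rng.normal(size=n)
    residuals.append(orthogonal_residuals(y, d, x, seed=t))

# Task-specific pilot estimates, then grouping, then pooled re-estimation.
pilots = np.array([slope(ry, rd) for ry, rd in residuals])
labels = cluster_by_gaps(pilots, threshold=0.5)

pooled = {}
for c in np.unique(labels):
    members = [residuals[t] for t in np.flatnonzero(labels == c)]
    ry = np.concatenate([m[0] for m in members])
    rd = np.concatenate([m[1] for m in members])
    pooled[int(c)] = slope(ry, rd)

print("pilot estimates:", np.round(pilots, 3))
print("cluster labels:", labels)
print("pooled estimates:", {c: round(v, 3) for c, v in pooled.items()})
```

The pooled step is where the benefit of clustering shows up in this toy setting: each cluster-level estimate uses the residuals of all its member tasks, so it is based on more observations than any single-task pilot. A genuine fused estimator would instead solve one penalized problem over all tasks, with pairwise penalties weighted by the pilot differences.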
- Proposes an adaptive fused orthogonal estimator that uses Neyman-orthogonal losses for latent cluster recovery
- Achieves exact cluster recovery with high probability and pooled parametric rates that improve with cluster size
- Outperforms strong baselines on simulations and real U.S. residential energy consumption data (electricity price elasticity)
Why It Matters
Enables robust multitask learning when nuisance parameters differ across tasks, improving clustering accuracy for high-impact applications like energy policy.