Pre-training improves high-dimensional kernel density estimation
Pre-training, a staple of modern AI, is now being applied to classic statistical density estimation, with promising results.
A new paper from Zhang and Deng applies the pre-training paradigm, borrowed from modern AI, to the classic problem of kernel density estimation (KDE) in high dimensions. Traditional KDE struggles there because fixed or globally tuned kernels cannot adapt to local data density. The authors propose a neural network, pre-trained on a family of distributions, that outputs a location-adaptive bandwidth for each sample point. The result is an efficient, adaptive KDE that is reported to improve accuracy significantly when the target distribution resembles the pre-training family.
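To make the idea of per-sample bandwidths concrete, here is a minimal sketch of a sample-point adaptive KDE with isotropic Gaussian kernels. It is an illustration under stated assumptions, not the authors' implementation: the function names `knn_bandwidths` and `adaptive_kde` are hypothetical, and the bandwidth rule used here (distance to the k-th nearest neighbour) is a classic heuristic standing in for the pre-trained network, whose architecture the summary does not describe.

```python
import numpy as np

def knn_bandwidths(samples, k=5):
    """Hypothetical stand-in for the pre-trained network: returns one
    bandwidth per sample point, set to the distance to its k-th nearest
    neighbour. This is a classic heuristic, NOT the authors' model."""
    diffs = samples[:, None, :] - samples[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero distance to the point itself.
    return np.sort(dists, axis=1)[:, k]

def adaptive_kde(queries, samples, bandwidths):
    """Sample-point adaptive KDE with isotropic Gaussian kernels:
    f(x) = (1/n) * sum_i N(x; x_i, h_i^2 * I)."""
    n, d = samples.shape
    diffs = queries[:, None, :] - samples[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)                          # shape (m, n)
    log_norm = -0.5 * d * np.log(2.0 * np.pi * bandwidths ** 2)   # shape (n,)
    log_kernels = log_norm[None, :] - 0.5 * sq_dists / bandwidths[None, :] ** 2
    # Log-sum-exp trick keeps the sum stable when kernels are tiny.
    peak = log_kernels.max(axis=1, keepdims=True)
    return np.exp(peak[:, 0]) * np.exp(log_kernels - peak).sum(axis=1) / n

# Usage: estimate a 3-dimensional density at a few query points.
rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 3))
h = knn_bandwidths(samples, k=5)        # one bandwidth per sample point
density = adaptive_kde(rng.normal(size=(4, 3)), samples, h)
print(density)
```

In the approach the article describes, the call to `knn_bandwidths` would be replaced by a forward pass through the pre-trained network; the density evaluation itself is the standard variable-bandwidth KDE formula.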
Numerical experiments confirm the strategy's effectiveness, with substantial gains over standard KDE. However, when the target distribution diverges from the pre-training family, the benefit diminishes. To address this, the authors add a fine-tuning step that re-adapts the network to the new data, restoring performance. The work bridges non-parametric statistics and deep learning, offering a practical way to leverage pre-trained models for density estimation in areas like anomaly detection, finance, and scientific computing.
- A pre-trained neural network recommends a location-adaptive kernel bandwidth for each sample point, enhancing KDE accuracy in high dimensions.
- Method excels when target distribution is similar to the pre-training family; accuracy drops for mismatched distributions.
- A fine-tuning procedure restores the benefits of pre-training when the target distribution differs from the pre-training family, making the approach more robust to distribution shift.
Why It Matters
The work brings AI's pre-training paradigm to non-parametric statistics, which could enable better density estimation on high-dimensional real-world data.