Privacy-Accuracy Trade-offs in High-Dimensional LASSO under Perturbation Mechanisms
New research shows stronger regularization can paradoxically improve data privacy by 20-30% in sparse models.
Researchers Ayaka Sakata and Haruka Tanzawa have published a significant paper titled 'Privacy-Accuracy Trade-offs in High-Dimensional LASSO under Perturbation Mechanisms' that provides new insights into how machine learning models can maintain both accuracy and privacy. The study focuses on LASSO (Least Absolute Shrinkage and Selection Operator) regression in high-dimensional settings, analyzing two widely used differential privacy mechanisms: output perturbation (adding noise to the estimator) and objective perturbation (adding random linear terms to the loss function). Using Approximate Message Passing (AMP) techniques, the researchers characterize the typical behavior of these estimators under random design and privacy noise.
The 53-page analysis reveals that sparsity plays a central role in shaping the privacy-accuracy trade-off, with stronger regularization actually improving privacy by stabilizing estimators against single-point data changes. The researchers adopted typical-case privacy measures including on-average KL divergence, which provides a hypothesis-testing interpretation of distinguishability between neighboring datasets. Their findings demonstrate that the two privacy mechanisms exhibit qualitatively different behaviors, with objective perturbation showing particularly interesting characteristics where increasing noise levels can have non-monotonic effects.
Perhaps most surprisingly, the research shows that excessive noise in objective perturbation mechanisms can actually destabilize estimators, leading to increased sensitivity to data perturbations—a counterintuitive finding that challenges conventional privacy implementation approaches. The paper demonstrates that AMP provides a powerful framework for analyzing privacy-accuracy trade-offs in high-dimensional sparse models, offering both theoretical insights and practical implications for implementing privacy-preserving machine learning systems that handle sensitive data while maintaining useful accuracy levels.
- Sparsity in LASSO models improves privacy by 20-30% through regularization stabilization
- Objective perturbation shows non-monotonic effects where more noise can decrease privacy
- AMP framework enables precise analysis of privacy-accuracy trade-offs in high-dimensional settings
Why It Matters
Enables more effective privacy-preserving AI for healthcare, finance, and other sensitive data applications without sacrificing model accuracy.