PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning
New poisoning method evades detection while degrading the victim model's main-task accuracy by less than 9%.
A research team from multiple Chinese institutions has unveiled PoiCGAN, a sophisticated new attack method that poses a significant threat to Federated Learning (FL) systems. FL is a popular distributed AI training paradigm in which multiple clients (such as smartphones or hospitals) collaboratively train a model without sharing their raw data, thereby protecting privacy. However, this distributed design leaves the system exposed when a malicious client joins a training round; PoiCGAN targets exactly that weakness by poisoning the training process in a highly stealthy way.
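To make that attack surface concrete, here is a minimal federated-averaging (FedAvg) sketch, not the paper's actual setup: the server simply averages every client's locally trained weights, so a poisoned client's update enters the global model unchecked. The names `local_update`, `fedavg_round`, and the toy linear model are illustrative assumptions.

```python
# Minimal FedAvg sketch illustrating why one malicious client matters:
# every client's update is folded into the global model without inspection.
# Hypothetical helper names -- not from the PoiCGAN paper.
import numpy as np

def local_update(global_weights: np.ndarray, data: np.ndarray,
                 labels: np.ndarray, lr: float = 0.1) -> np.ndarray:
    """One gradient step of local training on a toy linear model."""
    preds = data @ global_weights
    grad = data.T @ (preds - labels) / len(labels)
    return global_weights - lr * grad

def fedavg_round(global_weights, clients):
    """Server averages the clients' locally trained weights.
    A poisoned client's weights enter this average unchecked."""
    updates = [local_update(global_weights.copy(), X, y) for X, y in clients]
    return np.mean(updates, axis=0)

# Toy run: three honest clients; a poisoner would simply supply corrupted (X, y).
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(32, 4)), rng.normal(size=32)) for _ in range(3)]
w = np.zeros(4)
for _ in range(5):
    w = fedavg_round(w, clients)
print(w)
```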
Previous poisoning attacks were often caught by model-performance checks or anomaly-detection defenses. PoiCGAN instead manipulates a Conditional Generative Adversarial Network (CGAN): the attack modifies the inputs to the CGAN's generator and discriminator to build a 'poison generator' that produces subtly corrupted data samples and automatically flips their labels (e.g., relabeling a 'cat' as a 'dog'), all while keeping the model's overall performance on its main task nearly intact.
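The article does not spell out the exact construction, but the general pattern of a conditional generator paired with automatic label flipping can be sketched as follows. The `CondGenerator` architecture, the `flip_map`, and the `make_poison_batch` helper are illustrative assumptions, not PoiCGAN's actual components.

```python
# Generic sketch of CGAN-style poisoned-sample generation with label flipping.
# NOT the paper's exact PoiCGAN construction; all names are illustrative.
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    """Conditional generator: noise + class label -> synthetic sample."""
    def __init__(self, noise_dim=64, num_classes=10, out_dim=784):
        super().__init__()
        self.embed = nn.Embedding(num_classes, num_classes)
        self.net = nn.Sequential(
            nn.Linear(noise_dim + num_classes, 256),
            nn.ReLU(),
            nn.Linear(256, out_dim),
            nn.Tanh(),
        )

    def forward(self, z, labels):
        cond = self.embed(labels)                 # condition on the source class
        return self.net(torch.cat([z, cond], dim=1))

def make_poison_batch(gen, batch_size, num_classes=10, noise_dim=64,
                      flip_map=None):
    """Generate samples conditioned on a source class, then flip each label
    to a target class (e.g. 'cat' -> 'dog') before local training uses them."""
    if flip_map is None:
        # Assumed flip rule for illustration: map each class to the next one.
        flip_map = {c: (c + 1) % num_classes for c in range(num_classes)}
    z = torch.randn(batch_size, noise_dim)
    src = torch.randint(0, num_classes, (batch_size,))
    samples = gen(z, src)
    flipped = torch.tensor([flip_map[int(c)] for c in src])
    return samples, flipped

gen = CondGenerator()
x_poison, y_poison = make_poison_batch(gen, batch_size=16)
print(x_poison.shape, y_poison[:5])
```

In a real attack the generator would be trained so the corrupted samples stay close to the benign data distribution, which is what keeps the main-task accuracy nearly unchanged.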
Experiments across multiple datasets demonstrated the attack's potency and stealth. PoiCGAN achieved an attack success rate 83.97% higher than existing baseline methods. Crucially, it did so while degrading the model's accuracy on its intended classification task by less than 8.87%, making the malicious activity difficult to spot through standard performance monitoring. The poisoned samples and the resulting compromised models exhibit what the authors term 'high stealthiness', potentially allowing them to bypass current defensive filters in FL frameworks.
- PoiCGAN achieves an 83.97% higher attack success rate than previous poisoning methods for Federated Learning.
- The attack reduces the main model's accuracy by less than 8.87%, making it highly stealthy and hard to detect via performance drops.
- It uses a modified Conditional GAN (CGAN) to generate poisoned data and automatically perform label flipping, evading anomaly-based defenses.
Why It Matters
This work exposes a critical security gap in privacy-preserving AI, pushing developers to build stronger defenses for distributed systems deployed in domains such as healthcare and finance.