NERO-Net: A Neuroevolutionary Approach for the Design of Adversarially Robust CNNs
Researchers use neuroevolution to design CNNs with inherent adversarial robustness, sidestepping costly adversarial training during the architecture search.
A team of researchers from the University of Coimbra, Portugal, has introduced NERO-Net, a novel neuroevolutionary framework for designing Convolutional Neural Networks (CNNs) with inherent adversarial robustness. The core innovation is a search strategy that isolates the influence of architecture on security by deliberately avoiding adversarial training during the evolutionary loop. Instead, the fitness function promotes candidate models that, even when trained with standard methods, achieve high accuracy after simulated attacks without sacrificing performance on clean data. This approach tackles a critical gap in automated neural architecture search, which has historically prioritized performance over security.
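The article does not give the exact fitness formulation, but the idea of rewarding post-attack accuracy without sacrificing clean accuracy can be sketched as a simple scoring function. The equal weighting and the clean-accuracy floor below are assumptions for illustration, not the paper's formula:

```python
def fitness(clean_acc, adv_acc, clean_floor=0.5):
    """Score a candidate architecture after *standard* (non-adversarial)
    training, as in NERO-Net's search loop.

    clean_acc  -- accuracy on unperturbed test samples (0..1)
    adv_acc    -- accuracy on the same samples after a simulated attack
    clean_floor -- hypothetical threshold rejecting candidates that trade
                   away too much clean performance (an assumption here)
    """
    if clean_acc < clean_floor:
        return 0.0  # robustness must not come at the cost of clean accuracy
    # Hypothetical equal weighting of clean and adversarial accuracy.
    return 0.5 * clean_acc + 0.5 * adv_acc
```

Any candidate that collapses on clean data scores zero, so evolution is steered toward architectures that are robust and accurate at once.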
In experiments on the CIFAR-10 dataset, focusing on L∞-robustness, the fittest model evolved by NERO-Net achieved 33% accuracy against the Fast Gradient Sign Method (FGSM) attack while maintaining 87% accuracy on unperturbed samples. This result is significant because the model was not adversarially trained during evolution. Subsequent standard training boosted these metrics to 47% adversarial and 93% clean accuracy, suggesting the discovered architecture possesses intrinsic defensive properties. When the evolved model was later subjected to adversarial training, its accuracy against the powerful AutoAttack benchmark reached 40%, showcasing the combined benefit of a robust design and robust training.
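FGSM itself is a standard one-step L∞-bounded attack: each input feature is nudged by ε in the direction of the loss gradient's sign. The sketch below applies it to a toy logistic-regression model (an assumption for self-containment; the paper attacks CNNs), where the input gradient of the cross-entropy loss has the closed form (p − y)·w:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_attack(x, y, w, eps):
    """Fast Gradient Sign Method on a toy logistic-regression model.

    For cross-entropy loss with prediction p = sigmoid(w @ x), the
    gradient of the loss w.r.t. the input x is (p - y) * w. FGSM takes
    one eps-sized step per feature in the sign of that gradient, so the
    perturbation is bounded by eps in the L-infinity norm.
    """
    p = sigmoid(w @ x)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Demo: a point confidently classified as positive...
x, w = np.array([1.0, 1.0]), np.array([2.0, -1.0])
x_adv = fgsm_attack(x, y=1.0, w=w, eps=0.6)
# ...is pushed across the decision boundary, even though every
# feature moved by at most eps (prediction flips from >0.5 to <0.5).
```

A model's "FGSM accuracy" is simply its accuracy measured on such perturbed inputs, which is the quantity NERO-Net's search rewards.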
The work represents a paradigm shift from merely training models to be robust to actively evolving architectures that are robust by design. By decoupling architectural search from adversarial training, NERO-Net provides a clearer signal for what makes a network inherently secure. This method could lead to more reliable AI systems for autonomous vehicles, medical imaging, and financial security, where adversarial examples pose a serious threat. The research, detailed in the arXiv preprint 2603.25517, opens a new avenue for building trustworthy AI from the ground up.
- NERO-Net uses neuroevolution to design CNN architectures with inherent adversarial robustness, avoiding adversarial training during the search phase.
- The best-evolved model achieved 33% accuracy against FGSM attacks and 87% clean accuracy on CIFAR-10 using only standard training.
- The work demonstrates that robust architectures can be discovered, leading to models that are more secure by design for safety-critical applications.
Why It Matters
This enables the creation of AI models that are secure by architectural design, not just by training, crucial for safety-critical systems like autonomous vehicles.