New RMT framework predicts when NNs can learn XOR via spikes
Quadratic equivalents reveal phase transitions in nonlinear feature maps for XOR classification.
A new paper from Collin Cranston, Zhichao Wang, Todd Kemp, and Michael W. Mahoney tackles a fundamental question: when can a neural network's feature map be used for linear classification of nonlinearly separable data? They focus on the conjugate kernel (CK) of a feedforward network applied to the classic XOR problem.
The authors develop a robust quadratic equivalent to the spiked CK matrix, extending deterministic equivalents from random matrix theory (RMT). This lets them analyze when informative outlier eigenvalues emerge and how well the corresponding eigenvectors align with the XOR labels. They derive precise BBP-type (Baik-Ben Arous-Péché) phase transitions as functions of sample complexity, signal-to-noise ratio (SNR), activation function, and pretrained features. The result is a theoretical framework that predicts when linear methods applied to CK eigenvectors can succeed.
The work connects classical RMT to practical deep learning, providing tools for understanding when nonlinearly separable structure is learnable in high dimensions.
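As a rough numerical illustration of the spike picture described above (not the authors' method or code), the following NumPy sketch builds a finite-width conjugate kernel from random Gaussian first-layer weights on a synthetic XOR Gaussian mixture, then measures how strongly any of the leading eigenvectors aligns with the labels. The data model, the function names, the centered quadratic activation `t**2 - 1`, and all sizes are assumptions chosen for illustration.

```python
import numpy as np


def xor_mixture(n, d, signal, rng):
    """Balanced XOR Gaussian mixture (assumed data model).

    Class +1 is centered at +/-mu1 and class -1 at +/-mu2, with mu1
    orthogonal to mu2, so no single hyperplane separates the class means.
    """
    mu1 = np.zeros(d)
    mu1[0] = signal
    mu2 = np.zeros(d)
    mu2[1] = signal
    quarter = n // 4
    means = np.vstack([mu1, -mu1, mu2, -mu2])
    X = np.repeat(means, quarter, axis=0) + rng.standard_normal((4 * quarter, d))
    y = np.repeat([1.0, 1.0, -1.0, -1.0], quarter)
    return X, y


def conjugate_kernel(X, width, activation, rng):
    """Empirical CK: Gram matrix of one random-feature layer."""
    n, d = X.shape
    W = rng.standard_normal((width, d))        # random first-layer weights
    F = activation(W @ X.T / np.sqrt(d))       # features, shape (width, n)
    return F.T @ F / width                     # shape (n, n)


def label_alignment(K, y, top_k=5):
    """Largest |cosine| between y and any of the top_k eigenvectors of K."""
    _, eigvecs = np.linalg.eigh(K)             # eigenvalues in ascending order
    top = eigvecs[:, -top_k:]
    return float(np.max(np.abs(top.T @ y)) / np.linalg.norm(y))


def alignment_at(signal, n=400, d=50, width=1000, seed=0):
    """Hypothetical helper: one draw of data and weights, then alignment."""
    rng = np.random.default_rng(seed)
    X, y = xor_mixture(n, d, signal, rng)
    K = conjugate_kernel(X, width, lambda t: t**2 - 1.0, rng)
    return label_alignment(K, y)


if __name__ == "__main__":
    for signal in (0.0, 6.0):
        print(f"signal={signal}: alignment={alignment_at(signal):.2f}")
```

Under these assumed settings, the alignment should sit near chance level when the signal is zero and rise markedly when the cluster means are far from the origin. The exact values depend on the seed, the sizes, and the activation, and the sketch makes no attempt to locate the thresholds derived in the paper.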
- Develops quadratic equivalents for conjugate kernels on the XOR problem, enabling precise analysis of spike emergence.
- Derives BBP-type phase transitions depending on sample complexity, SNR, activation choice, and pretrained features.
- Shows conditions under which linear classification via CK eigenvectors becomes possible for nonlinearly separable data.
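For context, the classical BBP transition that these results generalize can be made concrete in its simplest setting, a rank-one spiked covariance model with identity noise. The sketch below illustrates only that textbook case, not the CK-specific thresholds from the paper; the function names and sizes are hypothetical. Here `gamma` is the aspect ratio p / n and `spike` is the population eigenvalue of the planted direction: an outlier separates from the bulk only when `spike` exceeds `1 + sqrt(gamma)`.

```python
import numpy as np


def bbp_prediction(spike, gamma):
    """Classical BBP limit of the top sample-covariance eigenvalue.

    spike: population eigenvalue of the planted direction (all others are 1).
    gamma: aspect ratio p / n.
    """
    if spike > 1.0 + np.sqrt(gamma):
        return spike * (1.0 + gamma / (spike - 1.0))   # outlier detaches
    return (1.0 + np.sqrt(gamma)) ** 2                 # stays at the bulk edge


def top_sample_eigenvalue(spike, p, n, rng):
    """Top eigenvalue of the sample covariance under a rank-one spike."""
    Z = rng.standard_normal((p, n))
    Z[0] *= np.sqrt(spike)            # plant the spike along coordinate 0
    S = Z @ Z.T / n
    return float(np.linalg.eigvalsh(S)[-1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n = 200, 800
    for spike in (1.0, 4.0):
        observed = top_sample_eigenvalue(spike, p, n, rng)
        predicted = bbp_prediction(spike, p / n)
        print(f"spike={spike}: observed={observed:.2f}, predicted={predicted:.2f}")
```

At finite p and n the observed eigenvalue fluctuates around the asymptotic prediction, so the comparison is approximate.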
Why It Matters
The results give a theoretical account of when a network's feature map makes nonlinearly separable data linearly classifiable, which can inform both interpretability work and network design.