Fourier analysis reveals why neural networks learn phase information so slowly

arXiv stat.ML May 19, 2026

New study shows online SGD needs at least N^3 steps to learn phase features from isotropic inputs

Deep Dive

A new theoretical paper from Fabiola Ricci, Claudia Merger, and Sebastian Goldt (arXiv:2605.16913, submitted to ICML 2026) takes a Fourier perspective on the learning dynamics of neural networks, specifically their well-known simplicity bias. The authors show that neural networks trained with gradient descent exploit different Fourier components of the data in sequence: they rely first on amplitude information (which captures pairwise correlations between pixels) and only later on phase information (which encodes edges and higher-order correlations). The authors verify this ordering experimentally on simple image classification tasks.
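To make the amplitude/phase split concrete, here is a minimal NumPy sketch. It is not the authors' code: the function names `amplitude_and_phase` and `phase_scrambled` and the toy square image are assumptions chosen for illustration. It builds a surrogate image that keeps the amplitude spectrum (and therefore the pairwise pixel correlations) of the original while replacing its phases, which destroys the edges.

```python
import numpy as np


def amplitude_and_phase(image):
    """Split a real 2-D image into Fourier amplitude and phase (hypothetical helper)."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def phase_scrambled(image, rng):
    """Keep the amplitude spectrum of `image` but replace its phases (hypothetical helper)."""
    amplitude, _ = amplitude_and_phase(image)
    # Borrow phases from real white noise so the spectrum stays
    # conjugate-symmetric and the inverse transform is real up to rounding.
    noise_phase = np.angle(np.fft.fft2(rng.standard_normal(image.shape)))
    return np.fft.ifft2(amplitude * np.exp(1j * noise_phase)).real


# Toy input (an assumption): a bright square on a dark background, so it has sharp edges.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0

# Same amplitude spectrum as `image`, but the square and its edges are gone.
surrogate = phase_scrambled(image, np.random.default_rng(0))
```

A classifier that only uses amplitude information cannot tell `image` from `surrogate`; separating them requires the phase information that, according to the paper, networks pick up later in training.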

More strikingly, the paper rigorously proves that for isotropic, high-dimensional inputs, online SGD requires at least N^3 steps to even distinguish phase-structured inputs from noise, a genuine hardness result. The key insight is that natural images break isotropy: their power-law spectra dramatically accelerate phase learning, even when the spectra do not help classification directly. Simulations with two-layer networks on textures and with deep CNNs on ImageNet and CIFAR-100 confirm that this non-trivial interaction between amplitude and phase is central to how deep networks efficiently learn natural image distributions, and it offers a mechanistic explanation for their strong real-world performance.
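The contrast between isotropic inputs and inputs with a power-law spectrum can be sketched by shaping white Gaussian noise in Fourier space. This is an illustrative construction, not the paper's experimental setup: the helper names, the amplitude envelope 1/|k|^alpha, and the exponent value used below are all assumptions.

```python
import numpy as np


def radial_frequency(size):
    """Distance of each 2-D FFT bin from the zero frequency (hypothetical helper)."""
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    return np.sqrt(fx**2 + fy**2)


def power_law_noise(size, alpha, rng):
    """Gaussian image whose Fourier amplitude decays like 1/|k|**alpha.

    alpha = 0 gives white (isotropic) noise; alpha > 0 concentrates
    variance in low spatial frequencies. The exponent is an assumption.
    """
    radius = radial_frequency(size)
    radius[0, 0] = 1.0           # placeholder to avoid dividing by zero at DC
    envelope = radius ** (-alpha)
    envelope[0, 0] = 0.0         # drop the DC component, so the mean is zero
    white = rng.standard_normal((size, size))
    # The envelope is real and symmetric in k, so the filtered image is real.
    shaped = np.fft.ifft2(np.fft.fft2(white) * envelope).real
    return shaped / shaped.std()  # normalise to unit variance


rng = np.random.default_rng(0)
white_sample = power_law_noise(64, alpha=0.0, rng=rng)   # flat spectrum
shaped_sample = power_law_noise(64, alpha=1.0, rng=rng)  # power-law spectrum
```

Both samples are Gaussian and carry no phase structure of their own; the sketch only shows how a power-law spectrum breaks the isotropy of the input covariance, which is the ingredient the paper identifies as speeding up phase learning.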

Key Points

Neural networks exhibit a strong simplicity bias: they learn amplitude features (pairwise pixel correlations) before phase features (edges, higher-order correlations).
For isotropic high-dimensional inputs, online SGD requires at least N^3 steps to learn phase information; power-law spectra can dramatically accelerate this process.
Experiments on two-layer networks and deep CNNs (ImageNet, CIFAR-100) confirm that natural image power-law spectra enable efficient phase learning, explaining deep network performance on real-world images.

Why It Matters

Explains why deep networks excel on natural images: power-law spectra accelerate learning of critical phase features like edges.
