Bifurcations collapse neural tangent kernel to rank-one, simplifying RNN training

arXiv q-bio.NC May 14, 2026

⚡ New theory reveals training dynamics simplify drastically near critical transitions in RNNs.

Deep Dive

A new paper from James Hazelden and Eric Shea-Brown (University of Washington) develops a local theory of gradient descent near bifurcations, the qualitative changes in a recurrent network's dynamics. They introduce the empirical state-space neural tangent kernel (sNTK) and show that as a network approaches a bifurcation, the sNTK collapses to a rank-one operator. This collapse dominates the learning landscape: gradient descent is funneled into a few critical dynamical directions, so the loss geometry can be predicted from classical normal form theory. In a student-teacher RNN experiment, the first learned bifurcation coincided with a sharp drop in sNTK effective rank, and the dominant parameter direction matched the scalar pitchfork normal form.
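
The rank collapse can be illustrated on a toy problem. The sketch below is not the paper's construction. It assumes a hypothetical two-unit map x -> tanh(W x) whose symmetric weights have eigenvalues 1 + eps and 0.2, so that eps -> 0 approaches a pitchfork bifurcation of the origin. It also assumes the state-space kernel can be stood in for by the Gram matrix K = J J^T of the fixed-point sensitivity J = dx*/dW, and it measures effective rank with the participation ratio tr(K)^2 / tr(K^2); the paper's sNTK and rank measure may be defined differently. All function names are illustrative.

```python
import math


def fixed_point(W, x0=(0.5, 0.5), steps=20000):
    """Iterate x <- tanh(W x) until it settles on a stable fixed point."""
    x = list(x0)
    for _ in range(steps):
        x = [math.tanh(W[i][0] * x[0] + W[i][1] * x[1]) for i in range(2)]
    return x


def state_kernel(W):
    """Gram matrix K = J J^T of the fixed-point sensitivity J = dx*/dW.

    Assumption: this 2x2 matrix is only a stand-in for the paper's sNTK.
    """
    x = fixed_point(W)
    # At a fixed point x = tanh(W x), so tanh'(W x) = 1 - x^2.
    s = [1.0 - xi * xi for xi in x]
    # M = I - Df, where Df = diag(s) W is the state Jacobian of the map.
    M = [[(1.0 if i == j else 0.0) - s[i] * W[i][j] for j in range(2)]
         for i in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    Minv = [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]
    # B[i][p] = d f_i / d W_jk with p = 2*j + k; nonzero only when j == i.
    B = [[s[i] * x[p % 2] if p // 2 == i else 0.0 for p in range(4)]
         for i in range(2)]
    # Implicit function theorem: J = (I - Df)^{-1} B.
    J = [[sum(Minv[i][m] * B[m][p] for m in range(2)) for p in range(4)]
         for i in range(2)]
    return [[sum(J[i][p] * J[j][p] for p in range(4)) for j in range(2)]
            for i in range(2)]


def effective_rank(K):
    """Participation ratio tr(K)^2 / tr(K^2); equals 1 for a rank-one K."""
    trace = K[0][0] + K[1][1]
    trace_sq = sum(K[i][j] * K[j][i] for i in range(2) for j in range(2))
    return trace * trace / trace_sq


def coupled_weights(eps, lam_minus=0.2):
    """Symmetric 2x2 weights with eigenvalues 1 + eps and lam_minus."""
    lam_plus = 1.0 + eps
    a, c = (lam_plus + lam_minus) / 2, (lam_plus - lam_minus) / 2
    return [[a, c], [c, a]]


for eps in (1.0, 0.1, 0.01):
    print("eps =", eps, "effective rank =",
          effective_rank(state_kernel(coupled_weights(eps))))
```

In this toy the ratio should sit near 2 far from the bifurcation and approach 1 as eps shrinks, because the factor (I - Df)^{-1} amplifies the critical direction without bound as the leading eigenvalue of Df nears 1.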

The authors also demonstrate that low-rank natural gradient methods can resolve the learning instability that arises near bifurcations with very little computational overhead over standard SGD. This provides a principled way to stabilize training of recurrent models that must learn to pass through phase transitions (e.g., in time-series forecasting, motor control, or neural dynamics). The work bridges dynamical systems and deep learning theory, offering a tractable mathematical framework for understanding feature learning in time-dependent tasks.
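
One way to picture why a low-rank correction can be cheap is sketched below. This is a generic rank-one damped Gauss-Newton step, not the authors' algorithm: it assumes a squared loss (where the Gauss-Newton matrix J^T J coincides with the Fisher matrix under a Gaussian likelihood), approximates that matrix by its leading eigenpair plus a damping term, and inverts the result in closed form with the Sherman-Morrison identity. The Jacobian `J`, the `damping` value, and every function name are hypothetical choices made for illustration.

```python
import math


def matvec(J, v):
    """Return J v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in J]


def rmatvec(J, u):
    """Return J^T u."""
    return [sum(J[i][j] * u[i] for i in range(len(J)))
            for j in range(len(J[0]))]


def top_direction(J, iters=50):
    """Power iteration on J^T J: leading eigenvalue and unit eigenvector.

    Assumes the uniform start vector is not orthogonal to that eigenvector.
    """
    n = len(J[0])
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        w = rmatvec(J, matvec(J, v))
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    return lam, v


def rank_one_step(J, grad, damping):
    """Apply (lam v v^T + damping I)^{-1} to grad via Sherman-Morrison."""
    lam, v = top_direction(J)
    gv = sum(g * x for g, x in zip(grad, v))
    shrink = lam / (damping + lam)
    return [(g - shrink * gv * x) / damping for g, x in zip(grad, v)]


def fit(J, y, lr, steps, precondition, damping=1.0):
    """Minimize 0.5 * ||J theta - y||^2 from theta = 0."""
    theta = [0.0] * len(J[0])
    for _ in range(steps):
        residual = [p - t for p, t in zip(matvec(J, theta), y)]
        grad = rmatvec(J, residual)
        step = rank_one_step(J, grad, damping) if precondition else grad
        theta = [t - lr * s for t, s in zip(theta, step)]
    return theta


# Hypothetical Jacobian with one stiff direction (curvature 10000)
# and one soft direction (curvature 1), mimicking a dominant rank-one mode.
J = [[60.0, 80.0], [0.8, -0.6]]
target = [1.0, -1.0]
y = matvec(J, target)


def error(theta):
    """Euclidean distance from the parameters that generated y."""
    return math.sqrt(sum((t - s) ** 2 for t, s in zip(theta, target)))


# Plain gradient descent needs lr < 2 / 10000 to stay stable.
gd_error = error(fit(J, y, lr=1e-4, steps=20, precondition=False))
# The rank-one correction rescales the stiff direction, so lr = 1 is usable.
ng_error = error(fit(J, y, lr=1.0, steps=20, precondition=True))
print("plain gradient descent error:", gd_error)
print("rank-one preconditioned error:", ng_error)
```

The extra cost per step here is a handful of Jacobian-vector products for the power iteration, which is the sense in which a rank-one correction adds little overhead. The damping should be on the order of the remaining curvature; in this toy that curvature is 1 by construction, whereas in a real model it would have to be estimated or tuned.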

Key Points

Near codimension-1 bifurcations, the state-space NTK reduces to a rank-one operator, dramatically simplifying training dynamics.
In student-teacher RNN experiments, the effective rank of sNTK collapsed as the network learned its first bifurcation.
Low-rank natural gradient methods stabilize training near bifurcations with minimal overhead vs. SGD.

Why It Matters

Provides a principled theory for training RNNs through critical phase transitions, improving stability and interpretability.
