Research & Papers

Feature repulsion and spectral lock-in reveal activation-dependent grokking dynamics

New paper tests Tian's repulsion theorem: the square activation yields a detectable spectral signature, while ReLU does not.

Deep Dive

Yongzhong Xu's new paper empirically investigates grokking (delayed generalization after memorization) in two-layer neural networks, building directly on the repulsion theorem of Tian (2025). The theorem posits that during interactive feature learning, similar features repel each other via negative off-diagonal entries in the matrix B = (F̃ᵀF̃ + ηI)⁻¹. Until now it was unclear when this mechanism becomes observable in practice, or whether it leaves a measurable spectral signature in parameter updates. Xu tests both questions on the modular addition benchmark used by Tian (M=71, K=2048, MSE loss).
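As a rough illustration of the sign rule, the sketch below builds B from a toy feature matrix and measures how often the most-similar feature pairs have a negative off-diagonal entry. The function names, the use of cosine similarity to rank pairs, the value of η, and the layout of F̃ (samples in rows, features in columns) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def repulsion_matrix(F, eta=1e-2):
    """Return B = (F^T F + eta * I)^{-1}.

    F is assumed to have shape (n_samples, n_features); eta is an
    illustrative regularization value, not the paper's setting.
    """
    gram = F.T @ F
    return np.linalg.inv(gram + eta * np.eye(gram.shape[0]))


def sign_match_rate(F, eta=1e-2, top_k=200):
    """Fraction of the top_k most-similar feature pairs with B_ij < 0.

    Similarity is cosine similarity between feature columns, which is
    an assumed choice for this sketch.
    """
    gram = F.T @ F
    B = repulsion_matrix(F, eta)
    norms = np.linalg.norm(F, axis=0)
    cos = gram / np.outer(norms, norms)
    # Enumerate each unordered pair (i < j) once.
    iu, ju = np.triu_indices(cos.shape[0], k=1)
    order = np.argsort(-cos[iu, ju])[:top_k]
    i, j = iu[order], ju[order]
    return float(np.mean(B[i, j] < 0))


# Toy usage: two nearly parallel features, so the single pair is "similar".
F_toy = np.array([[1.0, 1.0],
                  [0.0, 0.1]])
print(sign_match_rate(F_toy, top_k=1))
```

With only two features the off-diagonal entry of B is the negated Gram entry divided by a positive determinant, so the toy case always matches the sign rule; larger feature matrices are where a match rate below 1 can appear.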

The results reveal a striking structure-mechanism dissociation. The predicted sign rule holds robustly: on the top-200 most-similar feature pairs, the empirical sign-match rate rises from 0.865 to 0.985 for σ=x² and saturates at 1.000 for σ=ReLU. The spectral signature in the weight updates, by contrast, is strongly activation-dependent. With σ=x², a simple slope detector on the rolling eigengap σ₂/σ₃ of ΔW (where σ₂ and σ₃ denote the second and third singular values of ΔW, not the activation) fires in 15/15 grokking seeds at epoch 174 (IQR [173,174]) and in 0/15 non-grokking controls, with a 229× late-stage magnitude separation; the spectrum becomes rank-2. With σ=ReLU, the detector never fires and the spectrum remains effectively rank-1. This dissociation aligns with the distinction in Tian's Theorem 5 between focused (power-law) and spreading (ReLU) memorization: the sign structure of B depends only on F̃ᵀF̃, but how feature repulsion translates into weight updates depends critically on the activation derivative σ'.
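One way such an eigengap slope detector could look is sketched below. The summary does not give the paper's window length, threshold, or smoothing, so `window`, `slope_threshold`, and both function names are hypothetical choices for illustration only.

```python
import numpy as np


def eigengap(delta_w, eps=1e-12):
    """Ratio of the 2nd to the 3rd singular value of a weight update."""
    # np.linalg.svd returns singular values in descending order.
    s = np.linalg.svd(delta_w, compute_uv=False)
    return s[1] / (s[2] + eps)


def first_fire_epoch(gaps, window=5, slope_threshold=1.0):
    """Return the first epoch index where the rolling eigengap rises fast.

    gaps is a per-epoch sequence of eigengap values. The detector takes
    a rolling mean over `window` epochs, fits a line to the last `window`
    rolling values, and fires when the fitted slope exceeds
    `slope_threshold`. Returns None if it never fires.
    """
    gaps = np.asarray(gaps, dtype=float)
    if len(gaps) < 2 * window - 1:
        return None
    kernel = np.ones(window) / window
    # rolling[k] is the mean of gaps[k : k + window].
    rolling = np.convolve(gaps, kernel, mode="valid")
    t = np.arange(window)
    for end in range(window, len(rolling) + 1):
        slope = np.polyfit(t, rolling[end - window:end], 1)[0]
        if slope > slope_threshold:
            # Map the rolling index back to the last epoch it covers.
            return (end - 1) + (window - 1)
    return None


# Synthetic trace: flat eigengap, then an abrupt jump at epoch 20.
synthetic_gaps = [1.0] * 20 + [100.0] * 20
print(first_fire_epoch(synthetic_gaps))
```

In a real run, `gaps` would be filled by calling `eigengap` on ΔW at each epoch. When the spectrum is effectively rank-1, σ₂ and σ₃ are both small and their ratio carries little signal, which is consistent with a detector of this kind staying silent.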

Key Points
  • Sign rule for feature repulsion holds across activations on top-200 most-similar pairs (match rate 0.865→0.985 for σ=x², 1.000 for σ=ReLU)
  • With σ=x², a simple eigengap detector (σ₂/σ₃) identifies grokking in 15/15 seeds at epoch 174 (229× separation)
  • With σ=ReLU, the detector never fires (no spectral lock-in); spectrum remains effectively rank-1, consistent with spreading memorization

Why It Matters

Links the theoretical repulsion mechanism to an observable spectral signature and shows that this signature depends on the activation, which is critical for understanding when and why neural networks suddenly generalize.