Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks
A new theoretical framework uses convex conjugate duality to explain why SGD finds global optima and how the training data itself sets fundamental limits on achievable performance.
Researcher Binchuan Qi proposes Conjugate Learning Theory, a new theoretical framework for deep learning. It proves that training deep neural networks with mini-batch SGD reaches global optima when the eigenvalues of a structure matrix and the gradient energy are suitably controlled. The theory also establishes a model-agnostic lower bound on empirical risk, showing that the data itself determines the limits of trainability, and it derives deterministic bounds on generalization error that quantify the impact of information loss, maximum loss, and feature entropy.
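For context, the convex conjugate behind the framework's name is the standard Legendre-Fenchel transform; only the textbook definition is shown here, not the paper's specific use of it:

```latex
% Fenchel (convex) conjugate of f, and the biconjugation identity.
f^{*}(y) = \sup_{x \in \mathbb{R}^{n}} \big( \langle x, y \rangle - f(x) \big),
\qquad
f^{**} = f \ \text{ if } f \text{ is convex and lower semicontinuous.}
```

As a minimal sketch of the trainability claim, and explicitly not the paper's algorithm: the two quantities the summary names can be tracked during mini-batch SGD. Here the "structure matrix" is stood in for by an NTK-style batch Gram matrix J J^T of per-example output Jacobians, and "gradient energy" by the squared norm of the mini-batch gradient; the toy model, variable names, and this reading of the terms are all assumptions made for illustration.

```python
"""Illustrative only: monitor gradient energy and the smallest eigenvalue of an
NTK-style Gram matrix J J^T (a stand-in for the theory's structure matrix)
while training a tiny MLP with mini-batch SGD."""
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data.
n, d, h = 64, 5, 32
X = rng.normal(size=(n, d))
y = np.sin(X @ rng.normal(size=d))

# Two-layer MLP: f(x) = w2 . tanh(W1 x).
W1 = rng.normal(size=(h, d)) / np.sqrt(d)
w2 = rng.normal(size=h) / np.sqrt(h)

def forward(X, W1, w2):
    Z = np.tanh(X @ W1.T)            # (n, h) hidden activations
    return Z @ w2, Z                 # predictions and cached activations

def jacobian(X, W1, w2):
    """Per-example Jacobian of predictions w.r.t. all parameters, shape (n, p)."""
    Z = np.tanh(X @ W1.T)
    dZ = 1.0 - Z**2                                # tanh derivative
    JW1 = (dZ * w2)[:, :, None] * X[:, None, :]    # (n, h, d): df_i / dW1
    return np.concatenate([JW1.reshape(len(X), -1), Z], axis=1)

lr, batch = 0.1, 16
for step in range(201):
    idx = rng.choice(n, size=batch, replace=False)
    Xb, yb = X[idx], y[idx]
    pred, Z = forward(Xb, W1, w2)
    r = pred - yb                                   # residuals
    # Gradients of the mean-squared error on the batch.
    gw2 = Z.T @ r * (2 / batch)
    gW1 = ((1 - Z**2) * (r[:, None] * w2) * (2 / batch)).T @ Xb
    grad_energy = (gW1**2).sum() + (gw2**2).sum()
    # Smallest eigenvalue of the batch Gram matrix K = J J^T (batch x batch).
    J = jacobian(Xb, W1, w2)
    lam_min = np.linalg.eigvalsh(J @ J.T).min()
    W1 -= lr * gW1
    w2 -= lr * gw2
    if step % 50 == 0:
        loss = ((forward(X, W1, w2)[0] - y)**2).mean()
        print(f"step {step:3d}  loss {loss:.4f}  |grad|^2 {grad_energy:.5f}  "
              f"lambda_min(JJ^T) {lam_min:.4f}")
```

Under this NTK-style reading the link between the two monitored quantities is elementary: the batch gradient equals (2/b) J^T r, so the gradient energy is (4/b^2) r^T J J^T r >= (4/b^2) lambda_min ||r||^2. If lambda_min stays bounded away from zero along the trajectory, the gradient cannot vanish until the residuals do, which is the usual route to global-convergence guarantees of this general kind.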
Why It Matters
Provides a unified theoretical foundation for understanding and improving training stability and model performance across architectures.