From Shallow Bayesian Neural Networks to Gaussian Processes: General Convergence, Identifiability and Scalable Inference
New research establishes general convergence from BNNs to GPs and introduces a scalable MAP inference procedure using Nyström approximation.
Researchers Gracielle Antunes de Araújo and Flávio B. Gonçalves have published a significant theoretical paper establishing a general convergence framework connecting shallow Bayesian Neural Networks (BNNs) to Gaussian Processes (GPs). Their work relaxes assumptions made in earlier results and compares alternative parameterizations of the limiting GP model, providing a more comprehensive mathematical foundation for how BNNs behave as their width grows. Building on this theory, they introduce a novel covariance function defined as a convex mixture of components induced by four widely used activation functions, including ReLU, tanh, and sigmoid, and they characterize its key properties: positive definiteness and both strict and practical identifiability under different input designs.
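The paper's exact covariance function is not reproduced here, but the convex-mixture idea can be sketched with two single-hidden-layer limiting kernels that do have well-known closed forms: the arc-cosine kernel of degree 1 for ReLU (Cho & Saul) and Williams' kernel for the erf activation, a tanh-like sigmoid. The mixture weights and test points below are illustrative assumptions, not values from the paper; the point is that nonnegative weights summing to one preserve positive semi-definiteness.

```python
import numpy as np

def relu_kernel(x, y):
    # Arc-cosine kernel of degree 1 (Cho & Saul): limiting covariance of a
    # wide single-hidden-layer ReLU network with standard normal weights.
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    t = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny * (np.sin(t) + (np.pi - t) * np.cos(t)) / (2 * np.pi)

def erf_kernel(x, y):
    # Williams' closed-form kernel for the erf activation, a tanh-like
    # sigmoidal nonlinearity.
    xa, ya = np.append(x, 1.0), np.append(y, 1.0)  # absorb a unit bias
    num = 2.0 * (xa @ ya)
    den = np.sqrt((1.0 + 2.0 * xa @ xa) * (1.0 + 2.0 * ya @ ya))
    return (2.0 / np.pi) * np.arcsin(num / den)

def mixture_kernel(x, y, w=(0.6, 0.4)):
    # Convex mixture (illustrative weights): nonnegative weights summing to
    # one, so the mixture inherits positive semi-definiteness from its parts.
    return w[0] * relu_kernel(x, y) + w[1] * erf_kernel(x, y)

rng = np.random.default_rng(0)
X = rng.normal(size=(25, 3))
K = np.array([[mixture_kernel(a, b) for b in X] for a in X])
# Smallest eigenvalue is nonnegative up to floating-point rounding.
print(np.linalg.eigvalsh(K).min())
```

The same check run over many random input designs is, in spirit, how positive definiteness can be verified empirically; identifiability of the mixture weights is the harder question the paper addresses analytically.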
For practical implementation, the researchers developed a scalable maximum a posteriori (MAP) training and prediction procedure using a Nyström approximation, demonstrating how the Nyström rank and anchor selection control the cost-accuracy trade-off. Their experiments on controlled simulations and real-world tabular datasets show stable hyperparameter estimates and competitive predictive performance at realistic computational costs, making the theoretical convergence results practically applicable. This work bridges theoretical machine learning with practical implementation concerns, offering both mathematical rigor and computational feasibility for researchers working at the intersection of Bayesian methods and neural networks.
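A minimal sketch of Nyström-style low-rank GP prediction conveys the cost-accuracy trade-off described above. This is not the authors' implementation: the RBF kernel stands in for their mixture covariance, anchors are chosen uniformly at random, and the function names and the rank `m` are assumptions for illustration.

```python
import numpy as np

def rbf(A, B, ls=0.2):
    # Squared-exponential kernel as a stand-in for the paper's covariance.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def nystrom_predict(X, y, Xstar, m=50, noise=1e-2, seed=0):
    # Nyström / subset-of-regressors sketch: pick m anchor points, use the
    # low-rank approximation K ~ Knm Kmm^{-1} Kmn, and solve an m x m
    # system, giving O(n m^2) cost instead of the exact GP's O(n^3).
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), size=m, replace=False)]  # anchor selection
    Kmn = rbf(Z, X)
    B = noise * rbf(Z, Z) + Kmn @ Kmn.T + 1e-8 * np.eye(m)  # jittered
    b = np.linalg.solve(B, Kmn @ y)
    return rbf(Xstar, Z) @ b  # predictive mean of the low-rank model

X = np.linspace(0.0, 1.0, 200)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
for m in (5, 50):  # the rank m controls the cost-accuracy trade-off
    err = np.abs(nystrom_predict(X, y, X, m=m) - y).mean()
    print(m, err)
```

Increasing the rank `m` tightens the approximation at quadratic cost in `m`, which is the trade-off the authors tune when scaling MAP training to larger datasets.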
- Establishes general convergence from shallow BNNs to GPs by relaxing assumptions made in previous work
- Proposes new convex mixture covariance function combining four activation functions
- Develops scalable MAP inference using Nyström approximation with controlled cost-accuracy trade-off
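The convergence in the first bullet can also be checked empirically. The sketch below assumes i.i.d. standard normal priors and 1/sqrt(width) output scaling for a ReLU network (the paper's prior and parameterization may differ): as the hidden width grows, the Monte Carlo covariance of the network outputs at two inputs approaches the closed-form limiting GP kernel.

```python
import numpy as np

def relu_nngp(x, y):
    # Limiting covariance of a wide single-hidden-layer ReLU network with
    # N(0, 1) weights (arc-cosine kernel of degree 1).
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    t = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny * (np.sin(t) + (np.pi - t) * np.cos(t)) / (2 * np.pi)

rng = np.random.default_rng(1)
x1, x2 = np.array([1.0, 0.5]), np.array([-0.3, 0.8])  # illustrative inputs
for width in (10, 100, 1000):
    # Draw 5000 networks f(x) = sum_j v_j relu(w_j . x) / sqrt(width).
    W = rng.normal(size=(5000, width, 2))
    V = rng.normal(size=(5000, width))
    f1 = (V * np.maximum(W @ x1, 0.0)).sum(axis=1) / np.sqrt(width)
    f2 = (V * np.maximum(W @ x2, 0.0)).sum(axis=1) / np.sqrt(width)
    # The sample covariance across networks approaches the GP kernel value.
    print(width, np.cov(f1, f2)[0, 1], relu_nngp(x1, x2))
```

The printed sample covariances fluctuate with Monte Carlo noise but track the kernel value; the paper's contribution is proving this kind of limit under weaker assumptions and for its mixture covariance.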
Why It Matters
Provides theoretical foundation for BNN-GP connections while offering practical, scalable inference methods for real-world applications.