Robust Automatic Differentiation of Square-Root Kalman Filters via Gramian Differentials
New math solves undefined gradients in state-space models, enabling stable AI parameter learning.
Researcher Adrien Corenflos has published a technical paper titled 'Robust Automatic Differentiation of Square-Root Kalman Filters via Gramian Differentials' that fixes a critical mathematical bug in gradient computation for state-space models. Square-root Kalman filters are widely used for numerical stability in systems like robotics navigation and financial forecasting, but their core triangularization operation (via QR decomposition) creates two problems when differentiated: the semi-orthogonal factor is non-unique, so its gradient is undefined, and standard QR Jacobian formulas involve matrix inverses that diverge on rank-deficient inputs.
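The non-uniqueness is easy to see numerically. Below is a minimal sketch (not from the paper): flipping signs on rows of the triangular factor yields an equally valid QR factorization, so "the" triangular factor is ambiguous, yet the Gramian MM⊤ it encodes is identical.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 5))  # a wide stacked square-root factor, as in a predict step

# Triangularization via thin QR of M^T:  M^T = Q @ R,  so  M @ M.T = R.T @ R.
Q, R = np.linalg.qr(M.T)
S = np.diag([1.0, -1.0, 1.0])    # arbitrary sign flips on R's rows; S @ S = I
Q2, R2 = Q @ S, S @ R            # an equally valid semi-orthogonal / triangular pair

assert np.allclose(Q2 @ R2, M.T)         # both pairs factor M^T exactly
assert not np.allclose(R, R2)            # the triangular factor is not unique...
assert np.allclose(R.T @ R, R2.T @ R2)   # ...but the Gramian M @ M.T is invariant
```

An autodiff engine tracing through `np.linalg.qr` has no principled way to pick between `R` and `R2`, which is why quantities defined through the Gramian are the natural objects to differentiate.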
Corenflos's key observation is that all filter outputs relevant to parameter learning depend only on the Gramian matrix (MM⊤), which remains smooth even when the triangularization isn't. He derives a closed-form chain rule directly from this Gramian identity and proves it exact for key learning objectives such as the Kalman log-marginal likelihood. The solution handles rank-deficient cases through a two-component decomposition: a column-space term using the Moore-Penrose pseudoinverse, and a null-space correction for perturbations outside M's column space.
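To make the Gramian chain-rule idea concrete, here is a sketch of the standard full-rank case (the rank-deficient pseudoinverse and null-space terms are the paper's contribution and are not reproduced here). A perturbation dM of the factor induces the Gramian differential d(MM⊤) = dM M⊤ + M dM⊤, and the well-known Cholesky differential then gives the change in the triangular factor in closed form:

```python
import numpy as np

def phi(A):
    """Lower-triangular half of A with the diagonal halved (Cholesky-differential map)."""
    return np.tril(A, -1) + 0.5 * np.diag(np.diag(A))

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 5))
dM = 1e-6 * rng.standard_normal((3, 5))  # small perturbation of the factor

G = M @ M.T
dG = dM @ M.T + M @ dM.T                 # Gramian differential: d(M M^T)
L = np.linalg.cholesky(G)                # triangular factor with L @ L.T == G
Linv = np.linalg.inv(L)

# Full-rank chain rule through the Gramian: dL = L * phi(L^{-1} dG L^{-T})
dL = L @ phi(Linv @ dG @ Linv.T)

# First-order check against the actually re-triangularized perturbed factor
L2 = np.linalg.cholesky((M + dM) @ (M + dM).T)
assert np.allclose(L2 - L, dL, atol=1e-9)
```

Note that the formula never differentiates the QR factorization itself: only G and its Cholesky factor appear, which is what makes the gradient well defined whenever G is invertible. The paper's extension replaces the plain inverse with pseudoinverse and null-space terms so the rule survives rank deficiency.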
This 4-page paper documents the mathematics behind a practical bug fix enabling reliable automatic differentiation of square-root Kalman implementations. The approach eliminates the numerical instability that previously hampered gradient-based parameter optimization in state-space models, making these systems trainable with modern machine learning frameworks. The work bridges control theory, signal processing, and machine learning by providing robust differentiation tools for time-series models.
- Solves undefined gradients in square-root Kalman filters by deriving chain rule from Gramian identity
- Handles rank-deficient matrices via Moore-Penrose pseudoinverse and null-space correction
- Enables stable gradient-based parameter learning for state-space models used in forecasting and control
Why It Matters
Enables reliable AI training for time-series models in robotics, finance, and signal processing without numerical instability.