BGM-IV: A Bayesian generative model for nonlinear causal inference

A latent-variable Bayesian method for instrumental variable regression reports the best results among compared methods when covariates are high-dimensional.

Deep Dive

Instrumental variable (IV) regression is a standard tool for causal estimation under endogeneity, where the treatment is correlated with unobserved factors that also affect the outcome. Modern problems add nonlinear effects and many covariates, and existing methods often struggle when the covariates are high-dimensional. BGM-IV, proposed by Guyue Luo and Qiao Liu, takes a different approach: it models the entire generative process in a causally structured latent space. It infers four distinct latent components that capture shared confounding, outcome-specific variation, treatment-specific variation, and covariate-only nuisance variation. To handle endogeneity, it replaces the standard confounded outcome likelihood with a pseudo-likelihood that averages over instrument-induced treatment values.
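
To make the pseudo-likelihood idea concrete, here is a minimal Python sketch of one plausible reading: the outcome likelihood is averaged by Monte Carlo over treatment values drawn from a treatment model that depends only on the instrument. The Gaussian noise models, the function names (`pseudo_log_likelihood`, `outcome_fn`, `treatment_mean_fn`), and the parameters `sigma_t` and `sigma_y` are assumptions made for illustration, not the authors' implementation. Covariates and latent variables are omitted for brevity.

```python
import math
import random


def log_normal_pdf(y, mean, sigma):
    """Log density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2.0 * math.pi * sigma * sigma) - 0.5 * ((y - mean) / sigma) ** 2


def pseudo_log_likelihood(y, z, outcome_fn, treatment_mean_fn,
                          sigma_t, sigma_y, n_samples=256, seed=0):
    """Monte Carlo estimate of log E_{t ~ p(t | z)}[ p(y | t) ].

    Assumed (hypothetical) model for this sketch:
      t | z ~ Normal(treatment_mean_fn(z), sigma_t^2)
      y | t ~ Normal(outcome_fn(t), sigma_y^2)
    """
    rng = random.Random(seed)
    center = treatment_mean_fn(z)
    log_terms = []
    for _ in range(n_samples):
        # Treatment value induced by the instrument, not the observed treatment.
        t = center + sigma_t * rng.gauss(0.0, 1.0)
        log_terms.append(log_normal_pdf(y, outcome_fn(t), sigma_y))
    # Log-mean-exp for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(v - m) for v in log_terms) / n_samples)


# Example with a nonlinear outcome function and a linear first stage.
value = pseudo_log_likelihood(
    y=0.3, z=1.0,
    outcome_fn=math.sin,
    treatment_mean_fn=lambda z: 0.5 * z,
    sigma_t=0.7, sigma_y=1.0,
)
```

In this sketch the sampled treatments depend on the instrument alone, so the observed treatment never enters the outcome term directly.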

On benchmarks, BGM-IV matches existing methods in low-dimensional scenarios and achieves the best performance among the compared methods in high-dimensional covariate regimes. The results suggest that structured latent generative modeling is a principled and effective strategy for nonlinear IV estimation with rich covariates. The code is open source, which should ease adoption in fields such as econometrics and epidemiology, and in other domains that need robust causal inference from complex observational data.

Key Points
  • Reframes nonlinear IV regression as posterior inference in a causally structured latent space.
  • Separates latent components into confounding, outcome, treatment, and covariate nuisance to better model endogeneity.
  • Matches existing methods in low-dimensional settings and achieves the best benchmark performance in high-dimensional covariate regimes.
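
The latent separation in the second point can be sketched as a partition of one latent vector into four named blocks, each feeding only the parts of the model it is meant to explain. The block order, the sizes, and the wiring of blocks to the treatment, outcome, and covariate models below are assumptions inferred from the component names, and all identifiers (`LatentBlocks`, `split_latent`, and the `*_inputs` helpers) are hypothetical rather than taken from the released code.

```python
from typing import List, NamedTuple, Sequence


class LatentBlocks(NamedTuple):
    confounding: List[float]  # shared by treatment and outcome
    outcome: List[float]      # outcome-specific variation
    treatment: List[float]    # treatment-specific variation
    nuisance: List[float]     # affects covariates only


def split_latent(z: Sequence[float], dims: Sequence[int]) -> LatentBlocks:
    """Split a flat latent vector into four consecutive blocks of sizes `dims`."""
    if len(dims) != 4 or len(z) != sum(dims):
        raise ValueError("expected four block sizes that sum to len(z)")
    blocks, start = [], 0
    for d in dims:
        blocks.append(list(z[start:start + d]))
        start += d
    return LatentBlocks(*blocks)


def treatment_inputs(blocks: LatentBlocks, instrument: Sequence[float]) -> List[float]:
    # Assumed wiring: treatment sees the instrument plus confounding and treatment blocks.
    return list(instrument) + blocks.confounding + blocks.treatment


def outcome_inputs(blocks: LatentBlocks, treatment: float) -> List[float]:
    # Assumed wiring: outcome sees the treatment plus confounding and outcome blocks.
    return [treatment] + blocks.confounding + blocks.outcome


def covariate_inputs(blocks: LatentBlocks) -> List[float]:
    # Assumed wiring: covariates are generated from all four blocks.
    return blocks.confounding + blocks.outcome + blocks.treatment + blocks.nuisance


blocks = split_latent([float(i) for i in range(10)], dims=(3, 2, 2, 3))
```

Keeping the nuisance block out of both the treatment and outcome inputs is what lets high-dimensional covariates be modeled without that variation leaking into the causal estimate, at least under the wiring assumed here.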

Why It Matters

BGM-IV could enable more accurate causal estimates in high-dimensional, nonlinear settings, with applications in econometrics and AI-driven decision-making.