DDE Copula: An Interpretable Deep Generative Model with Identifiable Latent Structures
New DDE Copula model tackles black-box complexity with provable identifiability and adaptive layer widths.
Deep generative models are powerful but often act as black boxes: their latent structure is hard to interpret and their parameters are frequently unidentifiable. A new paper from Joseph Feldman and Yuqi Gu proposes the Deep Discrete Encoder (DDE) Copula, which addresses both issues. The model nests a hierarchical directed network of binary latent variables inside a copula framework, enabling flexible dependence modeling for mixed discrete and continuous data. Because estimation uses rank likelihoods, which depend on the data only through the ordering of observations within each variable, inference on the DDE parameters is decoupled from marginal modeling and the marginal distributions never need to be specified explicitly. The authors prove conditions for parameter identifiability and establish quotient-space posterior consistency under exact rank likelihoods for continuous margins, with concentration guarantees for tied/mixed margins via a generalized likelihood approach.
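To make the rank idea concrete, here is a minimal sketch rather than the authors' implementation. It assumes a Gaussian latent scale and a toy two-layer binary network with randomly drawn weights; the names `sample_latent_scale`, `w_top`, and `w_bottom`, the layer sizes, and the two marginal transforms are all hypothetical. It shows that monotone marginal transforms leave the ordering within each variable unchanged, which is why a rank-based likelihood can ignore the margins.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_latent_scale(n, rng, k_top=2, k_bottom=3, p=4):
    """Draw n rows of copula-scale variables from a toy two-layer binary network."""
    # Hypothetical weights, drawn at random purely for illustration.
    w_top = [[rng.gauss(0, 1.5) for _ in range(k_top)] for _ in range(k_bottom)]
    w_bottom = [[rng.gauss(0, 1.0) for _ in range(k_bottom)] for _ in range(p)]
    rows = []
    for _ in range(n):
        # Top-layer binary latents, then bottom-layer binaries conditioned on them.
        top = [int(rng.random() < 0.5) for _ in range(k_top)]
        bottom = [
            int(rng.random() < sigmoid(sum(w * a for w, a in zip(w_top[k], top))))
            for k in range(k_bottom)
        ]
        # Assumed Gaussian latent scale: linear in the bottom layer plus unit noise.
        z = [
            sum(w * a for w, a in zip(w_bottom[j], bottom)) + rng.gauss(0, 1)
            for j in range(p)
        ]
        rows.append(z)
    return rows


def ranks(values):
    """Rank of each entry (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for r, i in enumerate(order):
        out[i] = r
    return out


rng = random.Random(0)
z = sample_latent_scale(200, rng)
z0 = [row[0] for row in z]
z1 = [row[1] for row in z]

# Two arbitrary monotone margins: a skewed continuous one and a 3-level ordinal one.
y_continuous = [math.exp(v) for v in z0]
y_ordinal = [sum(v > cut for cut in (-0.5, 0.5)) for v in z1]

# Continuous margin: ranks of the observed data match ranks on the latent scale.
same_ranks = ranks(y_continuous) == ranks(z0)
# Ordinal margin: ties appear, but strict order in y still implies strict order in z.
order_kept = all(
    z1[i] < z1[k]
    for i in range(len(z1))
    for k in range(len(z1))
    if y_ordinal[i] < y_ordinal[k]
)
print(same_ranks, order_kept)
```

The ordinal column illustrates the tied case: the observed data only partially order the latent values, which is the setting the summary describes as tied/mixed margins.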
For computation, the DDE Copula uses a stochastic expectation-maximization algorithm for maximum a posteriori estimation, paired with initialization strategies to improve convergence. A key innovation is the use of Bayesian rank-selection priors to adaptively learn the number of latent nodes per layer, so that layer widths are determined from the data rather than tuned by hand. Simulations demonstrate strong finite-sample performance, outperforming standard copula models on synthetic data. In a real-world application to personality survey data, the DDE Copula recovers interpretable hierarchical latent structures, such as groupings of traits like extraversion and openness, that align with psychological theory. This work bridges the gap between deep generative modeling and statistical interpretability, offering a principled tool for dependence analysis in high-dimensional data.
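As a rough illustration of what stochastic EM for MAP estimation looks like, the sketch below fits a much simpler model than the DDE Copula: one binary latent variable per observation and a Gaussian observation with unit variance. It is a toy analogue under stated assumptions, not the paper's algorithm. The function `stochastic_em`, the Beta(2, 2) prior on the mixing weight, the min/max initialization, and the burn-in averaging are all hypothetical choices, and rank-selection priors are not shown.

```python
import math
import random


def normal_pdf(x, mean):
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2 * math.pi)


def stochastic_em(x, rng, n_iter=200, burn_in=100):
    """Toy stochastic EM for x_i ~ N(mu[a_i], 1), binary latent a_i ~ Bernoulli(pi)."""
    mu = [min(x), max(x)]  # simple spread-out initialization (an assumption)
    pi = 0.5
    trace = []
    for it in range(n_iter):
        # Stochastic E-step: sample each binary latent from its conditional posterior.
        a = []
        for xi in x:
            p1 = pi * normal_pdf(xi, mu[1])
            p0 = (1 - pi) * normal_pdf(xi, mu[0])
            a.append(int(rng.random() < p1 / (p0 + p1)))
        # M-step: complete-data MAP updates (Beta(2, 2) prior on pi, flat prior on mu).
        n1 = sum(a)
        n0 = len(x) - n1
        pi = (n1 + 1) / (len(x) + 2)
        if n0 > 0:
            mu[0] = sum(xi for xi, ai in zip(x, a) if ai == 0) / n0
        if n1 > 0:
            mu[1] = sum(xi for xi, ai in zip(x, a) if ai == 1) / n1
        if it >= burn_in:
            trace.append((pi, mu[0], mu[1]))
    # Average the post-burn-in iterates to damp the sampling noise.
    m = len(trace)
    return tuple(sum(t[k] for t in trace) / m for k in range(3))


# Simulated data: 30% of points centred at 2.0, the rest at -2.0.
rng = random.Random(1)
x = [rng.gauss(2.0 if rng.random() < 0.3 else -2.0, 1.0) for _ in range(500)]
pi_hat, mu0_hat, mu1_hat = stochastic_em(x, rng)
print(round(pi_hat, 2), round(mu0_hat, 2), round(mu1_hat, 2))
```

Sampling the latent variables instead of averaging over them keeps each iteration cheap when the latent space is large, which is presumably part of the appeal for a multi-layer binary network; the initialization matters because a poor start can leave the sampler in a bad mode.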
- DDE Copula uses a hierarchical network of binary latent variables within a copula framework for flexible dependence modeling.
- Estimation via rank likelihoods avoids specifying marginals; the authors prove conditions for identifiability and establish posterior consistency.
- Bayesian rank-selection priors automatically adapt layer widths, eliminating manual tuning of the number of latent nodes per layer.
Why It Matters
Makes deep generative models interpretable and identifiable for multivariate analysis in psychology, finance, and bioinformatics.