SAHMM-VAE: A Source-Wise Adaptive Hidden Markov Prior Variational Autoencoder for Unsupervised Blind Source Separation
New VAE framework embeds source separation directly into variational learning, eliminating post-processing steps.
Researcher Yuan-Hao Wei has introduced SAHMM-VAE, a framework for unsupervised blind source separation that rethinks how variational autoencoders (VAEs) handle mixed signals. The core innovation is replacing the standard single latent prior with source-wise adaptive Hidden Markov Model (HMM) priors: each latent dimension gets its own regime-switching prior, so different dimensions are pulled toward distinct, source-specific temporal structures during training. Source separation is therefore not an external post-processing step but is embedded directly into the variational learning objective itself.
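To make the per-dimension prior concrete, here is a minimal NumPy sketch of a Gaussian-emission HMM log-prior evaluated on one latent dimension's trajectory via the forward algorithm. All shapes, parameter values, and function names are illustrative assumptions, not the paper's implementation; in SAHMM-VAE each latent dimension would carry its own such parameter set, learned jointly with the VAE.

```python
import numpy as np

def logsumexp(x, axis=None):
    """Numerically stable log-sum-exp."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True)), axis=axis)

def hmm_log_prior(z, log_pi, log_A, means, log_vars):
    """Forward-algorithm log p(z_1:T) under a K-state Gaussian-emission HMM.

    z: (T,) latent trajectory for ONE latent dimension.
    log_pi: (K,) log initial state probabilities.
    log_A: (K, K) log transition matrix.
    means, log_vars: (K,) per-state Gaussian emission parameters.
    """
    # Per-step Gaussian emission log-likelihoods, shape (T, K).
    log_emit = -0.5 * (np.log(2 * np.pi) + log_vars
                       + (z[:, None] - means) ** 2 / np.exp(log_vars))
    alpha = log_pi + log_emit[0]                      # forward message at t = 1
    for t in range(1, len(z)):
        # Marginalize the previous state, then add the current emission.
        alpha = log_emit[t] + logsumexp(alpha[:, None] + log_A, axis=0)
    return logsumexp(alpha)                           # log p(z_1:T)

# Each latent dimension d gets its OWN HMM prior (hypothetical 2-state values).
rng = np.random.default_rng(0)
T, D, K = 50, 2, 2
z = rng.standard_normal((T, D))
priors = [dict(log_pi=np.log([0.5, 0.5]),
               log_A=np.log([[0.9, 0.1], [0.1, 0.9]]),
               means=np.array([-1.0, 1.0]),
               log_vars=np.zeros(K)) for _ in range(D)]
prior_term = sum(hmm_log_prior(z[:, d], **priors[d]) for d in range(D))
```

Because each dimension is scored by a different HMM, the gradient of `prior_term` pulls each latent trajectory toward its own regime-switching temporal statistics, which is the separation pressure the article describes.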
The framework jointly optimizes the encoder, the decoder, the posterior parameters, and the source-wise prior parameters. The encoder learns to invert the mixing transformation, while the decoder acts as the generative mixing model. Under this coupled optimization, aligning the posterior source trajectories with the heterogeneous HMM priors becomes the mechanism that drives separation. Wei instantiated the idea in three model branches: a Gaussian-emission HMM prior, a Markov-switching autoregressive HMM prior, and an HMM state-flow prior with state-wise autoregressive flow transformations.
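One way to picture the coupled objective is a single-sample Monte Carlo ELBO in which the usual standard-normal prior term is replaced by a sum of per-dimension HMM log-priors. The sketch below is a hypothetical reconstruction under simplifying assumptions (diagonal-Gaussian posterior, unit-variance observation model); `encode`, `decode`, and `hmm_log_priors` are placeholder names, not the paper's API.

```python
import numpy as np

def elbo_estimate(x, encode, decode, hmm_log_priors, rng):
    """Single-sample Monte Carlo ELBO with source-wise priors.

    encode(x) -> (mu, log_var): amortized Gaussian posterior over a (T, D) latent path.
    decode(z) -> x_hat: the generative "mixing" model.
    hmm_log_priors: D callables, one log-density per latent dimension.
    """
    mu, log_var = encode(x)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps              # reparameterized sample
    # Reconstruction under a unit-variance Gaussian observation model (assumption).
    log_px_z = -0.5 * np.sum((x - decode(z)) ** 2 + np.log(2 * np.pi))
    # log q(z|x) at the sampled z.
    log_qz = -0.5 * np.sum(np.log(2 * np.pi) + log_var + eps ** 2)
    # Source-wise prior: each latent dimension is scored by ITS OWN model.
    log_pz = sum(lp(z[:, d]) for d, lp in enumerate(hmm_log_priors))
    return log_px_z + log_pz - log_qz

# Toy usage: identity encoder/decoder and standard-normal stand-ins for the
# per-dimension priors (in the paper each would be a learned HMM).
rng = np.random.default_rng(1)
x = rng.standard_normal((20, 2))                      # T=20 steps, D=2 sources
encode = lambda x: (x, np.full_like(x, -2.0))
decode = lambda z: z
priors = [lambda zd: -0.5 * np.sum(np.log(2 * np.pi) + zd ** 2)] * 2
elbo = elbo_estimate(x, encode, decode, priors, rng)
```

Because `log_pz` is a sum of heterogeneous per-dimension terms, maximizing this objective updates the encoder, decoder, and prior parameters together, which is the coupled optimization the paragraph describes.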
Experiments demonstrate that SAHMM-VAE can achieve unsupervised source recovery while also learning meaningful source-wise switching structures. This extends the structured-prior VAE lineage into adaptive switching priors and provides a foundation for more interpretable and potentially identifiable latent source modeling. The approach could be applied to separate overlapping audio streams, disentangle mixed biological signals, or isolate components in complex sensor data, all without requiring supervised training data.
- Replaces single VAE latent prior with source-wise adaptive HMM priors for each dimension
- Embeds source separation directly into variational learning, eliminating post-processing
- Demonstrates three model branches: Gaussian HMM, Markov-switching AR-HMM, and HMM state-flow prior
Why It Matters
Enables separation of mixed signals (audio, bio-data) without labeled examples, advancing interpretable AI.