LMPAN: 480K-parameter model performs comparably to DeepVQE-S for echo cancellation
New lightweight model delivers real-time full-duplex acoustic echo cancellation with just 480K parameters.
LMPAN (Lightweight Multi-Path Alignment Network) addresses joint full-duplex acoustic echo cancellation (AEC) and noise suppression in on-device spoken dialogue systems. Built by Chengwei Liu and colleagues, the model targets hardware-induced distortions and dynamic acoustic environments with three core components: a multi-path alignment stage that corrects temporal and energy mismatches across the reference, linear AEC output, and microphone signals; an attention-based mechanism that dynamically integrates the enhanced features; and a post-filtering module with dynamic target generation for downstream tasks such as ASR and VAD. The model is trained with a two-stage framework that uses self-supervised learning representations to improve perceptual quality, with the aim of keeping the output natural-sounding in challenging conditions.
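To make the attention-based integration step more concrete, here is a minimal NumPy sketch of how per-frame attention weights can blend several feature paths into one. It is an illustration only, not the authors' architecture: the function names, the scaled dot-product scoring, the `query` vector, and the tensor shapes are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_fuse(paths, query):
    """Fuse P feature paths of shape (P, T, D) into one (T, D) array.

    Hypothetical illustration: each path gets a per-frame score from a
    scaled dot product with `query` (shape (D,)). Scores are normalized
    across paths, so the weights at every frame sum to 1.
    """
    paths = np.asarray(paths, dtype=float)
    d = paths.shape[-1]
    scores = paths @ query / np.sqrt(d)                  # (P, T)
    weights = softmax(scores, axis=0)                    # (P, T)
    fused = np.sum(weights[..., None] * paths, axis=0)   # (T, D)
    return fused, weights

rng = np.random.default_rng(0)
# Stand-in features for three paths (reference, linear AEC output,
# microphone): 5 frames with 8 dimensions each. Random data, not real audio.
paths = rng.standard_normal((3, 5, 8))
query = rng.standard_normal(8)  # stand-in for a learned parameter
fused, weights = attention_fuse(paths, query)
```

Because the weights are recomputed for every frame, the blend can lean on a different path from one frame to the next, which is the sense in which such integration is dynamic.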
LMPAN achieves performance comparable to the state-of-the-art lightweight model DeepVQE-S while using only 480K parameters and 126 MACs (multiply-accumulate operations). This small footprint is what makes real-time inference feasible on resource-constrained devices. Accepted at Interspeech 2026, the paper shows that full-duplex speech processing can be both lightweight and effective, which suits it to smart speakers, hearing aids, and voice assistants. By canceling echo and suppressing noise in a single model without sacrificing responsiveness, the work supports more natural hands-free communication.
- Only 480K parameters and 126 MACs, with performance comparable to DeepVQE-S.
- Multi-path alignment stage corrects temporal and energy mismatches across the reference, linear AEC output, and microphone signals.
- Two-stage training with self-supervised learning boosts perceptual quality.
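As a rough picture of what correcting temporal and energy mismatches means, the sketch below aligns a reference signal to a microphone signal using classical signal processing: a cross-correlation peak for the delay and an RMS ratio for the gain. This is a simplified stand-in, not LMPAN's alignment stage, which is part of a neural network and handles three signals. The function names, the `max_lag` search range, and the synthetic test signal are assumptions made for this example.

```python
import numpy as np

def estimate_delay(reference, target, max_lag):
    """Return the lag (in samples) by which `target` trails `reference`,
    taken as the cross-correlation peak over lags 0..max_lag."""
    n = len(reference)
    best_lag, best_score = 0, -np.inf
    for lag in range(max_lag + 1):
        score = np.dot(reference[: n - lag], target[lag:n])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def align(reference, target, max_lag=64, eps=1e-12):
    """Delay and rescale `reference` so it matches `target` in time and RMS energy."""
    lag = estimate_delay(reference, target, max_lag)
    # Temporal correction: shift the reference later by `lag` samples.
    shifted = np.concatenate([np.zeros(lag), reference[: len(reference) - lag]])
    # Energy correction: scale so both signals have the same RMS level.
    gain = np.sqrt((np.mean(target ** 2) + eps) / (np.mean(shifted ** 2) + eps))
    return shifted * gain, lag, gain

rng = np.random.default_rng(0)
far_end = rng.standard_normal(2000)   # synthetic reference signal
true_lag, true_gain = 23, 0.3         # synthetic echo path: pure delay and attenuation
mic = true_gain * np.concatenate([np.zeros(true_lag), far_end[:-true_lag]])
aligned, lag, gain = align(far_end, mic)
```

In this idealized setup the echo path is a pure delay and attenuation, so the search recovers both exactly. Real devices add nonlinear distortion and time-varying delay, which is the kind of hardware-induced mismatch the article says LMPAN is built to handle.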
Why It Matters
Enables real-time echo cancellation and noise suppression on low-power devices, improving voice assistants and smart speakers.