Speech-preserving active noise control: a deep learning approach in reverberant environments
A new deep learning system tackles the classic ANC problem of accidentally silencing the speaker you want to hear.
Researcher Shuning Dai has introduced a deep learning approach to a classic audio engineering problem: Active Noise Control (ANC). Traditional ANC systems, typically based on the filtered-x least mean squares (FxLMS) algorithm, rely on linear adaptive filters. They struggle with non-linear, real-world acoustic environments and often cancel desired speech along with the noise. Dai's proposed system addresses this with an end-to-end control architecture centered on a Convolutional Recurrent Network (CRN). The design uses Long Short-Term Memory (LSTM) layers to model the temporal dynamics of sound and employs complex spectrum mapping to handle non-linear distortions, moving beyond the linear assumptions of FxLMS.
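For readers unfamiliar with the baseline, the following is a minimal single-channel FxLMS sketch in NumPy. It is not taken from Dai's work: the primary path `p`, the secondary path `s`, the filter length, and the step size `mu` are all illustrative assumptions, and the secondary-path estimate is assumed to be perfect.

```python
import numpy as np

def _push(buf, value):
    """Shift a delay line by one sample and store the newest value first."""
    buf[1:] = buf[:-1].copy()
    buf[0] = value

def fxlms(x, d, s, s_hat, n_taps=16, mu=0.005):
    """Single-channel FxLMS.

    x: reference noise, d: noise arriving at the error microphone,
    s: true secondary path (FIR), s_hat: its estimate.
    Returns the residual signal measured at the error microphone.
    """
    w = np.zeros(n_taps)            # adaptive control filter
    x_buf = np.zeros(n_taps)        # recent reference samples
    y_buf = np.zeros(len(s))        # recent anti-noise samples
    xs_buf = np.zeros(len(s_hat))   # reference samples feeding s_hat
    fx_buf = np.zeros(n_taps)       # recent filtered-reference samples
    e = np.zeros(len(x))
    for n in range(len(x)):
        _push(x_buf, x[n])
        _push(y_buf, w @ x_buf)             # controller output (anti-noise)
        e[n] = d[n] + s @ y_buf             # noise plus anti-noise at the mic
        _push(xs_buf, x[n])
        _push(fx_buf, s_hat @ xs_buf)       # reference filtered by s_hat
        w -= mu * e[n] * fx_buf             # LMS step on the filtered reference
    return e

# Illustrative setup: white noise through assumed FIR acoustic paths.
rng = np.random.default_rng(0)
x = rng.standard_normal(20000)
p = np.array([0.0, 0.0, 0.8, 0.4, 0.2])    # hypothetical primary path
s = np.array([0.0, 1.0, 0.5])              # hypothetical secondary path
d = np.convolve(x, p)[:len(x)]
e = fxlms(x, d, s, s_hat=s)
reduction_db = 10 * np.log10(np.mean(e[-2000:] ** 2) / np.mean(d[-2000:] ** 2))
print(f"residual vs. uncontrolled noise: {reduction_db:.1f} dB")
```

The update is linear in the reference signal and minimizes the total energy at the error microphone. That is why, in this form, it has no way to model non-linear distortion or to tell target speech apart from noise.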
A key innovation is a specialized voice retention loss function. It guides the model to suppress environmental noise selectively while identifying and preserving the spectral characteristics of a target speaker's voice. To test the system under realistic conditions, the research used the Image Source Method (ISM) to build an acoustic simulation that includes reverberation. Experimental results show the Deep ANC system achieves significantly better noise reduction than traditional methods, particularly for difficult non-stationary noises such as crowd babble. Evaluations using two standard metrics, PESQ (Perceptual Evaluation of Speech Quality) and STOI (Short-Time Objective Intelligibility), confirm that the system maintains the clarity and naturalness of the preserved speech.
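To illustrate how the Image Source Method produces reverberation, here is a simplified NumPy sketch for a rectangular room. It follows the textbook image lattice with one reflection coefficient shared by all walls and delays rounded to the nearest sample. The room size, positions, sample rate, and `beta` are assumed values for illustration, not the configuration used in the research.

```python
import itertools
import numpy as np

def shoebox_rir(room, src, mic, fs=16000, beta=0.8, max_order=3,
                c=343.0, length=4096):
    """Room impulse response of a rectangular room via image sources.

    room, src, mic: (x, y, z) in metres. beta: reflection coefficient
    shared by all six walls. max_order: extent of the image lattice per axis.
    """
    room, src, mic = (np.asarray(v, dtype=float) for v in (room, src, mic))
    h = np.zeros(length)
    lattice = range(-max_order, max_order + 1)
    for n in itertools.product(lattice, repeat=3):       # room-sized offsets
        for q in itertools.product((0, 1), repeat=3):    # mirrored per axis?
            n_arr, q_arr = np.array(n), np.array(q)
            image = (1 - 2 * q_arr) * src + 2 * n_arr * room
            dist = np.linalg.norm(image - mic)
            n_reflections = int(np.sum(np.abs(n_arr - q_arr) + np.abs(n_arr)))
            delay = int(round(dist / c * fs))            # nearest-sample delay
            if delay < length:
                # Each wall bounce scales by beta; 1/(4*pi*r) is spherical spreading.
                h[delay] += beta ** n_reflections / (4 * np.pi * dist)
    return h

# Hypothetical 5 m x 4 m x 3 m room.
rir = shoebox_rir(room=(5, 4, 3), src=(1, 1, 1), mic=(3, 2, 1.5))
```

Convolving a dry noise or speech recording with `rir` (for example with `np.convolve`) yields the reverberant signal a microphone at `mic` would pick up in this idealized room.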
- Uses a Convolutional Recurrent Network (CRN) with LSTM layers for end-to-end control of complex acoustic signals.
- Introduces a specialized voice retention loss function to selectively preserve target speech while suppressing noise.
- Outperforms traditional FxLMS algorithms, especially on non-stationary noise, and maintains speech quality per PESQ/STOI metrics.
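This summary does not give the exact form of the voice retention loss, so the sketch below is only one plausible shape for it, stated as an assumption: the residual signal at the error microphone is compared with the clean target speech in the complex STFT domain, with an added magnitude term. The function names, the `mag_weight` parameter, and the test signals are hypothetical.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def voice_retention_loss(residual, target_speech, mag_weight=0.5):
    """Hypothetical loss: zero only when the residual equals the target speech."""
    E, S = stft(residual), stft(target_speech)
    complex_term = np.mean(np.abs(E - S) ** 2)               # real and imaginary error
    magnitude_term = np.mean((np.abs(E) - np.abs(S)) ** 2)   # spectral envelope error
    return complex_term + mag_weight * magnitude_term

# Stand-in signals: an amplitude-modulated tone as "speech", white noise as noise.
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 220 * t) * 0.5 * (1 + np.sin(2 * np.pi * 3 * t))
noise = 0.5 * np.random.default_rng(0).standard_normal(fs)

print(voice_retention_loss(speech, speech))               # speech kept, noise removed
print(voice_retention_loss(speech + noise, speech))       # noise left in
print(voice_retention_loss(np.zeros(fs), speech))         # speech cancelled too
```

Under this assumed formulation, both leftover noise and cancelled speech raise the loss, which is the selective behavior the key points describe. A trainable version would compute the same quantities in a differentiable framework rather than NumPy.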
Why It Matters
This research could lead to smarter noise-cancelling headphones and conferencing systems that protect conversations in noisy places like cafes or open offices.