Audio & Speech

Deep Learning Boosts DOA Estimation in Binaural Hearing Aids by Up to 14%

A CRNN model fuses speaker-count information to estimate sound direction more reliably in noisy, multi-speaker rooms.

Deep Dive

Researchers from the University of Ottawa and GN Hearing have posted a paper on arXiv (2509.21382) that uses deep learning to improve direction-of-arrival (DOA) estimation in binaural hearing aids. Their convolutional recurrent neural network (CRNN) takes spectral phase differences and magnitude ratios between microphone signals as input, and the authors extend it with source-count information through late fusion, meaning the count is combined with the network's learned representation rather than with its raw input. Supplying the estimated number of active speakers (0, 1, or 2+) as an auxiliary feature yielded up to 14% higher average F1-scores than the baseline CRNN on real-world binaural recordings.
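To make the two ingredients above concrete, here is a minimal NumPy sketch of inter-channel phase-difference and magnitude-ratio features, followed by a late-fusion output layer. It is an illustration under stated assumptions, not the authors' implementation: the STFT settings, the cos/sin phase encoding, the dB scaling, the point at which the count is concatenated, and the names `binaural_features`, `one_hot_count`, and `late_fusion_head` are all hypothetical.

```python
import numpy as np

NUM_COUNT_CLASSES = 3  # 0, 1, or 2+ active speakers, as in the article


def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT; returns a (frames, n_fft // 2 + 1) complex array."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def binaural_features(left, right, eps=1e-8):
    """Stack phase-difference and magnitude-ratio features per time-frequency bin."""
    L, R = stft(left), stft(right)
    ipd = np.angle(L * np.conj(R))  # inter-channel phase difference in radians
    ratio_db = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))
    # Assumption: cos/sin encoding, which avoids the wrap-around jump at +/- pi.
    return np.stack([np.cos(ipd), np.sin(ipd), ratio_db], axis=0)


def one_hot_count(count):
    """Map integer speaker counts to one-hot vectors; 2 or more share the last class."""
    clipped = np.minimum(np.asarray(count), NUM_COUNT_CLASSES - 1)
    return np.eye(NUM_COUNT_CLASSES)[clipped]


def late_fusion_head(embedding, count, W, b):
    """Concatenate the count vector to the network embedding, then classify.

    embedding: (frames, embed_dim) output of a hypothetical CRNN body.
    W, b: weights of a single dense layer, W shaped (embed_dim + 3, num_directions).
    """
    fused = np.concatenate([embedding, one_hot_count(count)], axis=-1)
    logits = fused @ W + b
    return 1.0 / (1.0 + np.exp(-logits))  # per-direction activity probabilities
```

In this sketch the count reaches the model only at the final layer, so the same feature extractor could be reused with or without the auxiliary input. The count passed in could come from a separate estimator or from ground-truth labels.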

The study also explored dual-task training for joint DOA estimation and source counting; it benefited source-count prediction but did not improve DOA performance. The key insight is that a ground-truth (oracle) source count significantly enhances standalone DOA estimation, particularly in the noisy, multi-speaker environments common in everyday hearing aid use. The work, set to appear at IEEE ICASSP 2026, highlights the potential of fusing source-count information for more robust auditory scene analysis in assistive hearing devices.
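For readers unfamiliar with dual-task training, the idea is that one network is optimized on two objectives at once. The sketch below shows one common way to set this up, as a weighted sum of a DOA loss and a source-count loss. The choice of a multi-label binary cross-entropy for DOA, a 3-class cross-entropy for the count, and the `count_weight` parameter are assumptions made for illustration; the article does not describe the losses the paper actually uses.

```python
import numpy as np


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all frames and direction classes."""
    p = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)))


def categorical_cross_entropy(probs, labels, eps=1e-7):
    """Mean cross-entropy for integer class labels (here: speaker-count classes)."""
    picked = probs[np.arange(len(labels)), labels]  # probability of the true class
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))


def dual_task_loss(doa_probs, doa_targets, count_probs, count_labels, count_weight=0.5):
    """Weighted sum of the two task losses; count_weight is a hypothetical knob."""
    doa_loss = binary_cross_entropy(doa_probs, doa_targets)
    count_loss = categorical_cross_entropy(count_probs, count_labels)
    return doa_loss + count_weight * count_loss
```

Setting `count_weight` to zero reduces this to plain DOA training, which is a convenient way to compare single-task and dual-task runs under otherwise identical conditions.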

Key Points
  • CRNN model uses spectral phase differences and magnitude ratios for DOA estimation
  • Late fusion of speaker-count information improves average F1-score by up to 14% over the baseline CRNN
  • Dual-task training did not improve DOA performance, but enhanced source-count prediction

Why It Matters

Better direction-of-arrival estimation in hearing aids means clearer speech understanding in crowded, noisy environments.
