Audio & Speech

Recurrence-Based Nonlinear Vocal Dynamics as Digital Biomarkers for Depression Detection from Conversational Speech

Depression alters how your voice revisits acoustic states over time...

Deep Dive

A new study by Himadri S Samanta, published on arXiv, proposes recurrence-based nonlinear vocal dynamics as digital biomarkers for depression detection. Using the DAIC-WOZ corpus with 142 labeled participants, the author modeled frame-level COVAREP trajectories as nonlinear dynamical systems and extracted recurrence-based biomarkers from 74 vocal channels. Logistic regression with feature selection and stratified cross-validation achieved a mean AUC of 0.689, outperforming static acoustic baselines, entropy-dynamics features, Hurst exponent features, determinism features, and Lyapunov-like instability proxies. Permutation testing confirmed statistical significance at p=0.004, and the pooled cross-validated AUC was 0.665 (95% bootstrap CI: [0.568, 0.758]).
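
To make the significance claim concrete, here is a minimal sketch of a label-permutation test for an AUC. It is a simplification, not the study's procedure: it shuffles labels against a fixed set of scores, whereas a full test would typically rerun the whole cross-validation pipeline for each shuffle. The function names `auc` and `permutation_p_value` are illustrative and do not come from the paper.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_permutations=1000, seed=0):
    """Return (observed AUC, p-value) under random relabeling.

    Assumption: scores are held fixed and only the labels are shuffled.
    """
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # shuffling keeps the class counts unchanged
        if auc(shuffled, scores) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)
```

A small p-value here means that few random relabelings reach the observed AUC, which is the sense in which a modest AUC can still be distinguishable from chance.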

This approach moves beyond conventional static acoustic descriptors by capturing the temporal organization of vocal state trajectories. The key insight is that depression alters how the vocal system revisits acoustic states over time, a subtle but measurable signal. The AUC of 0.689 is modest, and the lower bound of the pooled confidence interval (0.568) sits only slightly above chance, but the result exceeds every baseline feature set tested in the study and suggests that nonlinear state-space analysis could complement traditional diagnostic tools for mental health. The work supports the development of scalable, non-invasive digital screening technologies that could be deployed via telehealth or mobile apps, potentially improving early detection and monitoring of depression in clinical and remote settings.
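
The idea of a system "revisiting acoustic states" can be illustrated with a recurrence matrix, the basic object in recurrence analysis. The sketch below is a generic example, not the paper's feature extractor: this summary does not say which recurrence measures were used, so the recurrence rate shown here, the `radius` threshold, and the toy trajectories are all assumptions chosen for illustration.

```python
import math


def recurrence_matrix(trajectory, radius):
    """R[i][j] = 1 when states i and j lie within `radius` (Euclidean distance).

    `trajectory` is a sequence of equal-length tuples, one per time frame.
    """
    n = len(trajectory)
    return [
        [1 if math.dist(trajectory[i], trajectory[j]) <= radius else 0
         for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(matrix):
    """Fraction of off-diagonal frame pairs that are recurrent."""
    n = len(matrix)
    if n < 2:
        return 0.0
    recurrent = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return recurrent / (n * (n - 1))


# Toy one-channel trajectories (hypothetical data, not COVAREP output):
# a periodic signal returns to earlier states, a steady drift never does.
periodic = [(math.sin(2 * math.pi * t / 8),) for t in range(32)]
drifting = [(0.1 * t,) for t in range(32)]

rate_periodic = recurrence_rate(recurrence_matrix(periodic, radius=0.05))
rate_drifting = recurrence_rate(recurrence_matrix(drifting, radius=0.05))
```

In this toy case the periodic trajectory has a higher recurrence rate than the drifting one. A feature of this kind describes how a channel moves through its state space over time, which a static average of the same channel cannot capture.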

Key Points
  • Recurrence-based biomarkers from 74 vocal channels achieved a mean AUC of 0.689 on 142 participants from the DAIC-WOZ corpus
  • Outperformed static acoustic, entropy-dynamics, Hurst exponent, determinism, and Lyapunov-like baselines; permutation testing gave p=0.004
  • Modeled vocal trajectories as nonlinear dynamical systems, capturing temporal recurrence structure missed by conventional methods

Why It Matters

Could enable scalable, non-invasive depression screening via voice analysis, potentially improving early detection in telehealth and clinical settings.