RRP-Voice Dataset Tracks Rare Laryngeal Disease with Up to 10 Years of Voice Recordings
First longitudinal voice dataset for RRP detection tracks patients for up to a decade.
Researchers from multiple institutions have released RRP-Voice, the first longitudinal dataset for detecting Recurrent Respiratory Papillomatosis (RRP), an HPV-induced laryngeal disease. The dataset comprises voice recordings from 26 patients followed for up to ten years, pairing sustained vowels with sentence-level utterances. Each session is annotated by otolaryngologists and confirmed by laryngoscopy performed at the same time, so the labels for disease recurrence and post-surgical remission rest on direct clinical observation.
The accompanying benchmark evaluates four classes of approach: handcrafted features, end-to-end deep networks, self-supervised pretrained models (e.g., wav2vec 2.0, HuBERT), and recent audio large language models. Under session-level cross-validation with a patient-level audit, the results indicate that the discriminative signal reflects laryngoscopic disease state rather than stable speaker identity. The work lays a foundation for longitudinal pathological-voice tasks on rare diseases in low-resource clinical settings, and for continuous voice monitoring of RRP patients.
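The summary does not spell out the evaluation protocol, so the following is only a minimal sketch of what session-level folds plus a patient-level audit could look like. The function names, the round-robin fold assignment, the toy records, and the "identity baseline" (the accuracy a model would reach by always predicting a patient's most common label) are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def session_level_folds(session_ids, k):
    """Assign sessions to k folds round-robin.

    The session, not the patient, is the unit of splitting, so one
    patient's sessions can land in different folds (an assumption).
    """
    folds = [[] for _ in range(k)]
    for i, sid in enumerate(session_ids):
        folds[i % k].append(sid)
    return folds


def patient_level_audit(records):
    """Summarize predictions per patient.

    records: iterable of (patient_id, true_label, predicted_label),
    with labels 1 = active disease, 0 = remission (hypothetical coding).
    Returns, per patient, the accuracy and an identity-only baseline:
    the fraction of sessions carrying that patient's majority label.
    """
    by_patient = defaultdict(list)
    for pid, y_true, y_pred in records:
        by_patient[pid].append((y_true, y_pred))

    report = {}
    for pid, pairs in by_patient.items():
        n = len(pairs)
        correct = sum(1 for y_true, y_pred in pairs if y_true == y_pred)
        positives = sum(y_true for y_true, _ in pairs)
        report[pid] = {
            "n_sessions": n,
            "accuracy": correct / n,
            "identity_baseline": max(positives, n - positives) / n,
        }
    return report


# Toy, made-up predictions: P01 alternates between states, P02 stays in remission.
records = [
    ("P01", 1, 1), ("P01", 0, 0), ("P01", 1, 1), ("P01", 0, 1),
    ("P02", 0, 0), ("P02", 0, 0),
]
folds = session_level_folds(list(range(len(records))), k=2)
report = patient_level_audit(records)
```

In this toy case, a patient whose accuracy clearly exceeds the identity baseline (as for the hypothetical P01) is one where the predictions follow changes in disease state across sessions rather than who is speaking.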
- First longitudinal voice dataset for RRP, covering 26 patients across up to 10 years of follow-up
- Dataset pairs sustained vowels with sentence-level speech, with disease-state labels confirmed by laryngoscopy
- Benchmark evaluates four model classes: handcrafted features, end-to-end deep networks, self-supervised pretrained models, and audio LLMs
Why It Matters
Enables continuous voice monitoring for RRP, potentially reducing how often patients need invasive laryngoscopy.