Audio & Speech

RRP-Voice Dataset Tracks Rare Laryngeal Disease with 10-Year Voice Recordings

arXiv eess.AS June 02, 2026

⚡ First longitudinal voice dataset for RRP detection tracks patients for up to a decade.

Deep Dive

Researchers from multiple institutions have released RRP-Voice, the first longitudinal dataset for detecting Recurrent Respiratory Papillomatosis (RRP), an HPV-induced laryngeal disease. The dataset comprises voice recordings from 26 patients followed for up to ten years, pairing sustained vowels with sentence-level utterances. Each session is annotated by otolaryngologists and confirmed against concurrent laryngoscopy, so that labels for disease recurrence and post-surgical remission rest on direct examination of the larynx.

The accompanying benchmark evaluates four classes of approach: handcrafted features, end-to-end deep networks, self-supervised pretrained models (e.g., wav2vec 2.0, HuBERT), and recent audio large language models. Under session-level cross-validation with a patient-level audit, the results indicate that the discriminative signal reflects laryngoscopic disease state rather than stable speaker identity. The work lays a foundation for longitudinal voice analysis of rare pathologies in low-resource clinical settings, and for continuous voice monitoring of RRP patients.
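To make the evaluation protocol more concrete, here is a minimal Python sketch of session-level splitting followed by a patient-level check. The summary does not define "patient-level audit", so this assumes one plausible reading: after splitting by session, report which held-out patients also appear in training, and compare against a stricter split that keeps each patient in a single fold. The session records, the function names, and the round-robin fold assignment are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical session records: (session_id, patient_id, label).
# label 1 = active disease at laryngoscopy, 0 = remission (illustrative values only).
SESSIONS = [
    ("s1", "A", 1), ("s2", "A", 0),
    ("s3", "B", 1), ("s4", "B", 0),
    ("s5", "C", 1), ("s6", "C", 0),
]


def session_level_folds(sessions, k):
    """Assign whole sessions to k folds round-robin.

    A session is the unit of splitting, so recordings from one session
    never straddle train and test. Sessions from the same patient may
    still land in different folds.
    """
    folds = [[] for _ in range(k)]
    for i, session in enumerate(sessions):
        folds[i % k].append(session)
    return folds


def patient_grouped_folds(sessions, k):
    """Stricter comparison split: all sessions of a patient share one fold."""
    by_patient = defaultdict(list)
    for session in sessions:
        by_patient[session[1]].append(session)
    folds = [[] for _ in range(k)]
    for i, patient_id in enumerate(sorted(by_patient)):
        folds[i % k].extend(by_patient[patient_id])
    return folds


def patient_overlap_audit(folds):
    """For each held-out fold, list patients who also appear in training."""
    report = []
    for i, test_fold in enumerate(folds):
        train_patients = {
            patient_id
            for j, fold in enumerate(folds) if j != i
            for (_, patient_id, _) in fold
        }
        test_patients = {patient_id for (_, patient_id, _) in test_fold}
        report.append(sorted(test_patients & train_patients))
    return report


if __name__ == "__main__":
    print(patient_overlap_audit(session_level_folds(SESSIONS, 3)))
    print(patient_overlap_audit(patient_grouped_folds(SESSIONS, 3)))
```

With a session-level split, every held-out patient in this toy data is also seen in training, so a model could in principle score well by recognizing speakers. If performance holds up under the patient-grouped split as well, that is one kind of evidence that the signal tracks disease state rather than identity.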

Key Points

First longitudinal voice dataset for RRP, covering 26 patients across up to 10 years of follow-up
Dataset pairs sustained vowels with sentence-level speech, annotated via laryngoscopy-confirmed disease state
Benchmark evaluates 4 model classes: handcrafted features, deep nets, self-supervised, and audio LLMs

Why It Matters

Lays the groundwork for continuous voice monitoring of RRP, which could reduce how often patients need invasive laryngoscopy.
