Research & Papers

Disability-First AI Dataset Annotation: Co-designing Stuttered Speech Annotation Guidelines with People Who Stutter

New research reveals why current AI speech datasets fail people with disabilities.

Deep Dive

A new paper exposes critical flaws in how AI speech datasets are annotated for disabilities such as stuttering. The researchers found that these datasets are often labeled by crowdworkers with no lived experience of stuttering, producing inaccurate and inconsistent labels. Through co-design workshops with people who stutter (PWS), the team developed new annotation guidelines that integrate the embodied knowledge of PWS. The work highlights a tension between the complexity of disability and AI's need for simple labels, and advocates a 'disability-first' approach across the entire AI pipeline.

Why It Matters

Mislabeled training data propagates into deployed systems: the accuracy of voice assistants and speech recognition tools for millions of users with speech disabilities depends on how well their training datasets represent stuttered speech.