Automated Measurement of Geniohyoid Muscle Thickness During Speech Using Deep Learning and Ultrasound
A deep learning model measures geniohyoid muscle thickness during speech, achieving near-human accuracy and revealing vowel-specific patterns.
A research team from multiple institutions has developed SMMA (Speech Muscle Measurement Automation), an AI framework that automates the measurement of geniohyoid muscle thickness during speech production. SMMA combines deep-learning segmentation of ultrasound images with skeleton-based thickness quantification, addressing the bottleneck of manual measurement that has historically limited large-scale studies of speech motor control. Validated against expert annotations, the system performs this specialized biomechanical analysis with near-human accuracy, which could allow phonetic and clinical studies at a much larger scale than manual annotation permits.
In technical validation, SMMA achieved a Dice similarity coefficient of 0.9037 and a mean absolute error of 0.53 mm. Applied to Cantonese vowel production in 11 subjects, the framework revealed systematic, quantifiable patterns: muscle thickness during /a:/ production (7.29 mm) was significantly greater than during /i:/ (5.95 mm), with a large effect size (Cohen's d > 1.3). It also detected the expected 5-8% anatomical scaling differences between sexes. This objective, automated measurement could enable large-scale investigations into speech physiology and provide a tool for the objective assessment and monitoring of speech and swallowing disorders such as dysarthria.
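The two validation metrics reported above have standard definitions, which the following minimal sketch illustrates. It is not the SMMA implementation: the function names, the nested-list mask representation, and the toy inputs are assumptions made for illustration only.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks (nested lists of 0/1)."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    pred_total = 0
    true_total = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        if len(pred_row) != len(true_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, true_row):
            pred_total += 1 if p else 0
            true_total += 1 if t else 0
            intersection += 1 if (p and t) else 0
    denominator = pred_total + true_total
    if denominator == 0:
        # Both masks are empty: treat as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / denominator


def mean_absolute_error(predicted_mm, reference_mm):
    """Mean of |predicted - reference| over paired thickness measurements (mm)."""
    if len(predicted_mm) != len(reference_mm) or not predicted_mm:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted_mm, reference_mm))
    return total / len(predicted_mm)


# Toy inputs, invented for illustration; not data from the study.
toy_pred = [[0, 1, 1], [0, 1, 0]]
toy_true = [[0, 1, 0], [0, 1, 1]]
toy_dice = dice_coefficient(toy_pred, toy_true)
toy_mae = mean_absolute_error([7.0, 6.0], [7.5, 5.5])
```

A Dice score of 1.0 means the predicted and expert segmentations overlap perfectly, while the MAE expresses the average thickness disagreement directly in millimetres.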
- The SMMA AI framework automates ultrasound-based measurement of geniohyoid muscle thickness with near-expert accuracy (Dice = 0.9037, MAE = 0.53 mm).
- Applied to Cantonese vowels, it showed that /a:/ is produced with about 22% greater muscle thickness (7.29 mm) than /i:/ (5.95 mm), quantifying vowel-specific differences in muscle configuration.
- Eliminates time-consuming manual annotation, enabling scalable research for speech motor control and objective clinical assessment of disorders.
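The percentage and effect-size figures in the points above follow from simple arithmetic, sketched below. The relative difference uses the two reported means. The Cohen's d function uses the pooled-standard-deviation form for two independent samples, which is an assumption: the summary does not state which formulation the authors used, and the sample values passed to it are invented toy numbers, not study data.

```python
import math
import statistics


def relative_difference_percent(value, baseline):
    """Percentage by which `value` exceeds `baseline`."""
    return (value - baseline) / baseline * 100.0


def cohens_d_pooled(sample_a, sample_b):
    """Cohen's d with a pooled standard deviation (independent-samples form, assumed)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd


# Reported means from the summary: /a:/ = 7.29 mm, /i:/ = 5.95 mm.
percent_greater = relative_difference_percent(7.29, 5.95)
print(round(percent_greater, 1))  # 22.5

# Hypothetical per-subject thickness values (mm), for illustration only.
toy_a = [7.0, 7.5, 8.0]
toy_i = [5.5, 6.0, 6.5]
toy_d = cohens_d_pooled(toy_a, toy_i)
```

By the usual convention, values of d above 0.8 are considered large, so the reported d > 1.3 indicates that the two vowel distributions are well separated relative to their spread.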
Why It Matters
Provides an objective, scalable tool for researching speech production and clinically assessing disorders, moving beyond subjective evaluations.