Targeted Linguistic Analysis of Sign Language Models with Minimal Translation Pairs
Researchers find that state-of-the-art ASL-to-English models miss crucial non-manual signals.
A new benchmark dataset, ASL-MTP, uses minimal translation pairs, signed utterances that differ in a single linguistic feature, to probe how well sign language models capture specific linguistic phenomena. Testing a state-of-the-art ASL-to-English translation model, the authors found that it performs above chance on most phenomena but relies heavily on manual cues, often missing crucial non-manual signals such as facial expressions and upper-body movements. Spanning 12 types of linguistic phenomena, the dataset reveals a systematic weakness in current sign language AI.
- ASL-MTP covers 12 types of linguistic phenomena, each probed with minimal translation pairs for systematic evaluation
- State-of-the-art translation model relies heavily on manual cues, often missing critical non-manual signals like facial expressions and head movements
- Findings suggest current benchmarks may overstate model capabilities by not testing linguistic nuance
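The paper's exact scoring procedure isn't reproduced here, but minimal-pair evaluation typically reduces to a forced-choice test: the model's translation of each utterance should match its own reference better than its counterpart's, so chance performance is 50%. The Python sketch below illustrates the idea under stated assumptions; the `translate` interface, the `MinimalPair` fields, and the `SequenceMatcher`-based similarity are hypothetical stand-ins, not the authors' implementation.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class MinimalPair:
    """Two signed utterances that differ in one linguistic feature (hypothetical schema)."""
    phenomenon: str   # e.g. "negation", "conditional", "wh-question"
    video_a: str      # path to the first signed utterance
    video_b: str      # path to its minimally different counterpart
    ref_a: str        # English reference translation for video_a
    ref_b: str        # English reference translation for video_b


def similarity(hyp: str, ref: str) -> float:
    # Stand-in textual similarity; a real evaluation would likely use a
    # translation metric such as BLEU, chrF, or BLEURT.
    return SequenceMatcher(None, hyp.lower(), ref.lower()).ratio()


def pair_accuracy(translate, pairs: list[MinimalPair]) -> dict[str, float]:
    """Forced-choice accuracy per phenomenon.

    For each video, the model's translation should be closer to its own
    reference than to the counterpart's; chance level is 0.5.
    """
    correct: dict[str, int] = {}
    total: dict[str, int] = {}
    for p in pairs:
        for video, own_ref, other_ref in [
            (p.video_a, p.ref_a, p.ref_b),
            (p.video_b, p.ref_b, p.ref_a),
        ]:
            hyp = translate(video)  # hypothetical model interface
            hit = similarity(hyp, own_ref) > similarity(hyp, other_ref)
            correct[p.phenomenon] = correct.get(p.phenomenon, 0) + int(hit)
            total[p.phenomenon] = total.get(p.phenomenon, 0) + 1
    return {ph: correct[ph] / total[ph] for ph in total}
```

Under this setup, a model that tracks only the hands should score near chance on pairs whose contrast is carried by non-manual signals, which matches the failure mode the authors report.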
Why It Matters
Better sign language AI must read both the hands and the face; this benchmark exposes a critical blind spot in current models.