IQRA 2026: Interspeech Challenge on Automatic Pronunciation Assessment for Modern Standard Arabic (MSA)
The second IQRA challenge delivered a major performance jump, driven by a new mispronunciation dataset and techniques such as large audio-language models.
A consortium of researchers from institutions including the Qatar Computing Research Institute (QCRI) and Carnegie Mellon University has published the findings of the second IQRA Interspeech Challenge, which focused on advancing Automatic Mispronunciation Detection and Diagnosis (MDD) for Modern Standard Arabic (MSA). A key innovation this year was the release of 'Iqra_Extra_IS26,' a new dataset of authentic human mispronunciations, which complemented existing training resources and provided a more realistic benchmark for model evaluation.
Participants employed a wide array of techniques, from Connectionist Temporal Classification (CTC)-based self-supervised learning models to two-stage fine-tuning strategies and large audio-language models. This methodological diversity, combined with the new dataset, drove a significant performance leap: the best systems achieved a 0.28-point absolute increase in F1 score over the first IQRA challenge. The results mark a notable advancement for the field, establishing a stronger, more data-rich foundation for future research and for practical Arabic language-learning and speech-technology tools.
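To make the headline metric concrete, the sketch below shows how a detection F1 score can be computed for an MDD system at the phone level. This is an illustrative example only, not the challenge's official scoring script: the label lists, the function name `detection_f1`, and the binary labeling convention (1 = mispronounced) are all assumptions for the sake of the demo.

```python
# Illustrative sketch (not the official IQRA evaluation): phone-level
# mispronunciation-detection F1. Each phone gets a binary label:
# 1 if mispronounced, 0 if correct.

def detection_f1(reference, predicted):
    """Return (precision, recall, F1) for binary mispronunciation labels."""
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical labels for one utterance of eight phones.
ref = [0, 1, 0, 1, 1, 0, 0, 1]   # ground-truth mispronunciations
hyp = [0, 1, 0, 0, 1, 0, 1, 1]   # system detections
precision, recall, f1 = detection_f1(ref, hyp)
```

With these toy labels the system finds three of the four true mispronunciations and raises one false alarm, so precision, recall, and F1 all come out to 0.75; a 0.28-point gain on such a scale is substantial.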
- The challenge introduced 'Iqra_Extra_IS26,' a new dataset of authentic human mispronunciations for Modern Standard Arabic.
- Submitted systems used advanced methods such as CTC-based models and large audio-language models, yielding a 0.28-point F1 improvement over the first challenge.
- The results signal growing maturity in Arabic MDD research, enabling better automated pronunciation tutors and speech tools.
Why It Matters
This progress enables more effective, scalable tools for teaching Arabic pronunciation to millions of learners worldwide.