Research & Papers

Meenz bleibt Meenz, but Large Language Models Do Not Speak Its Dialect

State-of-the-art AI models struggle with the Mainz dialect of German, scoring below 10% accuracy on definition and word generation tasks.

Deep Dive

Researchers Minh Duc Bui, Manuel Mager, Peter Herbert Kann, and Katharina von der Wense published the first NLP study on Meenzerisch, a dying German dialect spoken in Mainz. They built a dataset of 2,351 dialect words and tested LLMs on two tasks: definition generation and word generation. The best model reached only 6.27% accuracy on definitions and 1.51% on word generation, which shows how limited current LLMs are on low-resource language varieties.
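To make the accuracy figures concrete, here is a minimal sketch of how a score for the word generation task could be computed, assuming exact-match scoring after light normalization. This summary does not state how the authors scored their tasks, so the metric, the function names, and the placeholder strings below are all illustrative assumptions, not the study's method or data. Exact match would also be a poor fit for free-text definitions, which usually need human or model-based judgment.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Return the percentage of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Placeholder strings only (hypothetical); these are not real Meenzerisch words.
references = ["word_a", "word_b", "word_c", "word_d"]
predictions = ["Word_A ", "other", "word_c", "guess"]

score = exact_match_accuracy(predictions, references)
print(f"{score:.2f}% exact-match accuracy")
```

On this toy input two of the four predictions match, so the score is 50%. A figure like 1.51% means that, under whatever scoring the authors actually used, the model was counted correct on only a small fraction of the evaluated items.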

Why It Matters

The results highlight a critical blind spot in the linguistic capabilities of LLMs and underscore the urgent need for dialect preservation efforts.