BanglaSummEval: Reference-Free Factual Consistency Evaluation for Bangla Summarization
A new framework uses a single LLM to evaluate the factual consistency of Bangla summaries, reaching a 0.763 Spearman correlation with human expert judgments.
Researchers led by Ahmed Rafid introduce BanglaSummEval, a reference-free framework for evaluating factual consistency in Bangla text summarization. A single multilingual instruction-tuned language model generates and answers questions from the source document and the summary, and the resulting answers are compared using BERTScore-Recall. Validated on 300 human-written summaries, the framework achieves a strong correlation with expert judgments (Spearman's ρ = 0.763). The result is a practical, interpretable tool for assessing AI-generated content in critical domains such as healthcare and education, for a major under-resourced language.
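The answer-comparison step can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the question generation and answering have already been done by the language model, that the source-derived answer serves as the reference side of the recall computation, and that per-question scores are averaged into one summary-level score. The names `recall_score`, `consistency_score`, and `toy_embed` are hypothetical, and the one-hot toy embedding with English placeholder tokens stands in for the contextual embeddings that BERTScore actually uses.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recall_score(reference: List[Vector], candidate: List[Vector]) -> float:
    """BERTScore-style recall: match every reference token to its most
    similar candidate token, then average those maxima."""
    if not reference or not candidate:
        return 0.0
    best = [max(cosine(r, c) for c in candidate) for r in reference]
    return sum(best) / len(reference)


def consistency_score(
    answer_pairs: List[Tuple[str, str]],
    embed: Callable[[str], List[Vector]],
) -> float:
    """Average recall over (source_answer, summary_answer) pairs.
    Assumption: the source-derived answer is treated as the reference."""
    scores = [recall_score(embed(src), embed(summ)) for src, summ in answer_pairs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy embedding: one-hot vectors over a tiny fixed vocabulary.
VOCAB = ["dhaka", "capital", "1971", "1952"]


def toy_embed(text: str) -> List[Vector]:
    return [[1.0 if tok == w else 0.0 for w in VOCAB] for tok in text.lower().split()]


# One pair of answers agrees, the other does not.
pairs = [("dhaka", "dhaka"), ("1971", "1952")]
score = consistency_score(pairs, toy_embed)
```

In this toy setup the first answer pair matches and the second does not, so the summary-level score falls halfway between fully consistent and fully inconsistent. With real contextual embeddings the similarities would be graded rather than 0 or 1.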
Why It Matters
Enables reliable, automated fact-checking for AI-generated Bangla content in high-stakes fields, addressing a critical gap for 300 million speakers.