A Multi-Model Approach to English-Bangla Sentiment Classification of Government Mobile Banking App Reviews
Classical ML models like Random Forest outperformed transformer models in analyzing 5,652 mobile banking app reviews.
A research team from Bangladesh has published a study of sentiment in English and Bangla reviews of government mobile banking apps. The researchers collected 11,414 raw Google Play reviews and filtered them down to 5,652 usable entries across four Bangladeshi banking applications. Labels came from a hybrid approach that combined star ratings with an independent XLM-RoBERTa classifier; agreement between the two labeling methods was moderate (kappa = 0.459). Surprisingly, traditional machine learning models consistently outperformed the more modern transformer-based approaches.
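To make the agreement figure concrete, here is a minimal sketch of how a kappa statistic between two labeling methods can be computed. It assumes the reported value is Cohen's kappa, which the summary does not state, and the label lists are hypothetical toy data, not the study's reviews.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: share of items where both methods give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each method's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods always emit the same single label
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: labels derived from star ratings vs. a classifier.
star_labels = ["pos", "pos", "neg", "neg"]
model_labels = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(star_labels, model_labels))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a value such as 0.459 reads as moderate even when the two methods agree on most reviews.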
Random Forest achieved the highest accuracy at 81.5%, while Linear SVM achieved the highest weighted F1 score of 0.804. Both outperformed the off-the-shelf XLM-RoBERTa model, and McNemar's test confirmed that the difference was statistically significant (p < 0.05). The most striking finding was a 16.1-percentage-point accuracy gap between Bangla and English text, which points to the need for better models for low-resource languages. Using DeBERTa-v3 for aspect-level analysis, the team identified transaction speed and poor interface design as the primary sources of user dissatisfaction, with the eJanata app receiving the worst ratings.
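McNemar's test compares two classifiers on the same test set by looking only at the items where exactly one of them is correct. The sketch below uses the exact binomial form of the test; the summary does not say which variant the authors used, and the function name and toy data are hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Return (b, c, p_value) for an exact two-sided McNemar test.

    b counts items only model A classifies correctly,
    c counts items only model B classifies correctly.
    """
    b = c = 0
    for truth, a, other in zip(y_true, pred_a, pred_b):
        a_ok = a == truth
        b_ok = other == truth
        if a_ok and not b_ok:
            b += 1
        elif b_ok and not a_ok:
            c += 1
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree in correctness
    # Under the null hypothesis, discordant items split 50/50 between b and c.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy data: model A is right on ten items that model B gets wrong.
y_true = [1] * 10
pred_a = [1] * 10
pred_b = [0] * 10
print(mcnemar_exact(y_true, pred_a, pred_b))
```

Because the test ignores items that both models get right or both get wrong, a small accuracy difference can still be significant when the disagreements fall mostly on one side.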
The study concludes with three concrete policy recommendations: remediation of app quality issues, trust-centered release management practices, and Bangla-first NLP adoption for state-owned banks. These findings provide a data-driven roadmap for improving digital financial inclusion in Bangladesh and similar markets where mobile banking serves as the primary gateway to financial services for millions of users.
- Random Forest achieved 81.5% accuracy, outperforming fine-tuned XLM-RoBERTa (79.3%) in sentiment classification
- A 16.1-percentage-point accuracy gap between Bangla and English text highlights the need for better low-resource language models
- Aspect analysis revealed transaction speed and poor UI design as primary pain points across 5,652 banking app reviews
Why It Matters
The study exposes gaps in AI support for low-resource languages and gives state-owned banks data-driven guidance for improving financial inclusion through better digital services.