Sima AIunty: Caste Audit in LLM-Driven Matchmaking
Major AI models like GPT and Llama show up to 25% bias favoring same-caste matches.
A team of researchers, including Atharva Naik, Shounok Kar, and Koustuv Saha, has published a groundbreaking audit study titled 'Sima AIunty: Caste Audit in LLM-Driven Matchmaking.' The work systematically investigates how five major large language model (LLM) families—OpenAI's GPT, Google's Gemini, Meta's Llama, Alibaba's Qwen, and India's BharatGPT—evaluate matrimonial profiles when prompted to assess social acceptance, marital stability, and cultural compatibility. By varying caste identities (Brahmin, Kshatriya, Vaishya, Shudra, Dalit) and income levels across controlled profiles, the study provides a stark, data-driven look at algorithmic bias.
The analysis revealed a consistent and troubling pattern: all tested models reproduced existing caste hierarchies. Same-caste matches received the most favorable ratings, with scores up to 25% higher on a 10-point scale compared to inter-caste matches. Furthermore, the models' evaluations of inter-caste matches themselves followed the traditional caste hierarchy, systematically disadvantaging lower-caste profiles. This demonstrates that LLMs, trained on vast corpora of human-generated text, can absorb and amplify deep-seated social biases.
This research is significant because it moves beyond Western-centric bias audits (often focused on race or gender) to examine a culturally specific form of stratification critical to South Asian societies. The findings underscore that AI systems deployed in socially sensitive domains like matchmaking are not neutral arbiters but active participants that can reinforce historical forms of exclusion. The paper calls for culturally grounded evaluation frameworks and intervention strategies to prevent AI from cementing discriminatory social norms.
- Tested 5 LLM families (GPT, Gemini, Llama, Qwen, BharatGPT) on real matrimonial profiles, finding consistent caste-based bias.
- Same-caste matches were rated up to 25% higher than inter-caste matches, with ratings following traditional hierarchy.
- Highlights the risk of AI systems reinforcing harmful social stratification in sensitive domains without proper cultural auditing.
Why It Matters
Shows AI can automate and scale historical discrimination, demanding new, culturally-aware bias mitigation for global deployment.