Trust, Safety, and Accuracy: Assessing LLMs for Routine Maternity Advice
A study suggests AI models such as ChatGPT-4o and Perplexity AI could help nearly 400 million rural women access pregnancy information.
A new study from Indian researchers evaluates whether large language models (LLMs) can deliver reliable maternity advice in rural India, where access to medical professionals is limited. With over 830 million internet users and nearly half of rural women now online, digital tools present an opportunity to close that gap. The team tested three leading models (ChatGPT-4o, Perplexity AI, and GeminiAI) against responses from maternal health professionals on 17 pregnancy-focused questions.
Results showed a clear division of strengths. Perplexity AI's responses most closely matched the semantic meaning of the expert answers, suggesting strong alignment with their content. OpenAI's ChatGPT-4o, however, produced clearer, more understandable text and made better use of medical terminology, a key factor in effective patient education. The evaluation measured content quality with metrics such as semantic similarity, noun overlap, and readability scores.
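To make those three metric families concrete, here is a minimal Python sketch of how a model answer might be scored against an expert answer. It is not the study's implementation, which the summary does not describe. As simplifying assumptions, bag-of-words cosine similarity stands in for embedding-based semantic similarity, content-word Jaccard overlap stands in for noun overlap (the standard library has no part-of-speech tagger), and Flesch Reading Ease with a rough syllable heuristic stands in for whichever readability scores the authors used. The function names, the `STOPWORDS` set, and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list used to approximate "content words".
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "are", "for", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity of bag-of-words count vectors (a crude proxy for semantic similarity)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_word_overlap(a, b):
    """Jaccard overlap of non-stopword tokens (a stand-in for POS-tagged noun overlap)."""
    sa = set(tokenize(a)) - STOPWORDS
    sb = set(tokenize(b)) - STOPWORDS
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def count_syllables(word):
    """Approximate syllables as the number of vowel groups, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


if __name__ == "__main__":
    # Illustrative sentences only; these are not taken from the study's question set.
    expert = "Take iron and folic acid tablets daily. Eat green vegetables."
    model = "Daily iron and folic acid supplementation is recommended during pregnancy."
    print("cosine similarity:", round(cosine_similarity(expert, model), 3))
    print("content-word overlap:", round(content_word_overlap(expert, model), 3))
    print("readability (expert):", round(flesch_reading_ease(expert), 1))
    print("readability (model):", round(flesch_reading_ease(model), 1))
```

A sketch like this also shows why rankings can differ by metric: an answer can share most of its vocabulary with the expert text and still score poorly on readability if it uses long sentences and polysyllabic terms.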
The study, published on arXiv, highlights the urgent need for AI tools that balance accuracy with clarity to improve healthcare communication. As internet penetration deepens in underserved regions, LLMs could act as scalable, first-line educational aids, helping to bridge the information gap for millions. The findings underscore that model performance varies significantly by metric, suggesting future tools may need to combine retrieval-augmented generation (RAG) for accuracy with advanced natural language generation for clarity.
- ChatGPT-4o produced the clearest, most understandable text with better medical terminology for patients.
- Perplexity AI most closely matched the semantic meaning and factual content of expert-provided answers.
- The study tested three models on 17 pregnancy-focused questions, with the aim of addressing healthcare gaps for nearly 400 million rural women online.
Why It Matters
Demonstrates AI's potential as a scalable, first-line health education tool in regions with severe doctor shortages.