EMSDialog: Synthetic Multi-person Emergency Medical Service Dialogue Generation from Electronic Patient Care Reports via Multi-LLM Agents
A new AI pipeline generates realistic multi-person emergency medical dialogues to train diagnostic AI.
A research team from the University of Virginia and Georgia Tech has introduced EMSDialog, a synthetic dataset designed to address a gap in medical AI training data. Existing medical dialogue datasets consist largely of one-on-one conversations and lack the multi-party workflow of real emergency medical services (EMS), where paramedics, dispatchers, and patients interact. To fill this gap, the team built an automated pipeline in which multiple LLM agents generate realistic conversations. The pipeline is grounded in real-world electronic Patient Care Reports (ePCRs) and follows a topic-flow-based generation process: agents iteratively plan the dialogue, generate turns, and self-refine, with rule-based checks for factual accuracy and conversational flow.
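The plan, generate, check, and refine loop described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the LLM agents are replaced by deterministic stand-in functions, and every name here (`plan_topics`, `generate_turn`, `rule_check`, `ALLOWED_ROLES`, the report fields) is a hypothetical chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical role set; the real pipeline's roles may differ.
ALLOWED_ROLES = {"paramedic", "patient"}


@dataclass
class Turn:
    speaker: str
    topic: str
    text: str


def plan_topics(report: dict[str, str]) -> list[str]:
    """Planner stand-in: one topic per non-empty report field, in report order."""
    return [topic for topic, fact in report.items() if fact]


def generate_turn(report: dict[str, str], topic: str,
                  feedback: Optional[str] = None) -> Turn:
    """Generator stand-in for an LLM call.

    The first draft omits the fact on purpose so the refinement path runs;
    a draft produced after feedback states the fact from the report.
    """
    if feedback is None:
        return Turn("paramedic", topic, f"Let me note the {topic}.")
    return Turn("paramedic", topic, f"Noting {topic}: {report[topic]}.")


def rule_check(report: dict[str, str], turn: Turn) -> Optional[str]:
    """Rule-based check: return a problem description, or None if the turn passes."""
    if turn.speaker not in ALLOWED_ROLES:
        return "unknown speaker role"
    if report[turn.topic].lower() not in turn.text.lower():
        return "turn does not state the fact from the report"
    return None


def generate_dialogue(report: dict[str, str], max_refinements: int = 2) -> list[Turn]:
    """Plan topics, draft a turn per topic, and refine drafts that fail the checks."""
    dialogue: list[Turn] = []
    for topic in plan_topics(report):
        turn = generate_turn(report, topic)
        for _ in range(max_refinements):
            problem = rule_check(report, turn)
            if problem is None:
                break
            turn = generate_turn(report, topic, feedback=problem)
        # Keep only turns that pass the checks after refinement.
        if rule_check(report, turn) is None:
            dialogue.append(turn)
    return dialogue


# Toy report with made-up fields; real ePCRs are far richer.
report = {"chief complaint": "chest pain", "blood pressure": "150/95", "allergies": ""}
dialogue = generate_dialogue(report)
for turn in dialogue:
    print(f"[{turn.topic}] {turn.speaker}: {turn.text}")
```

In this sketch the check only tests that a turn repeats the report's fact verbatim; a real system would need far richer rules for factual accuracy and conversational flow.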
The result is EMSDialog, a corpus of 4,414 synthetic multi-speaker EMS conversations. The dialogues are annotated with diagnosis labels drawn from 43 possible diagnoses, speaker roles (e.g., paramedic, patient), and turn-level topics, providing rich training signals. Human evaluators and LLM-based metrics both rated the dataset as realistic and coherent. The research also shows that models trained with EMSDialog improve at conversational diagnosis prediction: they are more accurate, reach a diagnosis earlier in the conversation, and are more stable than models trained on existing, less representative datasets. This work, accepted at ACL Findings 2026, provides a scalable method for creating specialized training data where real-world examples are scarce or privacy-sensitive.
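To make the annotation layers concrete, the following Python sketch shows one way an annotated dialogue could be represented, along with a simple promptness measure. The field names, the single diagnosis label per dialogue, the placeholder label "chest pain", and the `first_correct_turn` measure are all assumptions for illustration; they are not the dataset's actual schema or the paper's metric.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AnnotatedTurn:
    speaker_role: str  # e.g. "paramedic" or "patient"
    topic: str         # turn-level topic annotation
    text: str


@dataclass
class AnnotatedDialogue:
    turns: List[AnnotatedTurn]
    diagnosis: str     # assumed: one label per dialogue, for simplicity


def first_correct_turn(
    dialogue: AnnotatedDialogue,
    predict: Callable[[List[AnnotatedTurn]], str],
) -> Optional[int]:
    """Return the 1-based index of the earliest turn after which the
    prediction on the conversation so far matches the label, or None."""
    for end in range(1, len(dialogue.turns) + 1):
        if predict(dialogue.turns[:end]) == dialogue.diagnosis:
            return end
    return None


def keyword_predictor(turns: List[AnnotatedTurn]) -> str:
    """Toy stand-in for a trained model: a single keyword rule."""
    text = " ".join(turn.text.lower() for turn in turns)
    return "chest pain" if "chest" in text else "unknown"


# Made-up example dialogue; the label is a placeholder, not a dataset label.
example = AnnotatedDialogue(
    turns=[
        AnnotatedTurn("paramedic", "greeting", "Hi, I'm with EMS. What is going on today?"),
        AnnotatedTurn("patient", "chief complaint", "My chest feels tight and it hurts."),
        AnnotatedTurn("paramedic", "vitals", "Blood pressure is 150 over 95."),
    ],
    diagnosis="chest pain",
)

print(first_correct_turn(example, keyword_predictor))
```

Feeding the model growing prefixes of the conversation mirrors the conversational setting, where a prediction is wanted before the dialogue ends. A lower turn index means an earlier correct prediction.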
- Generates 4,414 synthetic multi-speaker EMS dialogues from real electronic patient records using a multi-LLM agent pipeline.
- Annotated with 43 distinct diagnoses and turn-level topics, providing structured data for training diagnostic AI.
- Shown to improve AI model performance, boosting diagnosis prediction accuracy and timeliness in simulated EMS conversations.
Why It Matters
EMSDialog provides realistic training data for building AI assistants that can better support complex, multi-person emergency medical diagnostics.