Research & Papers

An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models

New LLM-based system extracts, standardizes, and ranks patient phenotypes from clinical notes, reaching an ontology-based similarity score of 0.70 versus 0.58 for PhenoBERT.

Deep Dive

A multi-institutional research team from Vanderbilt University Medical Center and the Undiagnosed Diseases Network has developed RARE-PHENIX, an AI framework that automates rare disease phenotyping from unstructured clinical notes. Unlike previous approaches that focused on individual components, RARE-PHENIX models the complete clinical workflow: extracting phenotypic features from text, standardizing them to Human Phenotype Ontology (HPO) terms, and ranking the most diagnostically informative phenotypes. The system was trained on data from 2,671 patients across 11 clinical sites and validated on 16,357 real-world clinical notes, where it outperformed existing methods.
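The extract, standardize, and rank workflow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the RARE-PHENIX implementation: a keyword matcher stands in for the LLM extraction step, a small hand-written table stands in for ontology-grounded standardization, and made-up scores stand in for the trained ranker. The HPO identifiers are included only as examples and should be checked against a current HPO release.

```python
import re

# Hypothetical phrase list standing in for the LLM-based extraction step.
PHRASES = ["seizures", "developmental delay", "small head size", "fever"]

# Toy synonym table standing in for ontology-grounded standardization.
PHRASE_TO_HPO = {
    "seizures": ("HP:0001250", "Seizure"),
    "developmental delay": ("HP:0001263", "Global developmental delay"),
    "small head size": ("HP:0000252", "Microcephaly"),
    "fever": ("HP:0001945", "Fever"),
}

# Made-up informativeness scores standing in for a supervised ranker.
TOY_SCORE = {
    "HP:0000252": 0.9,
    "HP:0001263": 0.8,
    "HP:0001250": 0.7,
    "HP:0001945": 0.1,
}


def extract_mentions(note):
    """Stage 1: find phenotype phrases in free text (keyword stand-in for an LLM)."""
    text = note.lower()
    return [p for p in PHRASES if re.search(r"\b" + re.escape(p) + r"\b", text)]


def standardize(mentions):
    """Stage 2: map each mention to an (HPO id, label) pair, dropping duplicates."""
    seen, terms = set(), []
    for mention in mentions:
        term = PHRASE_TO_HPO.get(mention)
        if term is not None and term[0] not in seen:
            seen.add(term[0])
            terms.append(term)
    return terms


def rank(terms):
    """Stage 3: order terms by score, most informative first."""
    return sorted(terms, key=lambda t: TOY_SCORE.get(t[0], 0.0), reverse=True)


def phenotype(note):
    """Run the three stages in sequence on one note."""
    return rank(standardize(extract_mentions(note)))


note = "Patient has a fever today. History of seizures and small head size."
ranked = phenotype(note)
for hpo_id, label in ranked:
    print(hpo_id, label)
```

In this toy run the non-specific finding (fever) lands at the bottom of the list, which is the kind of ordering a ranking stage is meant to produce for a reviewing clinician.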

The framework's three-stage architecture (LLM-based extraction, ontology-grounded standardization, and supervised ranking) achieved an ontology-based similarity score of 0.70, compared with 0.58 for PhenoBERT. Each module contributed to the gain, and ablation studies confirmed the value of modeling the full phenotyping workflow rather than treating it as a single extraction task. By producing structured, ranked phenotypes that align closely with clinician curation, RARE-PHENIX supports human-in-the-loop rare disease diagnosis at scale, potentially reducing diagnostic delays that currently average 4-8 years for rare disease patients.
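This summary does not say how the ontology-based similarity score is computed, so the following is only a sketch of one plausible style of metric, not the paper's definition. It assumes a Jaccard overlap between predicted and reference term sets after each set is expanded with its ancestors, so that a near miss (a parent term instead of the exact term) still earns partial credit. The hierarchy and term names are hypothetical and are not real HPO entries.

```python
# Toy is-a hierarchy (child -> parents); names are hypothetical, not real HPO terms.
TOY_PARENTS = {
    "root": [],
    "nervous": ["root"],
    "seizure": ["nervous"],
    "focal_seizure": ["seizure"],
    "growth": ["root"],
    "short_stature": ["growth"],
}


def ancestors(term):
    """Return the term together with all of its ancestors in the toy hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(TOY_PARENTS[current])
    return seen


def closure(terms):
    """Expand a set of terms with every ancestor of every member."""
    expanded = set()
    for term in terms:
        expanded |= ancestors(term)
    return expanded


def ontology_jaccard(predicted, reference):
    """Jaccard overlap of the ancestor-expanded term sets (assumed metric)."""
    a, b = closure(predicted), closure(reference)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


exact = ontology_jaccard({"focal_seizure"}, {"focal_seizure"})
near = ontology_jaccard({"seizure"}, {"focal_seizure"})
far = ontology_jaccard({"short_stature"}, {"focal_seizure"})
print(exact, near, round(far, 3))
```

Under these assumptions an exact match scores 1.0, a parent-level match scores 0.75, and an unrelated term that shares only the root scores about 0.17, which shows why an ontology-aware score is more forgiving than exact string matching.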

Key Points
  • RARE-PHENIX achieved an ontology-based similarity score of 0.70 vs. PhenoBERT's 0.58 on 16,357 clinical notes
  • Trained on data from 2,671 patients across 11 Undiagnosed Diseases Network clinical sites
  • Three-stage architecture: LLM extraction, HPO standardization, and supervised ranking of phenotypes

Why It Matters

By automating phenotype extraction from clinical notes, RARE-PHENIX could help shorten rare disease diagnostic delays, which currently average 4-8 years.