Research & Papers

SIDEKICK: A Semantically Integrated Resource for Drug Effects, Indications, and Contraindications

Researchers built a drug safety knowledge graph from more than 50,000 FDA drug labels using LLMs and Graph RAG, outperforming SIDER and OnSIDES on drug repurposing by side effect similarity.

Deep Dive

Researchers have published SIDEKICK, a semantically integrated knowledge graph of drug effects, indications, and contraindications. It addresses a gap in pharmacovigilance and clinical decision support by moving beyond limited terminologies such as MedDRA. The core of the work is a pipeline that uses Large Language Models (LLMs) to extract terms and graph retrieval-augmented generation (Graph RAG) to map them to ontologies, applied to more than 50,000 FDA Structured Product Labels.

Extracted terms are mapped to three resources: the Human Phenotype Ontology (HPO) for symptoms, the Mondo Disease Ontology (MONDO) for conditions, and RxNorm for drugs. The resulting dataset is serialized as a Resource Description Framework (RDF) graph, with the Semanticscience Integrated Ontology (SIO) as the upper-level ontology to maximize interoperability with other Semantic Web tools.

In benchmark tests, SIDEKICK outperformed the established databases SIDER and OnSIDES on drug repurposing by side effect similarity, which indicates practical utility for computational drug discovery. For automated safety surveillance and AI-driven research, it offers a richer, more connected dataset that supports semantic reasoning and phenotype-based analysis previously limited by disparate data sources.
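
To make the ontology mapping and RDF serialization more concrete, here is a minimal sketch of how mapped terms from one label could be written out as N-Triples. It is not SIDEKICK's actual schema: the predicate names (`hasSideEffect`, `hasIndication`, `hasContraindication`) and the `example.org` namespace are hypothetical stand-ins for the SIO terms the authors use, the helper names are invented for illustration, and the identifiers are placeholders rather than values taken from the dataset.

```python
# Sketch only: predicates live in a hypothetical namespace, not in SIO.
EX = "http://example.org/sidekick/"
OBO = "http://purl.obolibrary.org/obo/"
# Assumption: BioPortal-style IRIs for RxNorm concepts.
RXNORM = "http://purl.bioontology.org/ontology/RXNORM/"


def curie_to_iri(curie: str) -> str:
    """Expand a compact identifier such as 'HP:0002315' into a full IRI."""
    prefix, local_id = curie.split(":", 1)
    if prefix in ("HP", "MONDO"):
        return f"{OBO}{prefix}_{local_id}"
    if prefix == "RXNORM":
        return f"{RXNORM}{local_id}"
    raise ValueError(f"unknown prefix: {prefix}")


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one N-Triples statement linking two mapped terms."""
    return f"<{curie_to_iri(subject)}> <{EX}{predicate}> <{curie_to_iri(obj)}> ."


def label_to_triples(drug, side_effects=(), indications=(), contraindications=()):
    """Turn the mapped terms extracted from one drug label into triples."""
    groups = (
        ("hasSideEffect", side_effects),
        ("hasIndication", indications),
        ("hasContraindication", contraindications),
    )
    return [triple(drug, pred, term) for pred, terms in groups for term in terms]


if __name__ == "__main__":
    # Illustrative identifiers only.
    for line in label_to_triples(
        "RXNORM:1191",
        side_effects=["HP:0002315"],
        indications=["MONDO:0000001"],
    ):
        print(line)
```

Because every node is an ontology IRI rather than a free-text string, the output can be loaded into any RDF store and joined with other resources that use the same identifiers.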

Key Points
  • Built using LLM extraction & Graph RAG to process 50,000+ FDA drug labels
  • Maps terms to HPO, MONDO, and RxNorm ontologies for semantic interoperability
  • Outperforms SIDER and OnSIDES databases in drug repurposing by side effect similarity
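
The idea behind repurposing by side effect similarity can be sketched as follows: drugs whose side effect profiles overlap strongly are treated as candidates for sharing indications. The summary does not say which similarity measure the benchmark uses, so Jaccard similarity here is an assumption, and the drug names and profiles are toy data.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared side effects over all side effects."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_similar_drugs(query: str, profiles: dict, top_k: int = 3) -> list:
    """Rank other drugs by side effect overlap with the query drug."""
    query_effects = profiles[query]
    scored = [
        (drug, jaccard(query_effects, effects))
        for drug, effects in profiles.items()
        if drug != query
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Toy profiles keyed by hypothetical drug names; values are HPO-style IDs.
profiles = {
    "drug_a": {"HP:0002315", "HP:0002018", "HP:0002321"},
    "drug_b": {"HP:0002018", "HP:0002321", "HP:0000988"},
    "drug_c": {"HP:0000988"},
}

if __name__ == "__main__":
    for drug, score in rank_similar_drugs("drug_a", profiles):
        print(f"{drug}\t{score:.2f}")
```

Mapping side effects to a shared ontology matters for this kind of comparison: two labels that describe the same effect in different words only count as overlapping once both are normalized to the same term.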

Why It Matters

SIDEKICK provides a unified, machine-readable dataset that can speed up drug safety monitoring and the computational discovery of new treatments.