NIH-MPINet: A Large-Scale Feature-Rich Network Dataset for Mapping the Frontiers of Team Science
A new dataset tracks 30,127 PIs across 86,743 grants from 2006 to 2023.
Researchers from NIH and partner institutions have released NIH-MPINet, a network dataset that maps collaboration among multiple Principal Investigators (multi-PIs) on NIH R01-equivalent grants from 2006 to 2023. The dataset, curated from NIH RePORTER and PubMed, comprises 30,127 PIs as nodes and 86,743 grants as edges, spanning 888 recipient organizations and supported by 40 NIH Institutes and Centers. Each node carries PI affiliation metadata, and each edge carries the grant year, title, and abstract, so collaboration patterns can be analyzed alongside research themes.
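To make the node-and-edge structure concrete, here is a minimal sketch in plain Python. It is not the dataset's actual schema or loading code: the field names (`grant_id`, `pi_ids`, and so on), the toy records, and the choice to link every pair of PIs named on the same grant are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Grant:
    """One multi-PI grant; field names are hypothetical, not the released schema."""
    grant_id: str
    year: int
    title: str
    abstract: str
    pi_ids: tuple  # identifiers of the PIs named on the grant


def build_network(grants):
    """Map each unordered PI pair to the list of grants the two PIs share.

    Assumption: a grant with k PIs contributes an edge for every pair of them.
    """
    edges = defaultdict(list)
    for grant in grants:
        for a, b in combinations(sorted(set(grant.pi_ids)), 2):
            edges[(a, b)].append(grant)
    return dict(edges)


def collaborator_counts(edges):
    """Count the distinct collaborators of each PI."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {pi: len(others) for pi, others in neighbors.items()}


# Node metadata: PI identifier -> affiliation (toy values, invented here).
affiliations = {"pi_1": "Org A", "pi_2": "Org A", "pi_3": "Org B"}

# Toy grant records, invented for illustration only.
grants = [
    Grant("G1", 2010, "Toy title A", "Toy abstract A", ("pi_1", "pi_2")),
    Grant("G2", 2015, "Toy title B", "Toy abstract B", ("pi_1", "pi_2", "pi_3")),
]

edges = build_network(grants)
collaborators = collaborator_counts(edges)
```

Keeping the full grant record on each edge means the year, title, and abstract remain available for any later temporal or topic analysis, and a pair of PIs who share several grants keeps all of them rather than collapsing to a single link.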
The team constructed a PI collaboration network from this data and identified 19 distinct communities and 20 major research topics. Communities showed specialized thematic profiles, such as cardiovascular health, cancer immunotherapy, neuroscience, and microbiome research, with genetics and genomics broadly represented across communities. Temporal analysis revealed significant shifts: healthcare and outcomes research, cognitive health, and Alzheimer's disease have grown in prominence, while molecular and cellular biology has declined. The resource is designed to advance statistical learning methods and network analysis for studying long-term biomedical collaboration.
- Dataset includes 30,127 PIs and 86,743 grants from 2006 to 2023, covering 888 recipient organizations and 40 NIH Institutes and Centers
- Analysis identified 19 collaboration communities and 20 major research topics, including cardiovascular health and cancer immunotherapy
- Temporal trends show growth in healthcare and outcomes research, cognitive health, and Alzheimer's disease research, with a decline in molecular and cellular biology
Why It Matters
This dataset enables researchers to quantify and predict collaboration trends in biomedical science, informing funding and policy decisions.