Bridging the Reproducibility Divide: Open Source Software's Role in Standardizing Healthcare AI
A new analysis reveals 74% of healthcare AI papers use private data or don't share code, creating a reproducibility crisis.
A new analysis by researchers John Wu, Zhenbang Wu, and Jimeng Sun, published on arXiv, exposes a reproducibility crisis in AI for Healthcare (AI4H). The study, titled "Bridging the Reproducibility Divide: Open Source Software's Role in Standardizing Healthcare AI," finds that despite growing awareness of the problem, 74% of recent AI4H publications still rely on private datasets or fail to share their modeling code. This lack of transparency is especially problematic in healthcare, where trust and clinical validation are paramount. The authors argue that inconsistent and poorly documented data preprocessing pipelines lead to variable performance reports, which makes it impossible to fairly evaluate or build upon proposed AI models and ultimately hinders progress in the field.
The paper also offers a strong incentive for change: according to its data, AI4H research that both uses public datasets and shares its code receives, on average, 110% more citations than work that does neither, which more than doubles its citation count. To address the crisis, the authors urge the AI4H community to adopt concrete open science practices, establish standardized guidelines for data preprocessing, and develop robust benchmarks. They conclude that tackling these challenges through open-source development is essential for creating AI models that are safe, effective, and truly beneficial for patient care, and for integrating AI into clinical settings in a more trustworthy way.
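To make the citation figure concrete: "110% more" means a ratio of 2.1 to the baseline, not 1.1. The short sketch below works through that arithmetic. The baseline of 10 citations and the function name are hypothetical, chosen only for illustration; they are not figures or code from the paper.

```python
def apply_relative_increase(baseline: float, percent_more: float) -> float:
    """Return the value that is `percent_more` percent above `baseline`."""
    return baseline * (1 + percent_more / 100)


# Hypothetical baseline for illustration only (not a number from the paper).
baseline_citations = 10.0

# "110% more citations" than the baseline.
open_citations = apply_relative_increase(baseline_citations, 110)

# Ratio of open (public data + shared code) work to the baseline.
ratio = open_citations / baseline_citations

print(f"{open_citations:.1f} citations, {ratio:.1f}x the baseline")
```

Under that assumed baseline, the ratio comes out above 2, which is why the result can be described as more than doubling.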
- 74% of analyzed AI for Healthcare (AI4H) papers use private data or don't share code, creating a reproducibility crisis.
- AI4H papers that both use public data and share their code receive 110% more citations on average than papers that do neither.
- The authors call for standardized data preprocessing and community-wide open-source practices to build trustworthy medical AI.
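As one illustration of what documenting a preprocessing pipeline could look like in practice, the sketch below fingerprints a preprocessing configuration so that two experiments can be checked for identical settings. This is not a method from the paper; the function name `fingerprint_config` and the configuration fields (`imputation`, `normalization`, `min_age`) are hypothetical.

```python
import hashlib
import json


def fingerprint_config(config: dict) -> str:
    """Return a SHA-256 hex digest of a preprocessing config.

    Keys are sorted before hashing, so the digest does not depend on
    the order in which the settings were written down.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical preprocessing settings, for illustration only.
config_a = {"imputation": "mean", "normalization": "z-score", "min_age": 18}
config_b = {"min_age": 18, "normalization": "z-score", "imputation": "mean"}
config_c = {**config_a, "imputation": "median"}

same_settings = fingerprint_config(config_a) == fingerprint_config(config_b)
changed_settings = fingerprint_config(config_a) != fingerprint_config(config_c)

print("a and b match:", same_settings)
print("a and c differ:", changed_settings)
```

Publishing such a digest alongside reported results would let readers confirm that a rerun used the same preprocessing choices, although it says nothing about whether those choices are appropriate.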
Why It Matters
For AI to be safely deployed in medicine, research must be reproducible; this study provides a clear roadmap and incentive.