Ultrasound-based detection and malignancy prediction of breast lesions eligible for biopsy: A multi-center clinical-scenario study using nomograms, large language models, and radiologist evaluation
A new AI nomogram achieved 83.8% accuracy in malignancy prediction, outperforming expert radiologists and two ChatGPT models.
A new AI-powered diagnostic tool has outperformed both expert radiologists and general-purpose large language models such as ChatGPT at interpreting breast ultrasound scans. In a multi-center study published in Academic Radiology (2026), researchers from Iran and Turkey developed and validated a fused nomogram, a statistical model that combines standard BI-RADS features with 26 quantitative morphometric characteristics extracted from ultrasound images. The model was trained and tested on a dataset of 1,747 women with pathologically confirmed breast lesions. In pooled analysis, this integrated AI nomogram achieved the highest accuracy for both biopsy recommendation (83.0%, AUC 0.901) and malignancy prediction (83.8%, AUC 0.853).
The study's key finding is that this specialized, interpretable AI model consistently outperformed not only a morphometric-only nomogram and a BI-RADS-only nomogram but also three human radiologists (one senior, two general) and two variants of OpenAI's ChatGPT. The LLMs and the radiologists each interpreted the same de-identified breast lesion images independently. The AI nomogram's performance remained robust across two external validation cohorts, which supports its generalizability across different ultrasound platforms and patient populations. This suggests that purpose-built clinical AI can surpass both human experts and powerful but general LLMs in specific, high-stakes diagnostic tasks where precision is critical.
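To make the reported metrics more concrete, the sketch below shows how a nomogram-style model could turn lesion features into a malignancy probability, and how accuracy and AUC are then computed from those probabilities. It assumes the nomogram is backed by a logistic regression, which is common for clinical nomograms but is not confirmed here for this study. The feature names, weights, intercept, and toy lesions are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical coefficients for illustration only; not the study's fitted model.
WEIGHTS = {"birads_category": 0.9, "irregular_margin": 1.2, "aspect_ratio": 0.8}
INTERCEPT = -5.0


def nomogram_probability(features, weights=WEIGHTS, intercept=INTERCEPT):
    """Map a weighted sum of features to a probability with the logistic function."""
    score = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def accuracy(labels, probs, threshold=0.5):
    """Fraction of cases where the thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(labels, probs))
    return hits / len(labels)


def auc(labels, probs):
    """Chance that a random malignant case is ranked above a random benign one.

    Ties count as half a win (the standard rank-based definition of AUC).
    """
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy lesions (made up): (features, label) with label 1 = malignant, 0 = benign.
LESIONS = [
    ({"birads_category": 3, "irregular_margin": 0, "aspect_ratio": 0.6}, 0),
    ({"birads_category": 4, "irregular_margin": 1, "aspect_ratio": 1.1}, 1),
    ({"birads_category": 5, "irregular_margin": 1, "aspect_ratio": 1.3}, 1),
    ({"birads_category": 4, "irregular_margin": 0, "aspect_ratio": 0.7}, 0),
    ({"birads_category": 4, "irregular_margin": 1, "aspect_ratio": 0.9}, 0),
]

labels = [label for _, label in LESIONS]
probs = [nomogram_probability(features) for features, _ in LESIONS]

print("accuracy:", accuracy(labels, probs))
print("AUC:", auc(labels, probs))
```

In this toy data the two metrics disagree: one benign lesion crosses the 0.5 threshold and is misclassified, which lowers accuracy, yet every malignant lesion still receives a higher probability than every benign one, so the ranking-based AUC is unaffected. This is why studies such as this one typically report both figures.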
- The fused AI nomogram achieved 83.8% accuracy in predicting malignancy and 83.0% accuracy in recommending biopsies, outperforming all comparators.
- The model was tested on a large, multi-center dataset of 1,747 patients and validated across different populations and ultrasound systems.
- It outperformed three radiologists and two ChatGPT models, suggesting that specialized clinical AI can surpass both human experts and general LLMs on this task.
Why It Matters
This tool could reduce unnecessary, invasive breast biopsies by providing more accurate, data-driven recommendations, improving patient care and resource allocation.