Loss Design and Architecture Selection for Long-Tailed Multi-Label Chest X-Ray Classification
A new approach using LDAM-DRW loss and ConvNeXt-Large architecture beats standard methods for detecting rare diseases in X-rays.
A new research paper presents a systematic approach to tackling one of medical AI's toughest problems: accurately detecting rare diseases in chest X-rays where common conditions dominate the data. Researcher Nikhileswara Rao Sulake's work, submitted to the CXR Long Tail Challenge at ISBI 2026, demonstrates how careful loss function design and architecture selection can significantly improve performance on long-tailed, multi-label classification tasks. The study evaluated methods on the CXR-LT 2026 benchmark containing approximately 143,000 images with 30 disease labels from PadChest, specifically focusing on improving recognition of clinically important but underrepresented findings.
The main gain comes from pairing label-distribution-aware margin (LDAM) loss with deferred re-weighting (DRW). The combination, LDAM-DRW, consistently outperformed standard binary cross-entropy and asymmetric losses for rare class recognition. Among the tested architectures, ConvNeXt-Large achieved the best single-model performance, with 0.5220 mAP and 0.3765 F1 on the development set, and classifier re-training and test-time augmentation improved results further. On the official test leaderboard, the submission achieved 0.3950 mAP, ranking 5th among 68 participating teams with 1,528 total submissions. The paper gives a candid analysis of the gap between development and test performance and offers practical insights for handling class imbalance in real clinical imaging settings. The code is publicly available for further research and implementation.
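To make the loss design concrete, here is a minimal sketch of the two ingredients behind LDAM-DRW: per-class margins that grow as a class gets rarer, and class weights that are switched on only after an initial training phase. This is not the paper's implementation. LDAM was originally formulated for single-label softmax classification, so the sigmoid-based multi-label form below is an assumption, as are the function names and the defaults (`max_margin`, `beta`, `scale`, `defer_epoch`).

```python
import math

def ldam_margins(class_counts, max_margin=0.5):
    """Margins proportional to n_j^(-1/4), rescaled so the rarest class gets max_margin."""
    raw = [1.0 / (n ** 0.25) for n in class_counts]
    factor = max_margin / max(raw)
    return [r * factor for r in raw]

def drw_weights(class_counts, epoch, defer_epoch, beta=0.9999):
    """Deferred re-weighting: uniform weights first, class-balanced weights later."""
    k = len(class_counts)
    if epoch < defer_epoch:
        return [1.0] * k
    # "Effective number of samples" weighting, normalized to sum to k.
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    norm = k / sum(raw)
    return [r * norm for r in raw]

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def ldam_drw_loss(logits, targets, margins, weights, scale=1.0):
    """Weighted sigmoid cross-entropy for one image (assumed multi-label form).

    The margin is subtracted from the logit of each positive label, so rare
    positives must be predicted with more confidence to reach the same loss.
    """
    total = 0.0
    for z, y, m, w in zip(logits, targets, margins, weights):
        if y == 1:
            total += w * softplus(-scale * (z - m))  # -log sigmoid(scale * (z - m))
        else:
            total += w * softplus(scale * z)         # -log(1 - sigmoid(scale * z))
    return total / len(logits)

# Hypothetical label counts: one common finding, two rare ones.
counts = [10000, 100, 10]
margins = ldam_margins(counts)
early = drw_weights(counts, epoch=5, defer_epoch=20)
late = drw_weights(counts, epoch=25, defer_epoch=20)
loss = ldam_drw_loss([2.0, -1.0, 0.5], [1, 0, 1], margins, late)
```

Deferring the re-weighting lets the network learn general features from the full data distribution before the rare labels are up-weighted, which is the motivation usually given for DRW.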
- LDAM-DRW loss outperformed binary cross-entropy and asymmetric losses at recognizing rare classes in imbalanced X-ray datasets
- ConvNeXt-Large achieved the best single-model performance, with 0.5220 mAP and 0.3765 F1 on the development set of a benchmark with roughly 143K images and 30 labels
- The submission ranked 5th out of 68 teams in the challenge, with 0.3950 mAP on the official test set
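For readers unfamiliar with the headline metric, the sketch below shows one common way to compute mean average precision (mAP) for multi-label predictions: average precision is computed per label from the ranked scores, then averaged across labels. The challenge's exact evaluation code is not described here, so the macro-averaging, the tie handling, and the function names are assumptions.

```python
def average_precision(scores, labels):
    """AP for one label: mean precision at the rank of each positive example."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precisions.append(hits / rank)
    # AP is undefined for a label with no positives.
    return sum(precisions) / hits if hits else float("nan")

def mean_average_precision(score_matrix, label_matrix):
    """Macro mAP: unweighted mean of AP over labels with at least one positive."""
    n_labels = len(score_matrix[0])
    aps = []
    for j in range(n_labels):
        ap = average_precision(
            [row[j] for row in score_matrix],
            [row[j] for row in label_matrix],
        )
        if ap == ap:  # NaN is the only value not equal to itself
            aps.append(ap)
    return sum(aps) / len(aps)

# Four images, two labels (toy data, not from the benchmark).
scores = [[0.9, 0.9], [0.8, 0.8], [0.7, 0.2], [0.6, 0.1]]
labels = [[1, 1], [0, 1], [1, 0], [0, 0]]
map_value = mean_average_precision(scores, labels)
```

Because every label counts equally in a macro average, a rare finding with a handful of positives moves the score as much as a common one, which is why this kind of metric rewards long-tail performance.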
Why It Matters
Improves AI's ability to spot rare but critical medical conditions that models trained on imbalanced real-world datasets often miss.