Disentangled Learning Improves Implicit Neural Representations for Medical Reconstruction
Disentangled learning boosts MRI, CT, and PET reconstruction accuracy with up to 70% less training data.
Implicit neural representations (INRs) have shown promise for medical image reconstruction, but classical methods train a full network per patient from scratch, which is slow and yields suboptimal quality. Existing pre-training approaches struggle with catastrophic forgetting and require high-quality reference images. To address this, a team led by Qing Wu (including Xuanyu Tian, Chenhe Du, Haonan Zhang, Xiao Wang, Le Lu, and Yuyao Zhang) developed DisINR—a framework that explicitly disentangles shared anatomical priors from patient-specific features.
DisINR pairs a shared encoder-decoder with lightweight subject-specific encoders. The shared components are pre-trained directly on raw sensor measurements using differentiable forward models, eliminating the need for clean ground-truth images. At test time, only the lightweight subject-specific encoder is optimized while the shared encoder-decoder remains frozen, preventing catastrophic forgetting. In evaluations on three representative tasks (MRI, CT, and PET), DisINR achieved up to 2x faster convergence and significantly higher PSNR and SSIM than state-of-the-art INRs. This approach reduces computational overhead and improves generalizability across diverse patients and limited-data regimes.
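To make the architecture concrete, here is a minimal PyTorch sketch of a disentangled INR trained in the sensor domain. The class and function names, layer sizes, and the undersampled-MRI forward model (a masked 2D FFT) are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class DisentangledINR(nn.Module):
    """Sketch of a disentangled INR: a shared encoder-decoder pair that
    carries population-level anatomical priors, plus a small per-subject
    encoder. All names and sizes here are illustrative, not DisINR's."""
    def __init__(self, coord_dim=2, feat_dim=256, subj_dim=64):
        super().__init__()
        # Shared components: pre-trained across subjects, frozen at test time.
        self.shared_encoder = nn.Sequential(
            nn.Linear(coord_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )
        self.shared_decoder = nn.Sequential(
            nn.Linear(feat_dim + subj_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, 1),  # image intensity at each coordinate
        )
        # Lightweight subject-specific encoder: the only part tuned per patient.
        self.subject_encoder = nn.Sequential(
            nn.Linear(coord_dim, subj_dim), nn.ReLU(),
            nn.Linear(subj_dim, subj_dim),
        )

    def forward(self, coords):
        shared_feat = self.shared_encoder(coords)    # anatomical prior features
        subj_feat = self.subject_encoder(coords)     # patient-specific features
        return self.shared_decoder(torch.cat([shared_feat, subj_feat], dim=-1))

def measurement_loss(model, coords, mask, kspace):
    """Self-supervised loss in the sensor domain: push the predicted image
    through a differentiable forward model (here a masked 2D FFT, as in
    undersampled MRI) and compare against raw k-space samples, so no clean
    ground-truth image is ever needed."""
    h = w = int(coords.shape[0] ** 0.5)              # assume a square grid
    image = model(coords).view(h, w)
    pred_kspace = torch.fft.fft2(image.to(torch.complex64)) * mask
    return (pred_kspace - kspace).abs().pow(2).mean()
```

Because the loss lives entirely in the measurement domain, pre-training the shared components never touches a clean reference image, which is what removes the high-quality-label requirement described above.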
- DisINR's shared encoder-decoder is pre-trained on raw measurements, removing the need for high-quality reference images and reducing data requirements by up to 70%.
- Test-time adaptation optimizes only a small per-subject encoder (≈1% of total parameters), preserving learned population priors and cutting fine-tuning time threefold (see the sketch after this list).
- Outperforms state-of-the-art INRs on MRI, CT, and PET reconstruction tasks with 1–2 dB higher PSNR and 0.05 higher SSIM on average.
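The test-time step from the second bullet can be sketched in a few lines, reusing the hypothetical `DisentangledINR` model and `measurement_loss` from the earlier snippet. The freezing pattern is the point; the step count and learning rate are placeholder assumptions.

```python
def adapt_to_subject(model, coords, mask, kspace, steps=500, lr=1e-3):
    """Test-time adaptation sketch: freeze the shared encoder-decoder so the
    population prior cannot drift, then fit only the per-subject encoder
    (a small fraction of all parameters) to this patient's raw measurements."""
    for p in model.shared_encoder.parameters():
        p.requires_grad_(False)
    for p in model.shared_decoder.parameters():
        p.requires_grad_(False)
    optimizer = torch.optim.Adam(model.subject_encoder.parameters(), lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = measurement_loss(model, coords, mask, kspace)
        loss.backward()   # gradients flow through the frozen decoder...
        optimizer.step()  # ...but only the subject encoder is updated
    return model
```

Freezing the shared weights is what prevents catastrophic forgetting: gradients still pass through the frozen decoder to reach the subject encoder, but the population prior itself never changes.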
Why It Matters
Faster, more accurate medical image reconstruction with less data means lower costs and better diagnosis for more patients.