FEFormer: Frequency-Enhanced ViT Outperforms Existing Methods on Medical Image Segmentation
New architecture blends frequency decomposition with transformers for sharper organ and lesion boundaries.
A new paper on arXiv from Jin Yang and team introduces FEFormer (Frequency-enhanced Vision Transformer), designed to overcome key limitations of standard Vision Transformers (ViTs) in medical image segmentation. While ViTs excel at capturing global context, they struggle with the fine-grained local features needed to delineate anatomical structures. FEFormer addresses this by incorporating frequency-domain processing into its core modules. The architecture includes a Frequency-enhanced Dynamic Self-Attention (FDSA) module that combines locality-preserving convolution with frequency-domain attention to jointly capture local details and global dependencies. A Frequency-decomposed Gating MLP (FGMLP) adaptively models low- and high-frequency components, enhancing both semantic and structural representations.
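To make the low- versus high-frequency idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes a single 2D feature map, a hard radial mask in the Fourier domain, and fixed scalar gates. The names `split_frequencies`, `gated_mix`, and `cutoff` are hypothetical, and FGMLP is described as adaptive, so it presumably learns its gating rather than using constants.

```python
import numpy as np

def split_frequencies(x, cutoff=0.25):
    """Split a 2D array into low- and high-frequency parts.

    cutoff (hypothetical parameter) is a radius in cycles per sample;
    frequencies at or below it count as "low".
    """
    h, w = x.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
    low_mask = np.sqrt(fx**2 + fy**2) <= cutoff
    # Keep only the low frequencies, then return to the spatial domain.
    low = np.fft.ifft2(np.fft.fft2(x) * low_mask).real
    high = x - low                    # residual holds edges and fine texture
    return low, high

def gated_mix(low, high, gate_low=1.0, gate_high=1.5):
    """Recombine the two bands with fixed gates (assumed; a model would learn them)."""
    return gate_low * low + gate_high * high

feature_map = np.random.default_rng(0).normal(size=(8, 8))
low, high = split_frequencies(feature_map)
sharpened = gated_mix(low, high)      # gate_high > 1 emphasizes boundaries
```

With both gates set to 1.0 the mix returns the input unchanged, which is a handy sanity check; raising `gate_high` emphasizes boundary detail, while raising `gate_low` emphasizes smooth, region-level content.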
Additionally, FEFormer features a Wavelet-guided Adaptive Feature Fusion (WAFF) module for semantically consistent encoder-decoder integration in the frequency domain, and a Frequency-enabled Cross-scale Stem Bridge (FCSB) that improves low-level feature propagation across scales. The model was evaluated on four diverse volumetric medical image segmentation tasks and achieved superior performance compared to existing methods while maintaining high computational efficiency. By explicitly handling frequency information, FEFormer represents a promising direction for improving segmentation accuracy in clinical applications such as diagnosis and treatment planning.
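The idea of fusing encoder and decoder features in a wavelet domain can be illustrated with a one-level 2D Haar transform. This is a hedged sketch, not WAFF itself: it works on 2D arrays with even height and width (the paper targets volumetric data), the function names are hypothetical, and the per-subband blend weights are fixed constants where an adaptive module would compute them.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar transform of an array with even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2  # local average (low frequency)
    lh = (a - b + c - d) / 2  # differences across columns
    hl = (a + b - c - d) / 2  # differences across rows
    hh = (a - b - c + d) / 2  # diagonal differences
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.result_type(ll, lh, hl, hh))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x

def fuse_in_wavelet_domain(enc, dec, weights=(0.3, 0.7, 0.7, 0.7)):
    """Blend two maps subband by subband; each weight is the encoder's share."""
    bands = [
        w * e + (1 - w) * d
        for w, e, d in zip(weights, haar_dwt2(enc), haar_dwt2(dec))
    ]
    return haar_idwt2(*bands)

rng = np.random.default_rng(0)
encoder_map = rng.normal(size=(4, 6))
decoder_map = rng.normal(size=(4, 6))
fused = fuse_in_wavelet_domain(encoder_map, decoder_map)
```

The default weights are one plausible choice, not a value from the paper: they take most of the coarse band from the decoder, which carries more semantic context, and most of the detail bands from the encoder skip connection, which retains sharper edges.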
- FEFormer introduces four novel frequency-aware modules: FDSA, FGMLP, WAFF, and FCSB.
- It captures both fine-grained local details and global context by combining locality-preserving convolution with frequency-domain attention.
- It outperforms existing methods on four volumetric medical image segmentation tasks while maintaining high computational efficiency.
Why It Matters
More precise segmentation of organs and lesions could improve diagnosis, prognosis, and treatment planning in clinical settings.