Research & Papers

Selective Attention-Based Network for Robust Infrared Small Target Detection

New U-Net variant pinpoints few-pixel targets with adaptive cross-scale fusion.

Deep Dive

Infrared small target detection (IRSTD) is crucial for applications like maritime surveillance, military search and rescue, and early warning systems, but faces fundamental challenges: targets often occupy only a few pixels, have low signal-to-clutter ratios, and are easily confused with complex backgrounds. Existing encoder-decoder architectures have two weaknesses: an information bottleneck in the early convolution layers, and static skip connections that cannot adapt to scene content.
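To make "low signal-to-clutter ratio" concrete, here is a minimal sketch of one common SCR definition: the absolute difference between mean target and mean background intensity, divided by the background standard deviation. The function name, the use of the whole image as background (rather than a local neighborhood), and the toy 5×5 image are assumptions for illustration and are not taken from the paper.

```python
import statistics


def signal_to_clutter_ratio(image, target_pixels):
    """SCR = |mean(target) - mean(background)| / std(background).

    image: 2D list of intensities.
    target_pixels: iterable of (row, col) positions belonging to the target.
    Assumption: every non-target pixel counts as background.
    """
    target_set = set(target_pixels)
    target = [image[r][c] for r, c in target_set]
    background = [
        value
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if (r, c) not in target_set
    ]
    clutter = statistics.pstdev(background)
    if clutter == 0:
        # Perfectly flat background: any contrast is infinitely detectable.
        return float("inf")
    contrast = abs(statistics.fmean(target) - statistics.fmean(background))
    return contrast / clutter


# Toy scene: checkerboard clutter (values 10 and 14) with a one-pixel target of 18.
image = [[10 if (r + c) % 2 == 0 else 14 for c in range(5)] for r in range(5)]
image[2][2] = 18
print(signal_to_clutter_ratio(image, [(2, 2)]))  # 3.0
```

Here the target is only 6 intensity units above the mean background while the clutter itself swings by ±2, which is the kind of regime where simple thresholding tends to produce false alarms.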

To overcome these limitations, researchers propose SANet (Selective Attention-based Network), a U-Net variant with two novel components. The Dual-path Semantic-aware Module (DSM) combines standard convolutions for local detail with pinwheel-shaped convolutions for direction-sensitive context, followed by a Convolutional Block Attention Module (CBAM) for fine-grained spatial-channel recalibration. The Selective Attention Fusion Module (SAFM) replaces traditional skip connections with a learnable, spatially adaptive weighting mechanism that fuses cross-scale features based on context, helping the network discriminate true targets from pseudo-target regions. According to the authors, this approach improves detection accuracy in cluttered infrared scenes.
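The idea behind replacing a static skip connection with spatially adaptive weights can be sketched as a per-pixel gate that blends the encoder (skip) feature with the upsampled decoder feature. This is a simplified stand-in, not the authors' SAFM: the names selective_fuse, w_skip, w_up, and bias are hypothetical, the maps are single-channel, and the gate is a plain sigmoid over a weighted sum of the two inputs with parameters that would be learned during training.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def selective_fuse(skip, up, w_skip, w_up, bias):
    """Blend two same-sized 2D feature maps with a per-pixel gate.

    gate = sigmoid(w_skip * skip + w_up * up + bias)
    out  = gate * skip + (1 - gate) * up

    Assumption: w_skip, w_up, and bias stand in for learned parameters.
    A static skip connection would use the same mix everywhere; here the
    mix depends on the feature values at each location.
    """
    fused = []
    for skip_row, up_row in zip(skip, up):
        fused_row = []
        for s, u in zip(skip_row, up_row):
            gate = sigmoid(w_skip * s + w_up * u + bias)
            fused_row.append(gate * s + (1.0 - gate) * u)
        fused.append(fused_row)
    return fused


# With all parameters at zero the gate is 0.5 everywhere (a plain average).
skip = [[0.0, 4.0]]
up = [[2.0, 0.0]]
print(selective_fuse(skip, up, 0.0, 0.0, 0.0))  # [[1.0, 2.0]]
```

Because the output is a convex combination of the two inputs, each fused value stays between the skip and decoder values at that pixel; training would then shift the gate toward whichever source is more informative at each location.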

Key Points
  • Uses pinwheel-shaped convolutions in DSM to expand receptive fields directionally, capturing contextual clues for few-pixel targets
  • SAFM replaces static skip connections with learnable, spatially adaptive weights for context-aware cross-scale feature fusion
  • Achieves robust detection of few-pixel targets (typical size ≤ 3×3 pixels) with low signal-to-clutter ratios in cluttered infrared backgrounds

Why It Matters

Enables reliable detection of tiny, dim targets in noisy IR scenes, critical for defense and surveillance systems.