Abstract:
Current CNN-based infrared small-target detection methods struggle to capture long-range dependencies, whereas transformer-based methods suffer from high computational complexity and poor local feature learning. To address these limitations, this study proposes Split Rolling-Unet (SR-Unet), an infrared small-target detection network that combines MLP and CNN components. Building on Rolling-Unet, the algorithm adds multiscale deep supervision fusion. The constructed MSORMLP and LIEM capture multidirectional long-range dependencies while integrating local context information. Comparative experiments were conducted on the public NUAA-SIRST and NUDT-SIRST datasets. The results show that SR-Unet has only 2.01 M parameters yet outperforms current mainstream infrared small-target detection algorithms across multiple evaluation metrics. Ablation experiments show that the improved algorithm increases IoU from 0.7463 to 0.7851, F1 score from 0.8547 to 0.8796, and detection probability (Pd) from 93.92% to 96.2%. SR-Unet thus achieves higher detection accuracy and a higher detection probability in infrared small-target detection tasks.
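
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how IoU and F1 score can be computed for a binary segmentation mask. It assumes the common pixel-level definitions (IoU = TP / (TP + FP + FN), F1 = 2TP / (2TP + FP + FN)); the abstract does not state the exact formulation used, and the function name `pixel_metrics` and the empty-mask convention are illustrative assumptions. Pd is omitted because it is a target-level measure whose matching rule is not given here.

```python
def pixel_metrics(pred, target):
    """Return (iou, f1) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration only; not the authors' evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # predicted target, truly target
            elif p and not t:
                fp += 1          # predicted target, truly background
            elif t and not p:
                fn += 1          # missed target pixel

    union = tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0

    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


if __name__ == "__main__":
    pred = [[0, 1, 1],
            [0, 1, 0],
            [0, 0, 0]]
    target = [[0, 1, 0],
              [0, 1, 1],
              [0, 0, 0]]
    iou, f1 = pixel_metrics(pred, target)
    print(f"IoU = {iou:.4f}, F1 = {f1:.4f}")
```

Under these definitions the two metrics are linked by F1 = 2·IoU / (1 + IoU) for a single mask pair, so the toy example with IoU = 0.5 gives F1 ≈ 0.667. Scores averaged over a dataset need not satisfy this relation exactly.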