Abstract:
Detecting small targets in infrared images is challenging owing to complex backgrounds, low signal-to-noise ratios, small target sizes, and low target brightness. To address these challenges, a lightweight infrared small target detection algorithm, Infrared Small Target Detection-You Only Look Once (ISTD-YOLO), is proposed based on YOLOv7s. First, the YOLOv7s network structure is reconstructed in a lightweight manner by adjusting the feature extraction and feature fusion networks, and a three-scale lightweight architecture is designed to improve small target detection performance. Next, the ELAN-W module in the neck network is replaced with the VoV-GSCSP module to reduce computational cost and network complexity while improving inference speed. In addition, a non-parametric attention mechanism is incorporated into the neck network to strengthen local contextual correlations and enable more accurate target localization. Finally, the Normalized Gaussian Wasserstein Distance (NWD) is employed to improve on the commonly used IoU metric when measuring the overlap between predicted boxes and ground-truth boxes, thereby enhancing the accuracy of small target localization and detection. Experimental results demonstrate that ISTD-YOLO effectively improves detection performance. Compared with the baseline model, detection accuracy on the HIT-UAV and IDSAT datasets increased by 8.52% and 4.77%, respectively. The model size is only 21.8 MB, with a 69.8% reduction in parameter count and a 17.6% decrease in computational complexity. Furthermore, compared with current mainstream algorithms, ISTD-YOLO improves multiple performance indicators and achieves high-quality detection of small infrared targets.
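
To illustrate why a Wasserstein-based measure can help with very small boxes, the following minimal sketch compares IoU with NWD for two boxes given as (cx, cy, w, h). It follows the commonly published NWD formulation, in which each box is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); it is not necessarily the exact loss used in ISTD-YOLO. The function names `nwd` and `iou_cxcywh`, the example boxes, and the normalizing constant `c=12.8` are assumptions chosen for illustration.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein Distance between two (cx, cy, w, h) boxes.

    c is a dataset-dependent normalizing constant; 12.8 is an assumed value.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-math.sqrt(w2_sq) / c)


def iou_cxcywh(box_a, box_b):
    """Plain IoU for two (cx, cy, w, h) boxes, for comparison."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    inter_w = max(0.0, min(cxa + wa / 2, cxb + wb / 2) - max(cxa - wa / 2, cxb - wb / 2))
    inter_h = max(0.0, min(cya + ha / 2, cyb + hb / 2) - max(cya - ha / 2, cyb - hb / 2))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    return inter / union if union > 0 else 0.0


# Two 4x4 boxes whose centers are 6 pixels apart: they do not overlap at all.
pred = (10.0, 10.0, 4.0, 4.0)
gt = (16.0, 10.0, 4.0, 4.0)
iou_value = iou_cxcywh(pred, gt)
nwd_value = nwd(pred, gt)
print(f"IoU = {iou_value:.3f}, NWD = {nwd_value:.3f}")
```

In this example the boxes do not overlap, so IoU is zero and carries no information about how far apart they are, whereas NWD still decreases smoothly as the distance between the boxes grows.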