Abstract:
Owing to the scarcity of pixel values and limited color features in infrared street images, issues such as missed detections, false detections, and poor detection performance are common. To address these problems, a spatially adaptive and content-aware infrared small object detection algorithm is proposed. The key components of this algorithm are as follows. 1) Spatially adaptive transformer: This transformer is designed by stacking local attention and deformable attention mechanisms to enhance the modeling capability of long-range dependency features and capture more spatial positional information. 2) Content-aware reassembly of features (CARAFE) operator: This operator is used for feature upsampling, aggregating contextual information within a large receptive field, and adaptively recombining features using shallow-level information. 3) High-resolution prediction head: A high-resolution prediction head of size 160x160 is added to map the pixels of input features to finer detection regions, further improving the detection performance of small objects. Experimental results on the FLIR dataset demonstrate that the proposed algorithm achieves an average precision mean of 85.6%, representing a 3.9% improvement over the YOLOX-s algorithm. These results validate the superiority of the proposed algorithm in detecting small objects in infrared images.