Abstract:
Object detection is a fundamental task in computer vision, and drones equipped with infrared cameras facilitate nighttime reconnaissance and surveillance. To address the challenges of small target sizes, scarce texture information, and weak contrast in infrared aerial photography scenes, as well as the limited accuracy of traditional algorithms and the heavy dependence of infrared object detection on computing power and power consumption, a pedestrian detection method for infrared aerial photography scenes that integrates saliency images is proposed. First, we use U2-Net to generate saliency maps from the original thermal infrared images for image enhancement. Second, we analyze the impact of two fusion methods, pixel-level weighted fusion and image-channel replacement, as image-enhancement schemes. Finally, to improve the adaptability of the algorithm to the target scene, the prior boxes are reclustered. The experimental results show that pixel-level weighted fusion yields superior results. This method improves the average accuracy of the typical YOLOv3, YOLOv3-tiny, and YOLOv4-tiny algorithms by 6.5%, 7.6%, and 6.2%, respectively, demonstrating the effectiveness of the proposed visual-saliency fusion method.
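As an illustration of the pixel-level weighted fusion step described above, the minimal sketch below blends a thermal infrared frame with its U2-Net saliency map using OpenCV. The file names and the blending weight `alpha` are assumptions for illustration only, not values reported in this work.

```python
import cv2
import numpy as np

# Hypothetical inputs: a thermal frame and the U2-Net saliency map
# predicted for it (same spatial size, single channel).
thermal = cv2.imread("thermal.png", cv2.IMREAD_GRAYSCALE)
saliency = cv2.imread("saliency.png", cv2.IMREAD_GRAYSCALE)

# Assumed weight for the original thermal image; (1 - alpha) goes to
# the saliency map. The paper's actual weights may differ.
alpha = 0.7
fused = cv2.addWeighted(thermal.astype(np.float32), alpha,
                        saliency.astype(np.float32), 1.0 - alpha, 0.0)

# Clip back to the 8-bit range before saving the enhanced image.
fused = np.clip(fused, 0, 255).astype(np.uint8)
cv2.imwrite("fused.png", fused)
```

The fused image can then be fed to the detector in place of the raw thermal frame; the alternative scheme mentioned above would instead insert the saliency map as one channel of a multi-channel input.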