Abstract:
Although infrared cameras, unlike visible-light cameras, can operate day and night in all weather conditions, the images they produce have low resolution and a low signal-to-clutter ratio and lack texture information. Sufficient labeled images and careful model design and optimization are therefore important for improving deep-learning-based infrared target detection. First, to address the lack of an infrared target detection dataset for surveillance applications, an infrared camera was used to capture images with multiple polarities, and the images were annotated with our own annotation software, which outputs the VOC format. The resulting infrared image dataset, named infrared-PV, contains two types of targets, person and vehicle, and the characteristics of its targets were statistically analyzed. Second, state-of-the-art deep-learning-based target detection models were trained and tested on this dataset, and the detection performance of the YOLO and Faster R-CNN series models was analyzed qualitatively and quantitatively. The dataset contains 2138 images, and its targets appear in three modes: white hot, black hot, and heat map. In the benchmark test using several models, Cascade R-CNN achieved the best performance, with a mean average precision at an intersection over union threshold of 0.5 (mAP0.5) of 82.3%. The YOLOv5 model offered a tradeoff between real-time performance and detection performance, with an inference speed of 175.4 frames per second and a drop in mAP0.5 of only 2.7%. The constructed infrared target detection dataset can provide data support for research on optimizing infrared image target detection models and can also be used to analyze infrared target characteristics.
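
To make the 0.5 intersection-over-union criterion behind the reported mAP concrete, the following is a minimal sketch rather than the authors' evaluation code. It assumes boxes are given as VOC-style corner coordinates (xmin, ymin, xmax, ymax); the function names `iou` and `is_true_positive` are hypothetical and chosen only for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Assumed matching rule: a detection counts when its IoU exceeds the threshold."""
    return iou(pred_box, gt_box) > threshold
```

Under this assumed rule, a predicted box that overlaps a same-sized ground-truth box by half its width has an IoU of 1/3 and would not count as a correct detection at the 0.5 threshold.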