Infrared-PV: An Infrared Target Detection Dataset for Surveillance Applications

    Abstract: Infrared cameras can operate around the clock, but compared with visible-light cameras they produce images with lower resolution, a lower signal-to-clutter ratio, and little target texture information. Sufficient labeled images and careful model optimization are therefore important for improving deep-learning-based infrared target detection. To address the lack of infrared target detection datasets for surveillance applications, an infrared camera was first used to capture images of different polarities, and the images were annotated in VOC format using self-developed annotation software. This produced an infrared image dataset containing two target classes, person and vehicle, named Infrared-PV, and the characteristics of the targets in the dataset were statistically analyzed. Mainstream deep-learning-based target detection models were then trained and tested, and the detection performance of the YOLO and Faster R-CNN series models on this dataset was analyzed qualitatively and quantitatively. The dataset contains 2138 images, and its targets appear in three modes: white hot, black hot, and heat map. In the benchmark tests, Cascade R-CNN achieved the best performance, with a mean average precision at an intersection-over-union threshold of 0.5 (mAP0.5) of 82.3%. The YOLOv5 series models balanced real-time performance and detection accuracy, reaching an inference speed of 175.4 frames per second with mAP0.5 only 2.7% lower. The dataset can provide data support for research on optimizing deep-learning-based infrared target detection models and can also be used to analyze the infrared characteristics of targets.
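The abstract mentions VOC-format annotations and the mAP0.5 metric, which counts a detection as correct when its intersection over union (IoU) with a ground-truth box passes 0.5. The following is a minimal sketch of both ideas, not the authors' tooling: the XML content, file name, coordinates, and the helper names `parse_voc` and `iou` are all hypothetical, and treating the 0.5 boundary as inclusive is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOC-style annotation. The file name, image size, and box
# coordinates are invented for illustration; only the two class names
# (person, vehicle) come from the abstract.
VOC_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>512</height><depth>1</depth></size>
  <object>
    <name>person</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>140</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>vehicle</name>
    <bndbox><xmin>300</xmin><ymin>200</ymin><xmax>420</xmax><ymax>280</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.find("name").text, coords))
    return boxes


def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = parse_voc(VOC_XML)

# Hypothetical predicted box for the person target.
predicted_person = (105, 125, 140, 220)
overlap = iou(predicted_person, ground_truth[0][1])

# Assumption: a detection is a match when IoU is at least 0.5.
is_match = overlap >= 0.5
```

Computing the full mAP0.5 additionally requires ranking detections by confidence and averaging precision over recall levels and classes, which this sketch leaves out.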

     
