Infrared Multi-Scale Target Detection Algorithm Based on RCR-YOLO
-
Abstract: Infrared target detection is widely used in both military and civilian applications. To address missed and false detections of multi-scale infrared targets against complex backgrounds, this paper proposes RCR-YOLO, an improved YOLOv5s algorithm. First, the CSPDarkNet53 backbone of the original YOLOv5s is replaced with ResNet50, which mitigates the vanishing gradients that arise in deep networks and strengthens feature extraction. Second, a coordinate attention (CA) module is added at the end of the backbone to capture feature information from different positions. Finally, a Res2Net module is added to the neck network; its multi-branch structure and progressively increasing resolution improve the network's representational ability and its handling of multi-scale features, thereby enhancing detection performance. Experimental results show that the method achieves higher precision, recall, and mAP50 than mainstream detectors such as Faster R-CNN, SSD, and YOLOv3. Compared with YOLOv5s, it raises mAP50-95 by 1.1 percentage points (from 78.2% to 79.3%) while keeping mAP50 at 99.5%, and it can effectively detect multi-scale infrared targets against complex backgrounds.
-
Keywords: infrared target detection; YOLOv5; deep learning; multi-scale
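The multi-branch idea behind the Res2Net module mentioned in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes the standard Res2Net layout (channels split into `scale` groups, each later group passed through a 3×3 stride-1 convolution after adding the previous branch's output) and only tracks how far each branch can "see". The function and parameter names are chosen here for illustration.

```python
def res2net_branch_receptive_fields(scale=4, kernel_size=3):
    """Receptive field (pixels per side) of each branch in a Res2Net-style block.

    Assumed layout: input channels are split into `scale` groups x_1..x_s.
      y_1 = x_1                    (identity, no convolution)
      y_2 = K_2(x_2)
      y_i = K_i(x_i + y_{i-1})     for i = 3..s
    Each K_i is a kernel_size x kernel_size convolution with stride 1.
    """
    fields = [1]  # y_1 is passed through unchanged
    for i in range(2, scale + 1):
        # Widest input path: x_2 alone for i == 2, otherwise y_{i-1}.
        widest_input = 1 if i == 2 else fields[-1]
        # A stride-1 convolution widens the field by (kernel_size - 1).
        fields.append(widest_input + (kernel_size - 1))
    return fields


branch_fields = res2net_branch_receptive_fields()
print(branch_fields)
```

Under these assumptions each successive branch covers a wider neighbourhood, so concatenating the branch outputs mixes several scales inside a single block, which is one plausible reading of why the module helps with multi-scale targets.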
-
-
Table 1 Experimental training parameters

| Parameter | Value |
| --- | --- |
| Epochs | 100 |
| Batch-size | 16 |
| Optimizer | SGD |
| Learning rate | 0.01 |
| Warmup_epochs | 3 |
| Weight_decay | 0.0005 |
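The training parameters in Table 1 can be written out as a configuration, as a rough sketch for anyone setting up a comparable run. The source does not say which codebase was used; the flag names (`--epochs`, `--batch-size`, `--optimizer`, `--data`) and hyperparameter keys (`lr0`, `warmup_epochs`, `weight_decay`) below assume the Ultralytics YOLOv5 repository's `train.py`, and `infrared.yaml` is a hypothetical dataset file.

```python
# Values from Table 1; key names assume the Ultralytics YOLOv5 repository.
hyperparameters = {
    "lr0": 0.01,             # initial learning rate
    "warmup_epochs": 3,      # warm-up epochs
    "weight_decay": 0.0005,  # optimizer weight decay
}

train_args = {
    "--epochs": 100,
    "--batch-size": 16,
    "--optimizer": "SGD",
    "--data": "infrared.yaml",  # hypothetical dataset config, not from the source
}


def build_command(args):
    """Flatten the argument mapping into a command-line token list."""
    command = ["python", "train.py"]
    for flag, value in args.items():
        command += [flag, str(value)]
    return command


command = build_command(train_args)
print(" ".join(command))
```

In that repository the hyperparameter values would normally live in a YAML file passed with `--hyp`; they are kept as a plain dictionary here so the sketch stays self-contained.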
Table 2 Ablation results

| Model | Algorithm | AP50 Aeroplane (%) | AP50 Interference (%) | AP50-95 Aeroplane (%) | AP50-95 Interference (%) | P (%) | R (%) | mAP50 (%) | mAP50-95 (%) | FPS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | YOLOv5s | 99.5 | 99.5 | 69.1 | 87.2 | 99.4 | 99.7 | 99.5 | 78.2 | 81.3 |
| B | YOLOv5s+ResNet50 | 99.4 | 99.5 | 69.3 | 88.2 | 99.7 | 99.6 | 99.5 | 78.8 | 27 |
| C | YOLOv5s+ResNet50+CA | 99.5 | 99.5 | 69.4 | 88.4 | 99.5 | 99.8 | 99.5 | 78.9 | 28.2 |
| D | YOLOv5s+ResNet50+CA+Res2Net (RCR-YOLO) | 99.5 | 99.5 | 69.8 | 88.8 | 99.6 | 99.6 | 99.5 | 79.3 | 28.2 |
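As a quick consistency check on the ablation results, the reported mAP50-95 values can be recomputed from the per-class AP50-95 columns. The sketch below assumes mAP is the unweighted mean over the two classes (Aeroplane and Interference), which is the usual definition but is not stated explicitly in the source; the numbers are copied from Table 2.

```python
# Per-class AP50-95 values (%) from Table 2: (Aeroplane, Interference)
ap50_95 = {
    "A": (69.1, 87.2),  # YOLOv5s
    "B": (69.3, 88.2),  # + ResNet50
    "C": (69.4, 88.4),  # + ResNet50 + CA
    "D": (69.8, 88.8),  # + ResNet50 + CA + Res2Net (RCR-YOLO)
}
reported_map = {"A": 78.2, "B": 78.8, "C": 78.9, "D": 79.3}


def mean_ap(per_class):
    """Unweighted mean of the per-class AP values (assumed mAP definition)."""
    return sum(per_class) / len(per_class)


for model, per_class in ap50_95.items():
    computed = mean_ap(per_class)
    # Table values are rounded to one decimal, so allow a 0.1-point tolerance.
    assert abs(computed - reported_map[model]) <= 0.1 + 1e-9, model
    print(model, round(computed, 2), reported_map[model])

gain = reported_map["D"] - reported_map["A"]
print(f"mAP50-95 gain of D over A: {gain:.1f} percentage points")
```

All four rows agree with the reported values to within rounding, and the difference between models D and A matches the 1.1-point improvement quoted in the abstract.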
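Table 3 Comparative experimental results

| Algorithm | AP50 Aeroplane (%) | AP50 Interference (%) | P (%) | R (%) | mAP50 (%) | FPS |
| --- | --- | --- | --- | --- | --- | --- |
| Faster R-CNN | 85.5 | 97.9 | 73.3 | 93.1 | 91.7 | 6.3 |
| SSD | 97.7 | 97.9 | 98.5 | 85.9 | 97.8 | 56.4 |
| YOLOv3 | 98.7 | 97.5 | 97.1 | 92.8 | 98.1 | 18.4 |
| RCR-YOLO | 99.5 | 99.5 | 99.6 | 99.6 | 99.5 | 28.2 |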