Infrared Multi-Scale Target Detection Algorithm Based on RCR-YOLO

CHEN Xiaohan; XU Yuanyuan

Department of Logistics Engineering, Shanghai Maritime University, Shanghai 201306, China

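Author biography:
CHEN Xiaohan (b. 2000), male, from Hefei, Anhui; master's student. Research interests: target detection and infrared image processing. E-mail: 2416724731@qq.com

Corresponding author:
XU Yuanyuan (b. 1980), female, from Laiwu, Shandong; associate professor, Ph.D. Research interests: multi-scale modeling and optimization of complex systems, deep learning and its applications. E-mail: yyxu@shmtu.edu.cn

CLC number: TN219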
Publication history
- Received: 2024-02-28
- Revised: 2024-03-31
- Published: 2025-04-19

Infrared Multi-Scale Target Detection Algorithm Based on RCR-YOLO

CHEN Xiaohan, XU Yuanyuan (corresponding author)

Department of Logistics Engineering, Shanghai Maritime University, Shanghai 201306, China

Abstract:
Infrared target detection is widely used in both military and civilian fields. To address missed and false detections in infrared multi-scale target detection under complex backgrounds, this paper proposes RCR-YOLO, an improved YOLOv5s algorithm. First, the CSPDarkNet53 backbone of the original YOLOv5s is replaced with ResNet50, which avoids the vanishing gradients that arise in deep networks and strengthens feature extraction. Second, a coordinate attention (CA) module is added at the end of the backbone to capture feature information from different locations. Finally, a Res2Net module is added to the neck network; its multi-branch structure and progressively increasing resolution improve the network's representational ability and its handling of multi-scale feature information, thereby enhancing detection performance. Experimental results show that the method outperforms mainstream target detection algorithms such as Faster R-CNN, SSD, and YOLOv3. Compared with YOLOv5s, it raises mAP50-95 by 1.1 percentage points (from 78.2% to 79.3%) while keeping mAP50 at 99.5%, so it can effectively perform multi-scale infrared target detection under complex backgrounds.
Key words: infrared target detection; YOLOv5; deep learning; multi-scale
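
The multi-branch structure that the abstract attributes to the Res2Net module can be illustrated with a minimal Python sketch. It follows the hierarchical split described in the Res2Net paper (the first group passes through unchanged, and each later group is transformed after adding the previous branch's output). The function name `res2net_split` and the stand-in transforms are hypothetical: a real module would use 3×3 convolutions and surrounding 1×1 convolutions, and how the authors wire the module into the YOLOv5s neck is not shown here.

```python
from typing import Callable, List, Sequence

Feature = List[float]
Transform = Callable[[Feature], Feature]


def res2net_split(groups: Sequence[Feature],
                  transforms: Sequence[Transform]) -> List[Feature]:
    """Hierarchical split: y1 = x1, y2 = K2(x2), yi = Ki(xi + y(i-1)) for i > 2."""
    if len(transforms) != len(groups) - 1:
        raise ValueError("need one transform for every group except the first")
    outputs: List[Feature] = [list(groups[0])]  # first group is passed through
    for i, (x, k) in enumerate(zip(groups[1:], transforms)):
        if i == 0:
            branch_input = list(x)  # second group has no earlier transformed branch
        else:
            # later groups also receive the previous branch's output
            branch_input = [a + b for a, b in zip(x, outputs[-1])]
        outputs.append(k(branch_input))
    return outputs


def double(v: Feature) -> Feature:
    """Placeholder for a 3x3 convolution (assumption, for illustration only)."""
    return [2.0 * e for e in v]


# Four channel groups, each reduced to a single value for readability.
SPLITS = res2net_split([[1.0], [2.0], [3.0], [4.0]], [double, double, double])
```

Because each branch reuses the output of the one before it, later branches pass through more transforms, which is how one block mixes features at several scales.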

Figure 1. YOLOv5s network structure

Figure 2. Conv Block and Identity Block structure

Figure 3. ResNet50 network structure

Figure 4. Location distribution of sample labels

Figure 5. CA encoding

Figure 6. Res2Net module

Figure 7. Sample images from the dataset used in this paper

Figure 8. YOLOv5s loss changes

Figure 9. RCR-YOLO loss changes

Figure 10. Comparison of detection results between Faster R-CNN (top) and RCR-YOLO (bottom)

Figure 11. Comparison of detection results between SSD (top) and RCR-YOLO (bottom)

Figure 12. Comparison of detection results between YOLOv3 (top) and RCR-YOLO (bottom)

Figure 13. Comparison of detection results between YOLOv5s (top) and RCR-YOLO (bottom)

Table 1 Experimental training parameters

Parameter	Value
Epochs	100
Batch size	16
Optimizer	SGD
Learning rate	0.01
Warmup epochs	3
Weight decay	0.0005
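
The settings in Table 1 can be collected into a small configuration, shown here as a hedged Python sketch. The dictionary keys and the `warmup_lr` helper are illustrative names, not the authors' code. The linear ramp from zero is an assumption: the table gives only the warmup length (3 epochs) and the learning rate (0.01), not the shape of the schedule or any later decay.

```python
# Values copied from Table 1; the key names are illustrative.
TRAIN_PARAMS = {
    "epochs": 100,
    "batch_size": 16,
    "optimizer": "SGD",
    "learning_rate": 0.01,
    "warmup_epochs": 3,
    "weight_decay": 0.0005,
}


def warmup_lr(epoch: float, base_lr: float, warmup_epochs: int) -> float:
    """Learning rate at a (possibly fractional) epoch under an assumed linear warmup.

    The rate rises linearly from 0 to base_lr over the warmup period and is
    held at base_lr afterwards; no decay is modelled.
    """
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return base_lr
    return base_lr * epoch / warmup_epochs


# Learning rate at the start of each of the first five epochs.
FIRST_EPOCH_LRS = [
    warmup_lr(e, TRAIN_PARAMS["learning_rate"], TRAIN_PARAMS["warmup_epochs"])
    for e in range(5)
]
```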


Table 2 Ablation results

Model	Algorithm	AP50 Aeroplane/%	AP50 Interference/%	AP50-95 Aeroplane/%	AP50-95 Interference/%	P/%	R/%	mAP50/%	mAP50-95/%	FPS
A	YOLOv5s	99.5	99.5	69.1	87.2	99.4	99.7	99.5	78.2	81.3
B	YOLOv5s+ResNet50	99.4	99.5	69.3	88.2	99.7	99.6	99.5	78.8	27
C	YOLOv5s+ResNet50+CA	99.5	99.5	69.4	88.4	99.5	99.8	99.5	78.9	28.2
D	YOLOv5s+ResNet50+CA+Res2Net (RCR-YOLO)	99.5	99.5	69.8	88.8	99.6	99.6	99.5	79.3	28.2
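
The arithmetic behind Table 2 can be checked with a short Python sketch. It assumes, as is conventional, that mAP50-95 is the unweighted mean of the per-class AP50-95 values; the variable and function names are illustrative. It also reports the gain contributed by each ablation step and the speed retained relative to the YOLOv5s baseline.

```python
from typing import Dict, List, Sequence, Tuple

# Per-class AP50-95 (%) from Table 2: (Aeroplane, Interference).
AP_50_95: Dict[str, Tuple[float, float]] = {
    "A": (69.1, 87.2),
    "B": (69.3, 88.2),
    "C": (69.4, 88.4),
    "D": (69.8, 88.8),
}
# Reported mAP50-95 (%) and FPS from Table 2.
MAP_50_95: Dict[str, float] = {"A": 78.2, "B": 78.8, "C": 78.9, "D": 79.3}
FPS: Dict[str, float] = {"A": 81.3, "B": 27.0, "C": 28.2, "D": 28.2}
ORDER: List[str] = ["A", "B", "C", "D"]


def mean_ap(per_class: Sequence[float]) -> float:
    """Unweighted mean of per-class AP values."""
    return sum(per_class) / len(per_class)


def step_gains(reported: Dict[str, float], order: Sequence[str]) -> List[float]:
    """Change in the reported metric between consecutive models, in points."""
    return [round(reported[b] - reported[a], 1) for a, b in zip(order, order[1:])]


GAINS = step_gains(MAP_50_95, ORDER)                      # A->B, B->C, C->D
TOTAL_GAIN = round(MAP_50_95["D"] - MAP_50_95["A"], 1)    # RCR-YOLO vs. YOLOv5s
FPS_RATIO = FPS["D"] / FPS["A"]                           # fraction of baseline speed kept
```

Under these assumptions the three steps add 0.6, 0.1, and 0.4 points, for the 1.1-point total quoted in the abstract, while RCR-YOLO runs at roughly 35% of the baseline frame rate (28.2 versus 81.3 FPS).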


Table 3 Comparative experimental results

Algorithm	AP50 Aeroplane/%	AP50 Interference/%	P/%	R/%	mAP50/%	FPS
Faster R-CNN	85.5	97.9	73.3	93.1	91.7	6.3
SSD	97.7	97.9	98.5	85.9	97.8	56.4
YOLOv3	98.7	97.5	97.1	92.8	98.1	18.4
RCR-YOLO	99.5	99.5	99.6	99.6	99.5	28.2
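
One way to read Table 3 is as a trade-off between accuracy and speed. The sketch below, with illustrative names, computes which algorithms are not dominated on the pair (mAP50, FPS), that is, for which no other algorithm is at least as good on both and strictly better on one. This Pareto view is an added reading aid, not an analysis taken from the paper.

```python
from typing import Dict, List, Tuple

# (mAP50 in %, FPS) copied from Table 3.
RESULTS: Dict[str, Tuple[float, float]] = {
    "Faster R-CNN": (91.7, 6.3),
    "SSD": (97.8, 56.4),
    "YOLOv3": (98.1, 18.4),
    "RCR-YOLO": (99.5, 28.2),
}


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(results: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of algorithms that no other algorithm dominates."""
    return [
        name
        for name, score in results.items()
        if not any(
            dominates(other, score)
            for other_name, other in results.items()
            if other_name != name
        )
    ]


FRONT = pareto_front(RESULTS)
```

On these numbers RCR-YOLO has the highest mAP50 and is faster than Faster R-CNN and YOLOv3, while SSD remains faster at lower accuracy, so the claim that the method outperforms the other algorithms refers to detection accuracy.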


