Infrared Multi-Scale Target Detection Algorithm Based on RCR-YOLO

CHEN Xiaohan, XU Yuanyuan

Citation: CHEN Xiaohan, XU Yuanyuan. Infrared Multi-Scale Target Detection Algorithm Based on RCR-YOLO[J]. Infrared Technology, 2025, 47(4): 459-467.

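Details
    About the authors:

    CHEN Xiaohan (born 2000), male, from Hefei, Anhui; master's student. Research interests: target detection, infrared image processing. E-mail: 2416724731@qq.com

    Corresponding author:

    XU Yuanyuan (born 1980), female, from Laiwu, Shandong; associate professor, Ph.D. Research interests: multi-scale modeling and optimization of complex systems, deep learning and its applications. E-mail: yyxu@shmtu.edu.cn

  • CLC number: TN219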

  • Abstract:

    Infrared target detection is widely used in both military and civilian applications. To address missed and false detections in multi-scale infrared target detection against complex backgrounds, this paper proposes RCR-YOLO, an improved YOLOv5s algorithm. First, the CSPDarkNet53 backbone of the original YOLOv5s is replaced with ResNet50, which mitigates the vanishing-gradient problem in deep networks and strengthens feature extraction. Second, a coordinate attention (CA) module is added at the end of the backbone to capture feature information from different positions. Finally, a Res2Net module is added to the neck network; its multi-branch structure and progressively increasing resolution improve the network's representational ability and its handling of multi-scale features, which enhances detection performance. Experimental results show that the method outperforms the mainstream detectors Faster R-CNN, SSD, and YOLOv3. Compared with YOLOv5s, it raises mAP50-95 by 1.1 percentage points (from 78.2% to 79.3%) while keeping mAP50 at 99.5%, so it can effectively perform multi-scale infrared target detection against complex backgrounds.
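The abstract attributes the neck improvement to the multi-branch structure of the Res2Net module. As a rough illustration only, the sketch below follows the hierarchical split described in the Res2Net paper: the first channel group passes through unchanged, and each later group is filtered after the previous group's output has been added to it. The 1-D signals, the 3-tap mean filter `box_filter3` (a stand-in for a learned 3x3 convolution), and all function names are assumptions made for illustration. The 1x1 convolutions, the shortcut connection, and the learned weights of a real block are omitted, and this is not the authors' implementation.

```python
from typing import Callable, List

Signal = List[float]


def box_filter3(x: Signal) -> Signal:
    """Hypothetical stand-in for a 3x3 convolution: 3-tap mean with zero padding."""
    padded = [0.0] + x + [0.0]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0 for i in range(len(x))]


def res2net_style_block(
    groups: List[Signal],
    op: Callable[[Signal], Signal] = box_filter3,
) -> List[Signal]:
    """Apply the hierarchical multi-branch pattern to a list of channel groups.

    y_1 = x_1
    y_2 = op(x_2)
    y_i = op(x_i + y_{i-1})  for i > 2
    """
    outputs: List[Signal] = []
    for i, x in enumerate(groups):
        if i == 0:
            y = list(x)                      # first group: identity
        elif i == 1:
            y = op(x)                        # second group: filtered once
        else:
            prev = outputs[-1]               # later groups reuse the previous output
            y = op([a + b for a, b in zip(x, prev)])
        outputs.append(y)
    return outputs


def support_width(y: Signal) -> int:
    """Number of samples with a nonzero response."""
    return sum(1 for v in y if abs(v) > 1e-12)


# Same unit impulse in every group, so only the branch structure differs.
impulse = [0.0] * 4 + [1.0] + [0.0] * 4
outs = res2net_style_block([list(impulse) for _ in range(4)])
widths = [support_width(y) for y in outs]
print(widths)
```

With this toy input, the nonzero width of the four branch outputs grows as 1, 3, 5, 7 samples, which is the sense in which later branches cover a larger neighbourhood and the concatenated output mixes several scales.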

  • Figure 1. YOLOv5s network structure

    Figure 2. Conv Block and Identity Block structure

    Figure 3. ResNet50 network structure

    Figure 4. Distribution of sample label locations

    Figure 5. CA encoding

    Figure 6. Res2Net module

    Figure 7. Sample images from the dataset used in this paper

    Figure 8. YOLOv5s loss curves

    Figure 9. RCR-YOLO loss curves

    Figure 10. Comparison of detection results: Faster R-CNN (top) and RCR-YOLO (bottom)

    Figure 11. Comparison of detection results: SSD (top) and RCR-YOLO (bottom)

    Figure 12. Comparison of detection results: YOLOv3 (top) and RCR-YOLO (bottom)

    Figure 13. Comparison of detection results: YOLOv5s (top) and RCR-YOLO (bottom)
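Since the figure for the CA encoding step is not reproduced here, the following minimal sketch may help convey the idea. It assumes the standard coordinate attention formulation: a feature map is average-pooled along the width to get one value per row and along the height to get one value per column, and the two resulting gates reweight each position. For brevity it handles a single channel and applies a sigmoid directly to the pooled values; the shared 1x1 convolution, normalization, and per-direction convolutions of the real module are left out. All names are hypothetical, and this is not the authors' code.

```python
import math
from typing import List, Tuple

FeatureMap = List[List[float]]  # one channel, indexed as [row][column]


def directional_pool(x: FeatureMap) -> Tuple[List[float], List[float]]:
    """Average-pool along each spatial direction separately."""
    height, width = len(x), len(x[0])
    row_means = [sum(row) / width for row in x]                       # one per row
    col_means = [sum(x[i][j] for i in range(height)) / height
                 for j in range(width)]                               # one per column
    return row_means, col_means


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_reweight(x: FeatureMap) -> FeatureMap:
    """Scale each position by a row gate and a column gate.

    Simplification: gates come straight from the pooled values,
    with no learned transform in between.
    """
    row_means, col_means = directional_pool(x)
    gate_h = [sigmoid(v) for v in row_means]
    gate_w = [sigmoid(v) for v in col_means]
    return [[x[i][j] * gate_h[i] * gate_w[j] for j in range(len(x[0]))]
            for i in range(len(x))]


feature = [
    [0.0, 0.0, 0.0],
    [0.0, 4.0, 2.0],
    [0.0, 0.0, 0.0],
]
rows, cols = directional_pool(feature)
out = coordinate_reweight(feature)
print(rows, cols)
```

The row and column summaries keep positional information along one axis each, which is what distinguishes this scheme from a single global average over the whole map.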

    Table 1. Experimental training parameters

    Parameter        Value
    Epochs           100
    Batch size       16
    Optimizer        SGD
    Learning rate    0.01
    Warmup epochs    3
    Weight decay     0.0005
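To make the training settings in Table 1 concrete, here is a small sketch that stores them in a dictionary and shows how a warmup phase and weight decay could enter an SGD update. The table does not specify the warmup shape or any later learning-rate decay, so the linear ramp from zero, the constant rate afterwards, the plain (momentum-free) update, and every name in the code are assumptions for illustration, not the training code used in the paper.

```python
# Values copied from Table 1; the key names are hypothetical.
TRAIN_PARAMS = {
    "epochs": 100,
    "batch_size": 16,
    "optimizer": "SGD",
    "learning_rate": 0.01,
    "warmup_epochs": 3,
    "weight_decay": 0.0005,
}


def warmup_lr(epoch: float, base_lr: float, warmup_epochs: int) -> float:
    """Assumed schedule: linear ramp from 0 to base_lr, then constant."""
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return base_lr
    return base_lr * epoch / warmup_epochs


def sgd_step(weight: float, grad: float, lr: float, weight_decay: float) -> float:
    """One plain SGD update with L2 weight decay folded into the gradient."""
    return weight - lr * (grad + weight_decay * weight)


base_lr = TRAIN_PARAMS["learning_rate"]
schedule = [warmup_lr(e, base_lr, TRAIN_PARAMS["warmup_epochs"]) for e in range(5)]
updated = sgd_step(1.0, 0.5, base_lr, TRAIN_PARAMS["weight_decay"])
print(schedule, updated)
```

Under these assumptions the rate reaches 0.01 at epoch 3 and stays there; a real training run would typically also use momentum and a decay schedule, neither of which is listed in the table.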

    Table 2. Ablation results

    Model  Algorithm                                AP50/%                   AP50-95/%                P/%    R/%    mAP50/%  mAP50-95/%  FPS
                                                    Aeroplane  Interference  Aeroplane  Interference
    A      YOLOv5s                                  99.5       99.5          69.1       87.2          99.4   99.7   99.5     78.2        81.3
    B      YOLOv5s+ResNet50                         99.4       99.5          69.3       88.2          99.7   99.6   99.5     78.8        27
    C      YOLOv5s+ResNet50+CA                      99.5       99.5          69.4       88.4          99.5   99.8   99.5     78.9        28.2
    D      YOLOv5s+ResNet50+CA+Res2Net (RCR-YOLO)   99.5       99.5          69.8       88.8          99.6   99.6   99.5     79.3        28.2
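The mAP50-95 column in Table 2 can be cross-checked against the per-class AP50-95 values, assuming mAP is the unweighted mean over the two classes (Aeroplane and Interference). The sketch below recomputes it with exact decimal arithmetic; rounding half up to one decimal place is an assumption about how the table was rounded, and the variable names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, Sequence

# Per-class AP50-95 (Aeroplane, Interference) and reported mAP50-95 from Table 2.
ABLATION: Dict[str, Dict[str, object]] = {
    "A": {"ap": ("69.1", "87.2"), "reported": "78.2"},
    "B": {"ap": ("69.3", "88.2"), "reported": "78.8"},
    "C": {"ap": ("69.4", "88.4"), "reported": "78.9"},
    "D": {"ap": ("69.8", "88.8"), "reported": "79.3"},
}


def mean_ap(per_class: Sequence[str]) -> Decimal:
    """Unweighted class mean, rounded half up to one decimal (assumed convention)."""
    total = sum(Decimal(v) for v in per_class)
    return (total / len(per_class)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


recomputed = {model: mean_ap(row["ap"]) for model, row in ABLATION.items()}
matches = {model: recomputed[model] == Decimal(row["reported"])
           for model, row in ABLATION.items()}

# Gain of the full model (D) over the YOLOv5s baseline (A), in percentage points.
gain = Decimal(ABLATION["D"]["reported"]) - Decimal(ABLATION["A"]["reported"])
print(recomputed, matches, gain)
```

Under that rounding assumption all four rows agree with the reported column, and the gain of model D over model A is 1.1 percentage points, consistent with the figure quoted in the abstract.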

    Table 3. Comparative experimental results

    Algorithm     AP50/%                   P/%    R/%    mAP50/%  FPS
                  Aeroplane  Interference
    Faster R-CNN  85.5       97.9          73.3   93.1   91.7     6.3
    SSD           97.7       97.9          98.5   85.9   97.8     56.4
    YOLOv3        98.7       97.5          97.1   92.8   98.1     18.4
    RCR-YOLO      99.5       99.5          99.6   99.6   99.5     28.2
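The FPS column is easier to compare across detectors when converted to per-frame latency. The short sketch below does that conversion for the Table 3 entries and for the YOLOv5s baseline from Table 2. It assumes FPS was measured as sequential single-image inference, so that latency is simply 1000 / FPS milliseconds; the source does not state the measurement setup, and the names are illustrative.

```python
from typing import Dict

# FPS values from Table 3, plus the YOLOv5s baseline from Table 2.
FPS: Dict[str, float] = {
    "Faster R-CNN": 6.3,
    "SSD": 56.4,
    "YOLOv3": 18.4,
    "RCR-YOLO": 28.2,
    "YOLOv5s": 81.3,
}


def latency_ms(fps: float) -> float:
    """Per-frame latency in milliseconds, assuming one image per forward pass."""
    return 1000.0 / fps


latencies = {name: latency_ms(v) for name, v in FPS.items()}
fastest = min(latencies, key=latencies.get)

for name, ms in sorted(latencies.items(), key=lambda kv: kv[1]):
    print(f"{name:<13}{ms:7.1f} ms/frame")
```

Read this way, RCR-YOLO is faster than Faster R-CNN and YOLOv3 but slower than SSD in Table 3, and the accuracy gain over YOLOv5s in Table 2 comes with a drop from 81.3 to 28.2 FPS.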
Publication history
  • Received: 2024-02-28
  • Revised: 2024-03-31
  • Published: 2025-04-19
