Lightweight Infrared Target Detection Algorithm Based on YOLO v7

CHEN Yonglin, WANG Hengtao, ZHANG Shang

Citation: CHEN Yonglin, WANG Hengtao, ZHANG Shang. Lightweight Infrared Target Detection Algorithm Based on YOLO v7[J]. Infrared Technology, 2024, 46(12): 1380-1389.

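Funding:

National College Students' Innovation and Entrepreneurship Training Program 202111075012

National College Students' Innovation and Entrepreneurship Training Program 202011075013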

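Details
    About the authors:

    CHEN Yonglin (1999-), male, from Jingmen, Hubei; master's student; research interest: object detection. E-mail: 1768859718@qq.com

    Corresponding author:

    ZHANG Shang (1979-), male, from Yichang, Hubei; associate professor, Ph.D. in engineering; research interests: Internet of Things technology and computer application technology. E-mail: 3011408157@qq.com

  • CLC number: TP391.4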

Lightweight Infrared Target Detection Algorithm Based on YOLO v7

  • Abstract:

    Target detection in infrared images is difficult because of their low signal-to-noise ratio, poor resolution, and heavy noise and clutter. To address these difficulties, we propose ITD-YOLO, a lightweight infrared image target detection algorithm based on YOLOv7. First, ITD-YOLO redesigns the network structure by readjusting the architectures of the feature extraction network and the feature fusion network. The deep layers of the original network, which correspond to large receptive fields, are pruned, and the preset anchor boxes are adjusted to match the feature maps output by the reconstructed network. The relationship between deep and shallow features in multi-scale feature fusion is changed so that the detail information extracted by the shallow layers carries more weight in the fusion, which improves the detection of smaller targets. Then, PConv replaces the conventional convolution in the ELAN module to further reduce the computational cost of the model. Next, the model's loss function is changed to PolyLoss to accelerate convergence and further strengthen detection performance. Finally, SIoU is used as the bounding-box regression loss to improve localization accuracy. Experimental results show that ITD-YOLO effectively improves detection: compared with YOLOv7s, the mean average precision (mAP) increases by 2.27% and 7.29% on the FLIR and OSU datasets, respectively. The improved model is only 17.7 MB in size, and its computational cost decreases by 37.11%. Compared with mainstream algorithms, ITD-YOLO improves to some degree across the reported metrics and meets the requirements of real-time infrared target detection.
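To make the PolyLoss step in the abstract more concrete, here is a minimal sketch of the Poly-1 form described in the PolyLoss paper, which adds a term epsilon * (1 - p_t) to cross-entropy, where p_t is the probability predicted for the true class. The abstract does not state which PolyLoss variant, which base loss, or which epsilon ITD-YOLO uses, so the softmax cross-entropy base, `epsilon=1.0`, and the function names below are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_cross_entropy(logits, target, epsilon=1.0):
    """Poly-1 loss: cross-entropy plus epsilon * (1 - p_t).

    epsilon=1.0 is an illustrative default, not a value taken from the paper.
    """
    p_t = softmax(logits)[target]   # probability assigned to the true class
    ce = -math.log(p_t)             # standard cross-entropy term
    return ce + epsilon * (1.0 - p_t)


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0]
    print("cross-entropy:", poly1_cross_entropy(logits, 0, epsilon=0.0))
    print("Poly-1       :", poly1_cross_entropy(logits, 0, epsilon=1.0))
```

With `epsilon=0.0` the function reduces to plain cross-entropy, so the extra term can be switched on and off when comparing convergence behavior. YOLO-style detection heads commonly use binary cross-entropy, so an actual integration would likely apply the same extra term per class rather than over a softmax.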

  • Figure 1. ELAN network structure

    Figure 2. ELAN-W network structure

    Figure 3. ITD-YOLO system architecture

    Figure 4. Feature fusion network structure

    Figure 5. Comparison of three types of convolution structure

    Figure 6. ELAN-P network structure

    Figure 7. Detail of the ITD-YOLO algorithm structure

    Figure 8. Comparison of detection results
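The abstract states that PConv replaces conventional convolution in the ELAN module to reduce computation, and Figure 5 compares three convolution structures. The sketch below estimates multiply-accumulate counts for a regular convolution, a depthwise convolution, and a partial convolution as defined in the FasterNet paper (a regular convolution applied to only a fraction of the channels, with the rest passed through). That Figure 5 shows exactly these three types is an assumption, as are the channel ratio of 1/4 (the FasterNet default) and the example tensor size; none of these settings are confirmed by this page.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates of a regular k x k convolution (stride 1, same padding)."""
    return h * w * k * k * c_in * c_out


def depthwise_macs(h, w, k, c):
    """Depthwise convolution: one k x k filter per channel."""
    return h * w * k * k * c


def pconv_macs(h, w, k, c, ratio=0.25):
    """Partial convolution: a regular conv on the first c * ratio channels only.

    The remaining channels are passed through unchanged and cost nothing.
    ratio=0.25 is an illustrative default, not a value taken from the paper.
    """
    c_p = int(c * ratio)
    return conv_macs(h, w, k, c_p, c_p)


if __name__ == "__main__":
    h = w = 80          # hypothetical feature-map size
    c, k = 256, 3       # hypothetical channel count and kernel size
    regular = conv_macs(h, w, k, c, c)
    print("regular  :", regular)
    print("depthwise:", depthwise_macs(h, w, k, c))
    print("partial  :", pconv_macs(h, w, k, c))
    print("partial / regular:", pconv_macs(h, w, k, c) / regular)
```

With a ratio of 1/4, the partial convolution costs 1/16 of the regular convolution on the same tensor, because the cost scales with the square of the number of convolved channels.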

    Table 1. Reconstructed CSPDarkNet structure

    Module   Parameters  Channels  Kernel size  Output
    CBS      928         32        (3, 3)       640×640
    CBS      18560       64        (3, 3)       320×320
    CBS      36992       64        (3, 3)       320×320
    CBS      73984       128       (3, 3)       160×160
    ELAN-P   108800      256       -            80×80
    MP-1     213760      256       -            80×80
    ELAN-P   432640      512       -            80×80
    MP-1     853504      512       -            40×40
    ELAN-P   1725440     1024      -            40×40
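The parameter counts of the four CBS rows in Table 1 can be reproduced with simple arithmetic. The sketch below assumes that a CBS block is a bias-free convolution followed by batch normalization (one scale and one shift per output channel) and a SiLU activation with no parameters, and that the network input has 3 channels. The function name and the layer list are illustrative, not taken from the authors' code.

```python
def cbs_params(c_in, c_out, k=3):
    """Parameters of Conv(k x k, no bias) + BatchNorm (scale and shift)."""
    conv = c_in * c_out * k * k   # convolution weights
    batch_norm = 2 * c_out        # learnable scale and shift per channel
    return conv + batch_norm


# (input channels, output channels) for the four CBS rows of Table 1,
# assuming a 3-channel input image.
layers = [(3, 32), (32, 64), (64, 64), (64, 128)]
counts = [cbs_params(c_in, c_out) for c_in, c_out in layers]
print(counts)  # [928, 18560, 36992, 73984]
```

Under these assumptions the results match the Parameters column for the CBS rows. The ELAN-P and MP-1 rows depend on internal structure that is not described on this page, so they are not reproduced here.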

    Table 2. Anchor box assignments

    Feature map      40×40      80×80
    Receptive field  Medium     Small
    FLIR             (15, 16)   (23, 61)
                     (14, 31)   (54, 44)
                     (30, 26)   (102, 86)
    OSU              (30, 40)   (33, 44)
                     (36, 47)   (37, 50)
                     (39, 52)   (42, 55)
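As a rough illustration of how preset anchors such as those in Table 2 relate to object sizes, the sketch below picks the anchor whose width and height best fit a given box, using the width-height IoU that is common in k-means anchor clustering. This criterion is swapped in for illustration only: the page does not describe the matching rule ITD-YOLO uses, and the function names are hypothetical. The anchors are grouped by column exactly as they appear in Table 2.

```python
# FLIR anchors as listed in Table 2, keyed by feature-map size.
FLIR_ANCHORS = {
    "40x40": [(15, 16), (14, 31), (30, 26)],
    "80x80": [(23, 61), (54, 44), (102, 86)],
}


def wh_iou(box, anchor):
    """IoU of two boxes sharing the same center, using only (width, height)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def best_anchor(box, anchors=FLIR_ANCHORS):
    """Return (feature_map, anchor, iou) for the anchor that best fits box=(w, h)."""
    candidates = [
        (wh_iou(box, anchor), level, anchor)
        for level, group in anchors.items()
        for anchor in group
    ]
    iou, level, anchor = max(candidates)  # highest IoU wins
    return level, anchor, iou


if __name__ == "__main__":
    for box in [(16, 16), (50, 40), (100, 90)]:
        print(box, "->", best_anchor(box))
```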

    Table 3. Ablation experiments

    The source table also has four component columns (Reconstruction, +PConv, PolyLoss, SIoU); their per-row marks are not preserved in this copy.

    Algorithm  FLIR P/%  FLIR R/%  FLIR mAP/%  OSU P/%  OSU R/%  OSU mAP/%  Volume/MB  Parameters  GFLOPs
    YOLOv7s    86.10     84.12     89.97       89.49    91.01    88.22      71.3       37207344    105.1
    A          86.54     85.30     90.39       91.46    91.45    92.03      17.7       9152256     66.10
    B          87.33     85.89     90.94       92.52    92.35    92.75      71.3       37207344    105.1
    C          86.62     84.68     90.52       91.43    92.06    91.82      71.3       37207344    105.1
    D          86.69     85.29     90.74       94.65    90.53    93.55      17.7       9152256     66.10
    E          87.21     86.36     91.23       95.41    91.27    94.28      71.3       37207344    105.1
    ITD-YOLO   88.15     86.94     92.02       96.33    93.17    94.65      17.7       9152256     66.10
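The headline figures in the abstract can be checked against the YOLOv7s and ITD-YOLO rows of Table 3. The sketch below computes the percentage change of each metric relative to the YOLOv7s baseline. Interpreting the abstract's percentages as relative changes is an assumption; the helper name and dictionary keys are illustrative.

```python
def relative_change(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0


# Values copied from the YOLOv7s and ITD-YOLO rows of Table 3.
baseline = {"flir_map": 89.97, "osu_map": 88.22, "gflops": 105.1,
            "volume_mb": 71.3, "params": 37207344}
itd_yolo = {"flir_map": 92.02, "osu_map": 94.65, "gflops": 66.10,
            "volume_mb": 17.7, "params": 9152256}

changes = {key: relative_change(baseline[key], itd_yolo[key]) for key in baseline}
for name, pct in changes.items():
    print(f"{name:10s} {pct:+.2f}%")
```

Under this reading, the mAP gains come out at roughly 2.28% on FLIR and 7.29% on OSU, and the GFLOPs reduction at roughly 37.11%, which is close to the values quoted in the abstract. In absolute terms the mAP differences are 2.05 and 6.43 percentage points.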

    Table 4. Comparison of algorithms on the FLIR dataset

    Algorithm  Volume/MB  mAP/%  Inference time/ms
    YOLOv5s    13.7       66.67  5.8
    YOLOv6s    38.7       86.25  10.9
    YOLOv7s    71.3       89.97  13.3
    YOLOv8s    22.5       88.74  5.7
    ITD-YOLO   17.7       92.02  11.2

    Table 5. Comparison of algorithms on the OSU dataset

    Algorithm  Volume/MB  mAP/%  Inference time/ms
    YOLOv5s    13.7       92.61  5.8
    YOLOv6s    38.7       90.84  10.9
    YOLOv7s    71.3       88.22  13.3
    YOLOv8s    22.5       93.12  5.7
    ITD-YOLO   17.7       94.65  11.2
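To relate the inference times in Tables 4 and 5 to the real-time claim in the abstract, the sketch below converts milliseconds per image into frames per second. It assumes the reported time covers one image per forward pass; the hardware, batch size, and whether pre- and post-processing are included are not stated on this page, and the 30 FPS threshold is only a common rule of thumb, not a figure from the paper.

```python
# Inference times (ms per image) copied from Tables 4 and 5.
inference_ms = {"YOLOv5s": 5.8, "YOLOv6s": 10.9, "YOLOv7s": 13.3,
                "YOLOv8s": 5.7, "ITD-YOLO": 11.2}

REAL_TIME_FPS = 30.0  # rule-of-thumb threshold, an assumption


def to_fps(ms_per_image):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image


fps = {name: to_fps(ms) for name, ms in inference_ms.items()}
for name, value in fps.items():
    status = "real-time" if value >= REAL_TIME_FPS else "below threshold"
    print(f"{name:9s} {value:6.1f} FPS ({status})")
```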
Publication history
  • Received: 2023-04-23
  • Revised: 2023-05-23
  • Published: 2024-12-19
