Infrared Target Detection Algorithm Based on Improved YOLOv8 in Complex Street Scenes

HONG Li, ZENG Xiangjin

Citation: HONG Li, ZENG Xiangjin. Infrared Target Detection Algorithm Based on Improved YOLOv8 in Complex Street Scenes[J]. Infrared Technology, 2025, 47(5): 591-600.


Funding:

National Natural Science Foundation of China 61502354

Innovation Fund of Hubei Three Gorges Laboratory, Hubei Province SC215001

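Details
    About the author:

    HONG Li (b. 1998), male, master's student; research interest: machine vision. E-mail: 1292286139@qq.com

    Corresponding author:

    ZENG Xiangjin (b. 1977), male, Ph.D., associate professor and master's supervisor; research interests: intelligent robot control, machine vision, and motion control. E-mail: xjzeng21@163.com

  • CLC number: TP391.4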

Infrared Target Detection Algorithm Based on Improved YOLOv8 in Complex Street Scenes

  • Abstract:

    To address the false detections and missed detections that occlusion and the lack of texture detail cause in infrared images of complex street backgrounds, this paper proposes an infrared target detection algorithm for complex street scenes, with YOLOv8n as the baseline model. First, a multi-branch convolutional structure is designed to strengthen feature extraction and feature representation. Structural reparameterization decouples the training and inference stages to improve inference speed, and global self-attention estimation is introduced to speed up the attention computation, reducing its time complexity to O(n) so that the convolutional kernel attention achieves dynamic unification. Second, combining the advantages of depthwise separable convolution and deformable convolution, a salient-information-aware deformable convolution attention gating mechanism is applied after the upsampling results are fused with the output features of the backbone network, which enriches the semantic information of the fused features. Finally, the localization loss function is replaced with Efficient IoU (EIoU), which computes separate width and height influence factors for the predicted box and the ground-truth box and thereby speeds up convergence. In validation experiments on the FLIR dataset, the improved algorithm reaches a mean average precision (mAP) of 79.5%, which is 3.9 percentage points higher than YOLOv8n. This demonstrates the advantage of the proposed algorithm for infrared target detection against complex street backgrounds.
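    The loss replacement described in the abstract can be illustrated with a minimal sketch of the commonly published Efficient IoU formulation, in which the loss is 1 - IoU plus a center-distance penalty and two separate penalties for the width and height differences, each normalized by the smallest enclosing box. This is an assumption about the general form only: the function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are illustrative and are not taken from the paper, whose exact implementation may differ.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Efficient IoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch of the commonly published EIoU form; not the
    authors' code. Assumes x1 < x2 and y1 < y2 for both boxes.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, normalized by the enclosing-box diagonal.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height are penalized separately, each against the
    # corresponding side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # A prediction that is twice as wide as the ground truth.
    print(eiou_loss((0.0, 0.0, 4.0, 2.0), (0.0, 0.0, 2.0, 2.0)))
```

    Because the width and height terms are separate, a box that is only too wide is penalized through the width term alone, which is the behavior the abstract credits with faster convergence.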

  • Figure  1.   YOLOv8 network structure

    Figure  2.   Improved YOLOv8 network structure

    Figure  3.   COSA processing flow

    Figure  4.   MBC-GSAE structure

    Figure  5.   DAC structure

    Figure  6.   Comparison of the original image, YOLOv8n and improved YOLOv8n detection results

    Table  1   Experimental environment configuration

    Name                Environment configuration
    Operating system    Windows 10
    CPU                 Intel 12400F
    GPU                 NVIDIA RTX 4070 12GB
    Framework           PyTorch 1.9.0 + CUDA 12.2 + cuDNN 8.9.6
    Language            Python 3.9

    Table  2   Comparison of experimental results

    Models              FLOPs/G   Size/MB   Car AP/%   Bicycle AP/%   Person AP/%   mAP(IoU=0.5)/%   FPS
    YOLOv5s             15.8      13.76     90.3       62.6           83.0          78.6             80.4
    YOLO-IDSTD[16]      3.0       7.36      83.1       44.8           72.4          66.8             -
    FEID-YOLO[23]       -         20.62     76.5       36.6           58.7          57.3             -
    YOLOv7-tiny         13.0      11.72     90.1       61.5           83.8          78.5             108.2
    MSC-YOLO            5.9       4.63      89.2       62.3           83.1          78.2             96.3
    FS-YOLOv5s[24]      -         10.72     89.1       59.2           81.5          76.6             -
    YOLOv8n             8.9       5.96      89.3       56.8           81.3          75.6             117.6
    IMPROVED-YOLOv8n    9.6       6.52      90.2       66.3           82.1          79.5             114.1
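    As a sanity check on Table 2, the mAP column can be compared with the unweighted mean of the three per-class AP values. The sketch below assumes that this is how the column was derived, which the page does not state; the names `CLASS_AP`, `REPORTED_MAP`, and `mean_ap` are illustrative. For most rows the mean of the rounded class APs reproduces the reported mAP; for YOLOv8n it gives about 75.8 against the reported 75.6, so the 3.9-point gain quoted in the abstract is computed here from the reported figures.

```python
# Per-class AP (%) at IoU = 0.5 for Car, Bicycle, Person, copied from Table 2.
CLASS_AP = {
    "YOLOv5s": (90.3, 62.6, 83.0),
    "YOLOv7-tiny": (90.1, 61.5, 83.8),
    "YOLOv8n": (89.3, 56.8, 81.3),
    "IMPROVED-YOLOv8n": (90.2, 66.3, 82.1),
}

# mAP (%) as reported in Table 2 for the same models.
REPORTED_MAP = {
    "YOLOv5s": 78.6,
    "YOLOv7-tiny": 78.5,
    "YOLOv8n": 75.6,
    "IMPROVED-YOLOv8n": 79.5,
}


def mean_ap(class_aps):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return sum(class_aps) / len(class_aps)


if __name__ == "__main__":
    for model, aps in CLASS_AP.items():
        print(model, round(mean_ap(aps), 1), REPORTED_MAP[model])

    # Gain of the improved model over the baseline, from the reported mAP.
    gain = REPORTED_MAP["IMPROVED-YOLOv8n"] - REPORTED_MAP["YOLOv8n"]
    print("gain in percentage points:", round(gain, 1))
```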

    Table  3   Comparison results of different models on the VOC 2007 dataset

    Models               Input image size   Size/MB   mAP(IoU=0.5)/%   FPS
    DPM-v5[25]           -                  -         32.1             0.7
    DPM-CF[26]           -                  -         30.6             5.2
    Fastest-DPM[27]      -                  -         30.4             28.6
    Faster R-CNN(VGG)    600*1000           462       81.5             13.5
    SSD(VGG)             512*512            105.8     77.2             49.5
    DSSD(ResNet101)      321*321            490.3     78.4             9.5
    FSSD(VGG)            300*300            -         78.6             68.5
    YOLOv5s              544*544            28.8      73.5             76.2
    YOLOv8n              512*640            5.96      76.8             104.3
    IMPROVED-YOLOv8n     512*640            6.52      79.4             100.7

    Table  4   Ablation experiment

    Models      MBC-GSAE   DAC   WIoU   Car/%   Bicycle/%   Person/%   mAP0.5/%
    YOLOv8-n                            89.3    56.8        81.3       75.6
                                        89.6    61.7        81.6       77.6
                                        89.8    64.9        81.8       78.8
                                        90.2    66.3        82.1       79.5
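    The row-to-row gains in the ablation table can be worked out directly. The sketch below only differences consecutive rows; it does not attribute a gain to a particular module, because the marks showing which of MBC-GSAE, DAC, and WIoU are enabled in each row are not present in this copy of the table. The names `ABLATION_MAP`, `ABLATION_BICYCLE`, and `stepwise_gains` are illustrative.

```python
# mAP0.5 (%) and Bicycle AP (%) for the four ablation rows, top to bottom.
ABLATION_MAP = [75.6, 77.6, 78.8, 79.5]
ABLATION_BICYCLE = [56.8, 61.7, 64.9, 66.3]


def stepwise_gains(values):
    """Difference between each row and the row above it, in percentage points."""
    return [round(after - before, 1) for before, after in zip(values, values[1:])]


if __name__ == "__main__":
    print("mAP gains per row:", stepwise_gains(ABLATION_MAP))
    print("Bicycle AP gains per row:", stepwise_gains(ABLATION_BICYCLE))
    # Total change from the first row to the last row.
    print("total mAP gain:", round(ABLATION_MAP[-1] - ABLATION_MAP[0], 1))
    print("total Bicycle gain:", round(ABLATION_BICYCLE[-1] - ABLATION_BICYCLE[0], 1))
```

    Under this reading, the Bicycle class accounts for most of the improvement between the first and last rows, while the Car and Person columns move by less than one point each.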
  • [1] LOU Zhehang, LUO Suyun. Vehicle infrared target detection based on YOLOX and Swin Transformer[J]. Infrared Technology, 2022, 44(11): 1167-1175. http://hwjs.nvir.cn/article/id/3d31e429-9365-4797-ab65-60e06a4414d8

    [2] DAI X, YUAN X, WEI X. TIRNet: Object detection in thermal infrared images for autonomous driving[J]. Applied Intelligence, 2020, 51(3): 1244-1261.

    [3] YI Shi, LI Xinrong, WU Zhijuan, et al. Night hare detection method based on infrared thermal imaging and improved YOLOV3[J]. Transactions of the Chinese Society of Agricultural Engineering, 2019, 35(19): 223-229.

    [4] LIU Xiaowen, ZENG Xueting, LI Tao, et al. Automatic detection method of body temperature in herd of pigs based on improved YOLOv7[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(S1): 267-274. DOI: 10.6041/j.issn.1000-1298.2023.S1.029

    [5] LIU Gang, FENG Yankun, KANG Xi. Detection method of pig ear root temperature based on improved YOLO v4[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(2): 240-248.

    [6] ZHANG H, LUO C, WANG Q, et al. A novel infrared video surveillance system using deep learning based techniques[J]. Multimedia Tools and Applications, 2018, 77(20): 26657-26676. DOI: 10.1007/s11042-018-5883-y

    [7] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 580-587.

    [8] Girshick R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015: 1440-1448.

    [9] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. DOI: 10.1109/TPAMI.2016.2577031

    [10] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.

    [11] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017: 6517-6525.

    [12] Redmon J, Farhadi A. YOLOv3: An incremental improvement[J/OL]. arXiv preprint arXiv: 1804.02767, https://arxiv.org/abs/1804.02767.

    [13] LIU W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Computer Vision–ECCV Proceedings, 2016: 21-37.

    [14] LIN T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 2980-2988.

    [15] LI Qianglong, ZHOU Xinwen, WEI Meng'en, et al. Infrared target detection algorithm based on strip pooling and attention mechanism in street scene[J/OL]. Computer Engineering: 1-13, [2023-05-20]. DOI: 10.19678/j.issn.1000-3428.0065481.

    [16] JIANG Xinhao, CAI Wei, YANG Zhiyong, et al. Infrared dim and small target detection based on YOLO-IDSTD algorithm[J]. Infrared and Laser Engineering, 2022, 51(3): 502-511.

    [17] CHEN Yonglin, WANG Hengtao, ZHANG Shang. Lightweight infrared target detection algorithm based on YOLOv7[J]. Infrared Technology, 2024, 46(12): 1380-1389. http://hwjs.nvir.cn/article/id/e476d956-cfb7-4f3a-aafb-2e7b5e7a7890

    [18] CAI Wei, XU Peiwei, YANG Zhiyong, et al. Dim-small targets detection of infrared images in complex backgrounds[J]. Journal of Applied Optics, 2021, 42(4): 643-650.

    [19] WU Haiping, XIAO Bin, Noel Codella, et al. CvT: Introducing convolutions to vision transformers[J/OL]. arXiv: 2103.15808, https://doi.org/10.48550/arXiv.2103.15808.

    [20] Irwan Bello, Barret Zoph, Quoc Le, et al. Attention augmented convolutional networks[C]//IEEE International Conference on Computer Vision, 2019: 3286-3295.

    [21] ZHANG H, Fromont E, Lefevre S, et al. Multispectral fusion for object detection with cyclic fuse-and-refine blocks[C]//IEEE International Conference on Image Processing, 2020: 276-280.

    [22] DENG Shanshan, HUANG Hui, MA Yan. A small object detection algorithm based on improved Faster R-CNN[J]. Computer Engineering and Science, 2023, 45(5): 869-877. DOI: 10.3969/j.issn.1007-130X.2023.05.012

    [23] GUO Yong, ZHANG Kai. Fast infrared object detection based on feature enhancement[J]. Radio Engineering, 2023, 53(1): 47-55.

    [24] HUANG Lei, YANG Yuan, YANG Chengyu, et al. FS-YOLOv5: lightweight infrared rode target detection method[J]. Computer Engineering and Applications, 2023, 59(9): 215-224.

    [25] Girshick R, Felzenszwalb P F, McAllester D. Object detection with discriminatively trained part based models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627-1645. DOI: 10.1109/TPAMI.2009.167

    [26] Pedersoli M, Vedaldi A, Gonzàlez J, et al. A coarse-to-fine approach for fast deformable object detection[J]. Pattern Recognition, 2015, 48(5): 1844-1853.

    [27] YAN J, LEI Z, WEN L, et al. The fastest deformable part model for object detection[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2014: 2497-2504.

Publication history
  • Received:  2023-12-27
  • Revised:  2024-01-23
  • Published online:  2025-05-27
  • Issue date:  2025-05-19
