Infrared Image Object Detection Method Based on DCS-YOLOv8 Model

SHEN Lingyun, LANG Baihe, SONG Zhengxun, WEN Zhitao

Citation: SHEN Lingyun, LANG Baihe, SONG Zhengxun, WEN Zhitao. Infrared Image Object Detection Method Based on DCS-YOLOv8 Model[J]. Infrared Technology, 2024, 46(5): 565-575.


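Funding:

Shanxi Province Science and Technology Innovation Startup Fund for Introduced Talents 21010123

Shanxi Province College Student Innovation Project for Higher Education Institutions S202314101195

Jilin Province Science and Technology Development Plan Fund Project YDZJ202102CXJD007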

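Details
    About the author:

    SHEN Lingyun (1979-), female, Ph.D. in Engineering, associate professor, mainly engaged in research on machine vision and intelligent information processing. E-mail: shenshly@163.com

  • CLC number: TP391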

Infrared Image Object Detection Method Based on DCS-YOLOv8 Model

  • Abstract:

    To address the limited ability of the YOLOv8 model to detect occluded targets and dim, small targets in infrared images under low signal-to-noise ratios and in complex task scenarios, an improved object detection method, DCS-YOLOv8 (DCN_C2f-CA-SIoU-YOLOv8), is proposed. Building on the YOLOv8 framework, the backbone network adopts a lightweight DCN_C2f module based on deformable convolution (Deformable Convolution Network, DCN), which adaptively adjusts the network's visual receptive field and strengthens the multi-scale feature representation of targets. The feature fusion network introduces a module based on the coordinate attention (CA) mechanism, which captures spatial position dependencies among multiple targets and improves localization accuracy. The position regression loss function is improved on the basis of SIoU (Scylla IoU) so that the relative displacement direction between the predicted box and the ground-truth box is matched, which accelerates model convergence and improves detection and localization accuracy. Experimental results show that, compared with the YOLOv8 n/s/m/l/x series models, DCS-YOLOv8 improves the mean average precision (mAP@0.5) on the FLIR, OTCBVS, and VEDAI test sets by 6.8%, 0.6%, and 4.0% on average, reaching 86.5%, 99.0%, and 75.6%, respectively. The model's inference speed also meets the real-time requirements of infrared object detection tasks.
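The direction-matching idea behind the SIoU regression loss can be illustrated with a minimal pure-Python sketch. It follows the commonly used SIoU formulation (an angle cost that modulates a distance cost, plus a shape cost) and is not the authors' improved variant, whose weighting is not described in this excerpt. The function name `siou_loss`, the `(x1, y1, x2, y2)` box format, and the shape exponent `theta=4.0` are assumptions made for illustration.

```python
import math


def siou_loss(pred, gt, theta=4.0, eps=1e-9):
    """SIoU-style loss for two boxes given as (x1, y1, x2, y2). Illustrative sketch."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    iou = inter / (pw * ph + gw * gh - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Offset between the two box centres and its length.
    dx = (gx1 + gx2 - px1 - px2) / 2.0
    dy = (gy1 + gy2 - py1 - py2) / 2.0
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: 0 when the centres are axis-aligned, 1 at 45 degrees.
    sin_alpha = min(abs(dx), abs(dy)) / sigma
    angle_cost = 1.0 - 2.0 * math.sin(math.asin(sin_alpha) - math.pi / 4.0) ** 2

    # Distance cost, modulated by the angle cost through gamma.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1.0 - math.exp(-gamma * rho_x)) + (1.0 - math.exp(-gamma * rho_y))

    # Shape cost: penalises width and height mismatch.
    omega_w = abs(pw - gw) / (max(pw, gw) + eps)
    omega_h = abs(ph - gh) / (max(ph, gh) + eps)
    shape_cost = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0


if __name__ == "__main__":
    box = (0.0, 0.0, 2.0, 2.0)
    print(siou_loss(box, box))                   # identical boxes
    print(siou_loss(box, (1.0, 0.0, 3.0, 2.0)))  # horizontal shift
```

In this formulation the loss stays close to zero for identical boxes and exceeds 1 for disjoint boxes, because the distance term keeps growing after the IoU term has saturated.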

  • Figure 1. Network structure of the improved YOLOv8 algorithm

    Figure 2. Sampling comparison between standard convolution and deformable convolution

    Figure 3. Schematic diagram of the coordinate attention module

    Figure 4. Cost calculation in the position regression loss function

    Figure 5. Weight optimization of the position regression loss function

    Figure 6. Confusion matrix of object category distribution (FLIR)

    Figure 7. Precision-Recall curves (FLIR)

    Figure 8. Comparison of selected object detection results on the FLIR test set between YOLOv8n and DCS-YOLOv8n

    Figure 9. Annotated object detection results of DCS-YOLOv8n on the FLIR, OTCBVS, and VEDAI datasets
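The sampling difference between standard and deformable convolution can be sketched in a few lines of pure Python. The sketch evaluates a 3×3 kernel at a single output location, shifts each kernel tap by a per-tap offset, and reads fractional positions with bilinear interpolation. The helper names `bilinear` and `deform_conv_at`, the list-of-lists image format, and the hand-picked offsets are illustrative assumptions; in a real deformable layer the offsets are predicted by an extra convolution, and the structure of the paper's DCN_C2f module is not reproduced here.

```python
import math


def bilinear(img, y, x):
    """Sample a 2-D list `img` at fractional (y, x); positions outside count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1.0 - abs(y - yy)) * (1.0 - abs(x - xx))
                value += weight * img[yy][xx]
    return value


def deform_conv_at(img, kernel, offsets, cy, cx):
    """Response of a 3x3 deformable convolution at output location (cy, cx).

    offsets[k] = (dy, dx) shifts the k-th kernel tap (row-major order).
    With all offsets equal to (0, 0) this reduces to a standard convolution.
    """
    out = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[k]
            out += kernel[ky + 1][kx + 1] * bilinear(img, cy + ky + dy, cx + kx + dx)
            k += 1
    return out


if __name__ == "__main__":
    img = [[float(r * 5 + c) for c in range(5)] for r in range(5)]  # linear ramp
    mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
    regular = [(0.0, 0.0)] * 9   # standard sampling grid
    shifted = [(0.0, 0.5)] * 9   # every tap moved half a pixel to the right
    print(deform_conv_at(img, mean_kernel, regular, 2, 2))
    print(deform_conv_at(img, mean_kernel, shifted, 2, 2))
```

Because the offsets are free per tap, the effective receptive field can stretch or bend toward the target instead of staying on the fixed square grid.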

    Table 1. Model training hyperparameter settings

    Hyperparameter                      Setting
    Input Resolution                    640×640
    Initial Learning Rate (lr0)         0.01
    Final Learning Rate Factor (lrf)    0.01
    Momentum                            0.937
    Weight_Decay                        0.0005
    Batch_Size                          4
    Epochs                              200
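To make the two learning-rate settings in Table 1 concrete, the sketch below collects the hyperparameters in a dictionary and evaluates a linear decay from `lr0` down to `lr0 * lrf`. Two assumptions are made here that the table itself does not state: that `lrf` is the final learning rate expressed as a fraction of `lr0` (its meaning in the Ultralytics trainer), and that the schedule is linear. The key names mirror Ultralytics training arguments, and `linear_lr` is a hypothetical helper.

```python
# Hyperparameters from Table 1, keyed with Ultralytics-style argument names.
HYP = {
    "imgsz": 640,
    "lr0": 0.01,
    "lrf": 0.01,
    "momentum": 0.937,
    "weight_decay": 0.0005,
    "batch": 4,
    "epochs": 200,
}


def linear_lr(epoch, hyp=HYP):
    """Learning rate at `epoch` under a linear decay from lr0 to lr0 * lrf (assumed schedule)."""
    lr0, lrf, epochs = hyp["lr0"], hyp["lrf"], hyp["epochs"]
    factor = max(1.0 - epoch / epochs, 0.0) * (1.0 - lrf) + lrf
    return lr0 * factor


if __name__ == "__main__":
    for epoch in (0, 100, 200):
        print(epoch, linear_lr(epoch))
```

Under these assumptions, training would start at 0.01 and end at one hundredth of that value after 200 epochs.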

    Table 2. Comparison of ablation experiment results on different datasets

    Models¹     Params/M  GFLOPs  Precision/%²        Recall/%²           mAP@0.5/%²
    B D C S                       D1    D2    D3      D1    D2    D3      D1    D2    D3
                3.2       8.2     74.5  94.1  73.2    68.6  90.0  43.5    77.2  97.6  60.5
                3.4       8.3     80.1  94.5  74.4    74.3  90.2  43.9    79.5  98.0  61.3
                3.2       8.2     80.0  94.4  80.1    73.1  93.3  49.6    78.0  97.9  62.8
                3.2       8.2     80.3  95.7  73.8    75.5  94.7  68.1    80.8  97.8  64.3
                3.4       8.3     80.5  94.3  71.7    75.2  93.3  69.8    80.5  98.2  67.6
                3.4       8.3     80.8  98.5  69.3    75.5  96.3  68.0    81.5  98.3  68.1
                3.2       8.2     81.2  99.5  69.5    75.6  95.4  72.1    82.0  98.0  70.5
                3.4       8.3     81.1  99.3  73.5    75.7  95.9  70.5    83.1  98.5  71.3
    ¹ B: Base (YOLOv8n), D: DCN_C2f, C: CA, S: SIoU. ² D1: FLIR, D2: OTCBVS, D3: VEDAI.
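The overall effect of the three modifications can be read off Table 2 with a short calculation. The sketch assumes that the first data row is the YOLOv8n baseline alone and the last row is the full B+D+C+S model; the component marks are not preserved in this copy of the table, but the parameter counts and the FLIR and OTCBVS values of these two rows match the YOLOv8n and DCS-YOLOv8n rows of Table 3. The variable names are illustrative.

```python
# mAP@0.5 (%) from the first and last data rows of Table 2.
# Assumption: first row = baseline only, last row = all three modules enabled.
BASE_MAP = {"FLIR": 77.2, "OTCBVS": 97.6, "VEDAI": 60.5}
FULL_MAP = {"FLIR": 83.1, "OTCBVS": 98.5, "VEDAI": 71.3}

# Absolute gain in percentage points, rounded to the table's precision.
gains = {name: round(FULL_MAP[name] - BASE_MAP[name], 1) for name in BASE_MAP}

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: +{gain} points mAP@0.5")
```

These are absolute differences in percentage points between two table rows, not the averaged improvements quoted in the abstract.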

    Table 3. Results of different object detection models

    Models        Params/M  GFLOPs  mAP@0.5/%¹          Inference/ms¹
                                    D1    D2    D3      D1     D2     D3
    Faster R-CNN  15.8      28.3    71.1  87.8  52.4    30.4   102.3  63.1
    YOLOv3_tiny   8.7       13.0    74.2  90.5  58.1    12.6   37.1   21.3
    YOLOv5n       7.0       16.0    75.1  95.8  59.3    6.9    25.1   11.7
    YOLOv8n       3.2       8.2     77.2  97.6  67.5    7.1    23.7   9.9
    YOLOv8s       11.2      28.8    79.3  98.1  71.5    10.8   29.8   12.3
    YOLOv8m       25.9      79.1    81.5  98.5  72.6    20.5   41.0   15.2
    YOLOv8l       43.6      165.4   82.7  98.9  74.8    35.1   52.5   19.5
    YOLOv8x       68.2      258.1   84.5  99.1  76.9    47.5   70.6   27.1
    DCS-YOLOv8n   3.4       8.3     83.1  98.5  72.5    7.1    22.9   10.6
    DCS-YOLOv8s   11.3      29.2    85.2  98.9  73.8    10.9   28.7   13.1
    DCS-YOLOv8m   25.9      79.5    87.4  99.2  75.9    20.6   38.1   16.4
    DCS-YOLOv8l   43.8      165.8   88.1  99.3  77.2    35.3   50.4   21.0
    DCS-YOLOv8x   69.1      258.5   88.6  99.3  78.6    47.9   62.7   29.1
    ¹ D1: FLIR, D2: OTCBVS, D3: VEDAI.
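The per-dataset averages quoted in the abstract can be cross-checked against Table 3 by averaging mAP@0.5 over the five model sizes. The sketch below does only that; how the abstract's improvement percentages were averaged is not stated in this excerpt, so no gain figure is derived here. The dictionary layout and the helper name `mean_map` are illustrative assumptions.

```python
# mAP@0.5 (%) on (FLIR, OTCBVS, VEDAI), copied from Table 3.
YOLOV8 = {
    "n": (77.2, 97.6, 67.5),
    "s": (79.3, 98.1, 71.5),
    "m": (81.5, 98.5, 72.6),
    "l": (82.7, 98.9, 74.8),
    "x": (84.5, 99.1, 76.9),
}
DCS_YOLOV8 = {
    "n": (83.1, 98.5, 72.5),
    "s": (85.2, 98.9, 73.8),
    "m": (87.4, 99.2, 75.9),
    "l": (88.1, 99.3, 77.2),
    "x": (88.6, 99.3, 78.6),
}


def mean_map(table):
    """Mean mAP@0.5 per dataset over all model sizes, rounded to one decimal."""
    rows = list(table.values())
    return tuple(round(sum(row[i] for row in rows) / len(rows), 1) for i in range(3))


if __name__ == "__main__":
    print("YOLOv8     mean mAP@0.5 (FLIR, OTCBVS, VEDAI):", mean_map(YOLOV8))
    print("DCS-YOLOv8 mean mAP@0.5 (FLIR, OTCBVS, VEDAI):", mean_map(DCS_YOLOV8))
```

The DCS-YOLOv8 means computed this way agree with the 86.5%, 99.0%, and 75.6% reported in the abstract.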
Publication history
  • Received: 2023-08-29
  • Revised: 2023-09-27
  • Published online: 2024-05-23
  • Issue date: 2024-05-19
