Infrared Image Object Detection Method Based on DCS-YOLOv8 Model
-
Abstract: To address the YOLOv8 model's limited ability to detect occluded and dim small infrared targets in low signal-to-noise-ratio and complex task scenarios, an improved object detection method, DCS-YOLOv8 (DCN_C2f-CA-SIoU-YOLOv8), is proposed. Building on the YOLOv8 framework, the backbone network incorporates a lightweight DCN_C2f (Deformable Convolution Network C2f) module that adaptively adjusts the network's visual receptive field and strengthens multi-scale feature representation. The feature fusion network introduces a coordinate attention (CA) module, which captures spatial position dependencies among multiple targets to improve localization accuracy. The bounding-box regression loss is improved with SIoU (Scylla IoU), which matches the relative displacement direction between predicted and ground-truth boxes, accelerating convergence and raising detection and localization accuracy. Experimental results show that, compared with the YOLOv8-n/s/m/l/x series models, DCS-YOLOv8 improves the average mAP@0.5 on the FLIR, OTCBVS, and VEDAI test sets by 6.8%, 0.6%, and 4.0%, reaching 86.5%, 99.0%, and 75.6%, respectively. The model's inference speed also meets the real-time requirements of infrared object detection tasks.
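To make the three modifications concrete, minimal PyTorch sketches follow. They are illustrative reconstructions from the descriptions above, not the authors' released code. The first sketches the DCN_C2f idea, assuming a C2f-style split-and-concatenate layout and torchvision's DeformConv2d; the helper names (conv_bn_silu, DeformBottleneck) and the zero-initialized offset branch are our own choices.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


def conv_bn_silu(c1, c2, k=1):
    """Conv + BN + SiLU, mirroring YOLOv8's basic Conv block."""
    return nn.Sequential(
        nn.Conv2d(c1, c2, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(c2),
        nn.SiLU(),
    )


class DeformBottleneck(nn.Module):
    """Bottleneck whose 3x3 convolution is swapped for a deformable one."""

    def __init__(self, c, shortcut=True):
        super().__init__()
        self.cv1 = conv_bn_silu(c, c, 3)
        # 2 offsets (dy, dx) per position of the 3x3 kernel -> 18 channels.
        self.offset = nn.Conv2d(c, 2 * 3 * 3, 3, padding=1)
        nn.init.zeros_(self.offset.weight)  # start as a regular conv
        nn.init.zeros_(self.offset.bias)
        self.dcn = DeformConv2d(c, c, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(c)
        self.act = nn.SiLU()
        self.add = shortcut

    def forward(self, x):
        y = self.cv1(x)
        # Sampling locations shift with the predicted offsets, letting the
        # effective receptive field adapt to the target's shape and scale.
        y = self.act(self.bn(self.dcn(y, self.offset(y))))
        return x + y if self.add else y


class DCN_C2f(nn.Module):
    """C2f-style block: split, stack bottlenecks, concatenate all branches."""

    def __init__(self, c1, c2, n=1):
        super().__init__()
        c = c2 // 2
        self.cv1 = conv_bn_silu(c1, 2 * c)
        self.m = nn.ModuleList(DeformBottleneck(c) for _ in range(n))
        self.cv2 = conv_bn_silu((2 + n) * c, c2)

    def forward(self, x):
        y = list(self.cv1(x).chunk(2, dim=1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, dim=1))
```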
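The coordinate attention module follows the published CA design (Hou et al., CVPR 2021): global pooling is factorized into two 1-D encodings along height and width, so each attention weight retains position information along one axis. A sketch, with the reduction ratio taken as the paper's common default (an assumption):

```python
import torch
import torch.nn as nn


class CoordinateAttention(nn.Module):
    """Coordinate attention: two 1-D pooled encodings gate the feature map."""

    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool over width  -> (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool over height -> (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                      # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)  # (B, C, W, 1)
        # Joint transform over the concatenated directional encodings.
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        return x * a_h * a_w  # position-aware channel reweighting
```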
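The SIoU loss (Gevorgyan, 2022) augments the IoU term with angle, distance, and shape costs, so the gradient also encodes the direction of the center offset between predicted and ground-truth boxes. A sketch under the usual formulation, with the shape exponent theta = 4 as a commonly used value (an assumption; this excerpt does not state it):

```python
import torch


def siou_loss(pred, target, theta=4.0, eps=1e-7):
    """SIoU bounding-box loss sketch. pred, target: (N, 4) in (cx, cy, w, h)."""
    px, py, pw, ph = pred.unbind(-1)
    tx, ty, tw, th = target.unbind(-1)

    # IoU term.
    inter_w = (torch.min(px + pw / 2, tx + tw / 2) -
               torch.max(px - pw / 2, tx - tw / 2)).clamp(min=0)
    inter_h = (torch.min(py + ph / 2, ty + th / 2) -
               torch.max(py - ph / 2, ty - th / 2)).clamp(min=0)
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Smallest enclosing box dimensions.
    cw = torch.max(px + pw / 2, tx + tw / 2) - torch.min(px - pw / 2, tx - tw / 2)
    ch = torch.max(py + ph / 2, ty + th / 2) - torch.min(py - ph / 2, ty - th / 2)

    # Angle cost: peaks when the center offset is at 45 degrees.
    dx, dy = tx - px, ty - py
    sigma = torch.sqrt(dx ** 2 + dy ** 2) + eps
    sin_alpha = (dy.abs() / sigma).clamp(max=1 - eps)
    angle = torch.sin(2 * torch.arcsin(sin_alpha))

    # Distance cost, modulated by the angle cost (gamma in [1, 2]).
    gamma = 2 - angle
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    dist = (1 - torch.exp(-gamma * rho_x)) + (1 - torch.exp(-gamma * rho_y))

    # Shape cost: penalizes width/height mismatch.
    omega_w = (pw - tw).abs() / (torch.max(pw, tw) + eps)
    omega_h = (ph - th).abs() / (torch.max(ph, th) + eps)
    shape = (1 - torch.exp(-omega_w)) ** theta + (1 - torch.exp(-omega_h)) ** theta

    return 1 - iou + (dist + shape) / 2
```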
-
Table 1 Model training hyperparameter settings
| Hyperparameter | Setting |
|---|---|
| Input Resolution | 640×640 |
| Initial Learning Rate (lr0) | 0.01 |
| Final Learning Rate Factor (lrf) | 0.01 |
| Momentum | 0.937 |
| Weight Decay | 0.0005 |
| Batch Size | 4 |
| Epochs | 200 |
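The settings in Table 1 map one-to-one onto arguments of the Ultralytics YOLOv8 trainer. The snippet below is a convenience sketch, assuming the ultralytics package and a hypothetical dataset config flir.yaml; it is not the authors' training script, and the DCN_C2f/CA model changes would be made separately in the model definition.

```python
from ultralytics import YOLO

# Hedged sketch: reproduce Table 1's hyperparameters with the Ultralytics
# trainer. "flir.yaml" is a hypothetical dataset configuration file.
model = YOLO("yolov8n.yaml")  # base architecture before the paper's edits
model.train(
    data="flir.yaml",
    imgsz=640,            # Input Resolution
    lr0=0.01,             # Initial Learning Rate
    lrf=0.01,             # Final LR factor (final LR = lr0 * lrf)
    momentum=0.937,
    weight_decay=0.0005,
    batch=4,
    epochs=200,
)
```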
Table 2 Comparison of ablation experiment results on different datasets
| B | D | C | S | Params/M | GFLOPs | Precision/% (D1/D2/D3) | Recall/% (D1/D2/D3) | mAP@0.5/% (D1/D2/D3) |
|---|---|---|---|---|---|---|---|---|
| √ | | | | 3.2 | 8.2 | 74.5 / 94.1 / 73.2 | 68.6 / 90.0 / 43.5 | 77.2 / 97.6 / 60.5 |
| √ | √ | | | 3.4 | 8.3 | 80.1 / 94.5 / 74.4 | 74.3 / 90.2 / 43.9 | 79.5 / 98.0 / 61.3 |
| √ | | √ | | 3.2 | 8.2 | 80.0 / 94.4 / 80.1 | 73.1 / 93.3 / 49.6 | 78.0 / 97.9 / 62.8 |
| √ | | | √ | 3.2 | 8.2 | 80.3 / 95.7 / 73.8 | 75.5 / 94.7 / 68.1 | 80.8 / 97.8 / 64.3 |
| √ | √ | √ | | 3.4 | 8.3 | 80.5 / 94.3 / 71.7 | 75.2 / 93.3 / 69.8 | 80.5 / 98.2 / 67.6 |
| √ | √ | | √ | 3.4 | 8.3 | 80.8 / 98.5 / 69.3 | 75.5 / 96.3 / 68.0 | 81.5 / 98.3 / 68.1 |
| √ | | √ | √ | 3.2 | 8.2 | 81.2 / 99.5 / 69.5 | 75.6 / 95.4 / 72.1 | 82.0 / 98.0 / 70.5 |
| √ | √ | √ | √ | 3.4 | 8.3 | 81.1 / 99.3 / 73.5 | 75.7 / 95.9 / 70.5 | 83.1 / 98.5 / 71.3 |

Note: B = Base (YOLOv8n), D = DCN_C2f, C = CA, S = SIoU; D1 = FLIR, D2 = OTCBVS, D3 = VEDAI.
Table 3 Object detection results of different models
| Model | Params/M | GFLOPs | mAP@0.5/% (D1/D2/D3) | Inference time/ms (D1/D2/D3) |
|---|---|---|---|---|
| Faster R-CNN | 15.8 | 28.3 | 71.1 / 87.8 / 52.4 | 30.4 / 102.3 / 63.1 |
| YOLOv3_tiny | 8.7 | 13.0 | 74.2 / 90.5 / 58.1 | 12.6 / 37.1 / 21.3 |
| YOLOv5n | 7.0 | 16.0 | 75.1 / 95.8 / 59.3 | 6.9 / 25.1 / 11.7 |
| YOLOv8n | 3.2 | 8.2 | 77.2 / 97.6 / 67.5 | 7.1 / 23.7 / 9.9 |
| YOLOv8s | 11.2 | 28.8 | 79.3 / 98.1 / 71.5 | 10.8 / 29.8 / 12.3 |
| YOLOv8m | 25.9 | 79.1 | 81.5 / 98.5 / 72.6 | 20.5 / 41.0 / 15.2 |
| YOLOv8l | 43.6 | 165.4 | 82.7 / 98.9 / 74.8 | 35.1 / 52.5 / 19.5 |
| YOLOv8x | 68.2 | 258.1 | 84.5 / 99.1 / 76.9 | 47.5 / 70.6 / 27.1 |
| DCS-YOLOv8n | 3.4 | 8.3 | 83.1 / 98.5 / 72.5 | 7.1 / 22.9 / 10.6 |
| DCS-YOLOv8s | 11.3 | 29.2 | 85.2 / 98.9 / 73.8 | 10.9 / 28.7 / 13.1 |
| DCS-YOLOv8m | 25.9 | 79.5 | 87.4 / 99.2 / 75.9 | 20.6 / 38.1 / 16.4 |
| DCS-YOLOv8l | 43.8 | 165.8 | 88.1 / 99.3 / 77.2 | 35.3 / 50.4 / 21.0 |
| DCS-YOLOv8x | 69.1 | 258.5 | 88.6 / 99.3 / 78.6 | 47.9 / 62.7 / 29.1 |

Note: D1 = FLIR, D2 = OTCBVS, D3 = VEDAI.
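The summary figures quoted in the abstract are averages of Table 3 over the five model scales; the short script below reproduces the 86.5%, 99.0%, and 75.6% means (our reading of the abstract's aggregation, not code from the paper).

```python
# mAP@0.5 values for DCS-YOLOv8n/s/m/l/x, taken from Table 3.
dcs = {
    "FLIR":   [83.1, 85.2, 87.4, 88.1, 88.6],
    "OTCBVS": [98.5, 98.9, 99.2, 99.3, 99.3],
    "VEDAI":  [72.5, 73.8, 75.9, 77.2, 78.6],
}

for dataset, values in dcs.items():
    mean_map = sum(values) / len(values)
    print(f"{dataset}: mean mAP@0.5 over n/s/m/l/x = {mean_map:.1f}%")
# Prints 86.5% (FLIR), 99.0% (OTCBVS), 75.6% (VEDAI), matching the abstract.
```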