Lightweight Infrared Target Detection Algorithm Based on YOLO v7

CHEN Yonglin^{1, 2},
WANG Hengtao^{1, 2},
ZHANG Shang^{1, 2}

1.
Hubei Province Engineering Technology Research Center for Construction Quality Testing Equipments, China Three Gorges University, Yichang 443002, China
2.
College of Computer and Information, China Three Gorges University, Yichang 443002, China

Funding:

National Undergraduate Innovation and Entrepreneurship Training Program 202111075012

National Undergraduate Innovation and Entrepreneurship Training Program 202011075013

Details

About the author:
CHEN Yonglin (1999-), male, from Jingmen, Hubei; master's student; research interest: target detection. E-mail: 1768859718@qq.com

Corresponding author:
ZHANG Shang (1979-), male, from Yichang, Hubei; associate professor, Doctor of Engineering; research interests: Internet of Things technology and computer application technology. E-mail: 3011408157@qq.com

CLC number: TP391.4
Publication history
- Received: 2023-04-23
- Revised: 2023-05-23
- Published: 2024-12-19

Abstract:
Infrared images pose detection difficulties such as low signal-to-noise ratio, poor resolution, and heavy noise and clutter. To address these difficulties, we propose ITD-YOLO, a lightweight infrared image target detection algorithm based on YOLOv7. First, ITD-YOLO redesigns the network structure by readjusting the architecture of the feature extraction network and the feature fusion network. The deep layers of the original network, which correspond to the large receptive field, are pruned, and the preset anchor boxes are adjusted according to the feature-map outputs of the reconstructed network. The relationship between deep and shallow features in multi-scale feature fusion is changed so that the detail information extracted by the shallow layers carries more weight in the fusion, which improves detection of smaller targets. Second, PConv replaces the conventional convolution in the ELAN module to further reduce the computational cost of the model. Third, the model loss function is changed to PolyLoss to accelerate convergence and further improve detection performance. Finally, SIoU is used as the bounding-box loss function to improve localization accuracy. Experimental results show that ITD-YOLO effectively improves detection: on the FLIR and OSU datasets, its mean average precision (mAP) is 2.27% and 7.29% higher than that of YOLOv7s, respectively. The improved model is only 17.7 MB in size, and its computational cost is 37.11% lower. Compared with mainstream algorithms, ITD-YOLO improves to some degree on each metric and can meet the requirements of real-time infrared target detection.
Keywords:
- target detection
- model lightweighting
- YOLOv7
- PConv
- PolyLoss
- SIoU
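
The PolyLoss step named in the abstract can be illustrated with the Poly-1 form described in the PolyLoss paper, which adds one polynomial term, ε₁(1 − p_t), to cross-entropy. The following is a minimal pure-Python sketch, not the authors' implementation: the function name `poly1_cross_entropy`, the softmax formulation, the value `epsilon=1.0`, and the sample logits are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def poly1_cross_entropy(logits, target, epsilon=1.0):
    """Poly-1 loss: cross-entropy plus epsilon * (1 - p_t).

    epsilon=1.0 is an illustrative value, not one reported in the source.
    """
    p_t = softmax(logits)[target]      # predicted probability of the true class
    ce = -math.log(p_t)                # standard cross-entropy term
    return ce + epsilon * (1.0 - p_t)  # extra first-order polynomial term

logits = [2.0, 0.5, -1.0]  # hypothetical class scores
ce_only = poly1_cross_entropy(logits, target=0, epsilon=0.0)
poly1 = poly1_cross_entropy(logits, target=0, epsilon=1.0)
print(ce_only, poly1)
```

With `epsilon=0.0` the function reduces to plain cross-entropy, so the extra term can be switched on and tuned without changing the rest of a training loop.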

Figure 1. ELAN network structure

Figure 2. ELAN-W network structure

Figure 3. ITD-YOLO system architecture

Figure 4. Feature fusion network structure

Figure 5. Comparison of 3 types of convolution structure

Figure 6. Structure of the ELAN-P network

Figure 7. Detail of the ITD-YOLO algorithm structure

Figure 8. Comparison of detection results
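
To make the PConv substitution in ELAN-P more concrete, the sketch below compares the multiply-accumulate count of a regular convolution with that of a partial convolution, which convolves only a fraction of the channels and passes the rest through unchanged. It is a hedged illustration rather than the ITD-YOLO code: the 1/4 channel ratio follows the default in the FasterNet paper that introduced PConv, the 80×80×256 feature map is an arbitrary example, and the helper names are hypothetical.

```python
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-accumulates of a k x k convolution (stride 1, same padding, no bias)."""
    return h * w * k * k * c_in * c_out

def pconv_flops(h, w, k, c, ratio=0.25):
    """PConv convolves only the first c_p = ratio * c channels (c_p in, c_p out)."""
    c_p = int(c * ratio)
    return conv_flops(h, w, k, c_p, c_p)

def pconv_forward(channels, conv_fn, ratio=0.25):
    """Toy forward pass: apply conv_fn to the first c_p channels, keep the rest as-is."""
    c_p = int(len(channels) * ratio)
    return [conv_fn(ch) for ch in channels[:c_p]] + channels[c_p:]

h = w = 80   # example feature-map size (assumption)
c = 256      # example channel count (assumption)
regular = conv_flops(h, w, 3, c, c)
partial = pconv_flops(h, w, 3, c)
print(partial / regular)  # 0.0625, i.e. 1/16 of the regular convolution

# Only the first of four toy "channels" is transformed.
print(pconv_forward([1, 2, 3, 4], lambda x: x * 10))  # [10, 2, 3, 4]
```

At a ratio of 1/4, the convolved part costs (1/4)² = 1/16 of a full convolution over the same feature map.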

Table 1 Reconfiguration of the CSPDarkNet structure

Module	Parameters	Channel	Kernel size	Output
CBS	928	32	(3, 3)	640×640
CBS	18560	64	(3, 3)	320×320
CBS	36992	64	(3, 3)	320×320
CBS	73984	128	(3, 3)	160×160
ELAN-P	108800	256		80×80
MP-1	213760	256		80×80
ELAN-P	432640	512		80×80
MP-1	853504	512		40×40
ELAN-P	1725440	1024		40×40
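
The parameter counts of the four CBS rows in Table 1 can be reproduced with a short calculation. The sketch below assumes that CBS is a bias-free 3×3 convolution followed by batch normalization (two learnable parameters per output channel) and an activation with no parameters, and that the network input has 3 channels. These assumptions are not stated in the table; they are simply the ones under which the numbers match.

```python
def cbs_params(c_in, c_out, k=3):
    """Weights of a bias-free k x k conv plus BatchNorm scale and shift."""
    conv_weights = c_in * c_out * k * k
    bn_params = 2 * c_out
    return conv_weights + bn_params

# (input channels, output channels) for the first four CBS rows;
# the 3-channel input is an assumption.
layers = [(3, 32), (32, 64), (64, 64), (64, 128)]
counts = [cbs_params(c_in, c_out) for c_in, c_out in layers]
print(counts)  # [928, 18560, 36992, 73984]
```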

Table 2 Anchor box assignments

Feature map	40×40	80×80
Receptive field	Medium	Small
FLIR	(15, 16)	(23, 61)
	(14, 31)	(54, 44)
	(30, 26)	(102, 86)
OSU	(30, 40)	(33, 44)
	(36, 47)	(37, 50)
	(39, 52)	(42, 55)

Table 3 Ablation experiments

Algorithm	Reconstruction + PConv	PolyLoss	SIoU	FLIR P/%	FLIR R/%	FLIR mAP/%	OSU P/%	OSU R/%	OSU mAP/%	Volume/MB	Parameters	GFLOPs
YOLOv7s				86.10	84.12	89.97	89.49	91.01	88.22	71.3	37207344	105.1
A	√			86.54	85.30	90.39	91.46	91.45	92.03	17.7	9152256	66.10
B		√		87.33	85.89	90.94	92.52	92.35	92.75	71.3	37207344	105.1
C			√	86.62	84.68	90.52	91.43	92.06	91.82	71.3	37207344	105.1
D	√	√		86.69	85.29	90.74	94.65	90.53	93.55	17.7	9152256	66.10
E		√	√	87.21	86.36	91.23	95.41	91.27	94.28	71.3	37207344	105.1
ITD-YOLO	√	√	√	88.15	86.94	92.02	96.33	93.17	94.65	17.7	9152256	66.10
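
The headline figures in the abstract can be checked against the YOLOv7s and ITD-YOLO rows of Table 3. The sketch below computes relative changes from the tabulated values. Reading the abstract's percentages as relative (not absolute) differences is an inference from the numbers, and the dictionary keys and the helper `relative_change` are hypothetical names.

```python
# Values copied from the YOLOv7s and ITD-YOLO rows of Table 3.
baseline = {"flir_map": 89.97, "osu_map": 88.22, "gflops": 105.1, "params": 37207344}
itd_yolo = {"flir_map": 92.02, "osu_map": 94.65, "gflops": 66.10, "params": 9152256}

def relative_change(new, old):
    """Percentage change of new with respect to old."""
    return (new - old) / old * 100.0

flir_gain = relative_change(itd_yolo["flir_map"], baseline["flir_map"])
osu_gain = relative_change(itd_yolo["osu_map"], baseline["osu_map"])
gflops_change = relative_change(itd_yolo["gflops"], baseline["gflops"])
params_change = relative_change(itd_yolo["params"], baseline["params"])

print(f"FLIR mAP: {flir_gain:+.2f}%")      # +2.28%
print(f"OSU mAP: {osu_gain:+.2f}%")        # +7.29%
print(f"GFLOPs: {gflops_change:+.2f}%")    # -37.11%
print(f"Parameters: {params_change:+.2f}%")
```

The OSU and GFLOPs results agree with the 7.29% and 37.11% quoted in the abstract, and the FLIR result agrees with the quoted 2.27% to within rounding in the second decimal place.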

Table 4 Comparison of algorithms on the FLIR dataset

Algorithm	Volume/MB	mAP/%	Inference time/ms
YOLOv5s	13.7	66.67	5.8
YOLOv6s	38.7	86.25	10.9
YOLOv7s	71.3	89.97	13.3
YOLOv8s	22.5	88.74	5.7
ITD-YOLO	17.7	92.02	11.2

Table 5 Comparison of algorithms on the OSU dataset

Algorithm	Volume/MB	mAP/%	Inference time/ms
YOLOv5s	13.7	92.61	5.8
YOLOv6s	38.7	90.84	10.9
YOLOv7s	71.3	88.22	13.3
YOLOv8s	22.5	93.12	5.7
ITD-YOLO	17.7	94.65	11.2
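
The inference times in Tables 4 and 5 can be converted to throughput to put the real-time claim in context. This is a rough sketch under two assumptions not stated in the source: that each time is the latency for a single image, and that 30 frames per second is used as an illustrative real-time threshold.

```python
# Inference times in milliseconds, copied from Tables 4 and 5.
inference_ms = {
    "YOLOv5s": 5.8,
    "YOLOv6s": 10.9,
    "YOLOv7s": 13.3,
    "YOLOv8s": 5.7,
    "ITD-YOLO": 11.2,
}

REALTIME_FPS = 30.0  # assumed threshold, not taken from the source

# Frames per second, assuming one image per inference pass.
fps = {name: 1000.0 / ms for name, ms in inference_ms.items()}

for name, value in fps.items():
    status = "real-time" if value >= REALTIME_FPS else "below threshold"
    print(f"{name}: {value:.1f} FPS ({status})")
```

Under these assumptions ITD-YOLO runs at roughly 89 FPS, faster than YOLOv7s but slower than YOLOv5s and YOLOv8s.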
