Object Detection Algorithm Based on Infrared and Visible Light Images
Abstract: To address the shortcomings of existing visible-light-based object detection algorithms, an object detection method based on infrared and visible image fusion is proposed. The method combines depthwise separable convolutions with residual structures to build two parallel, high-efficiency feature extraction branches that extract object information from the infrared and visible images, respectively. An adaptive feature fusion module is then introduced to fuse the features of corresponding scales from the two branches through autonomous learning, so that the two types of image information complement each other. Finally, a feature pyramid structure fuses deep features with shallow features layer by layer, improving detection accuracy for objects of different scales. Experimental results show that the proposed network fully fuses the effective information in infrared and visible images and achieves object recognition and localization while maintaining both accuracy and efficiency. Moreover, in an actual substation equipment detection scenario, the network exhibits good robustness and generalization ability and completes the detection task efficiently.
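The efficiency claim above rests on depthwise separable convolution, which factors a standard k×k convolution into a per-channel (depthwise) k×k convolution followed by a 1×1 pointwise convolution. A minimal sketch of the resulting parameter-count saving; the layer shape used here (3×3, 128→256 channels) is illustrative and not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def dwsep_params(k, c_in, c_out):
    """Depthwise k x k (one filter per input channel) plus 1x1 pointwise."""
    return k * k * c_in + c_in * c_out

# Illustrative layer: 3x3 convolution mapping 128 -> 256 channels.
standard = conv_params(3, 128, 256)    # 294912
separable = dwsep_params(3, 128, 256)  # 33920
print(standard, separable, round(standard / separable, 1))  # 294912 33920 8.7
```

The roughly 8–9× reduction in parameters (and, correspondingly, in multiply–accumulate operations) is what makes the two parallel branches affordable.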
Key words:
- object detection
- infrared and visible light image
- deep learning
- adaptive fusion
Table 1. Feature extraction structure

Stage     Layer structure                 Repetitions   Output size
Original  RGB, 3                          1             448×448
Init      Conv 3×3; Max pooling 2×2       1             224×224
Stage1    DWconv 3×3, 32; Residual        1             112×112
Stage2    DWconv 3×3, 64; Residual        2             56×56
Stage3    DWconv 3×3, 128; Residual       4             28×28
Stage4    DWconv 3×3, 256; Residual       4             14×14
Stage5    DWconv 3×3, 512; Residual       2             7×7
Table 2. Comparison of visible-light network test results
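The output sizes in Table 1 follow from six successive stride-2 reductions of the 448×448 input: the Init max pooling plus one downsampling step in each of Stage1–Stage5. A quick check:

```python
# Walk the stride-2 steps of Table 1, halving the spatial size each time.
size = 448
sizes = []
for stage in ["Init", "Stage1", "Stage2", "Stage3", "Stage4", "Stage5"]:
    size //= 2
    sizes.append((stage, size))
print(sizes)
# [('Init', 224), ('Stage1', 112), ('Stage2', 56), ('Stage3', 28),
#  ('Stage4', 14), ('Stage5', 7)]
```

The 7×7 map from Stage5 is the deepest level, from which the feature pyramid fuses back toward the shallower, higher-resolution maps.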
Table 3. Test results of different structures

Network                FPS   mAP/%   mAPs/%   mAPm/%   mAPl/%
Infrared branch        94    61.0    45.3     64.2     71.8
Visible-light branch   93    67.3    48.5     70.1     77.6
Eltwise fusion         81    70.4    51.1     73.8     80.1
Concat fusion          79    71.6    52.3     74.2     81.6
This paper             78    73.8    54.3     75.1     83.2

Table 4. Comparison of test results of networks of the same type
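Table 3 contrasts three ways of merging the two branches' feature maps: element-wise addition, channel concatenation, and the paper's adaptive fusion. A minimal NumPy sketch of the three schemes follows; note the adaptive weights are reduced here to fixed scalars normalized by softmax, which is an illustrative assumption — in the paper's module the fusion weights are learned during training:

```python
import numpy as np

def eltwise_fusion(a, b):
    """Element-wise sum of two same-shape feature maps (C x H x W)."""
    return a + b

def concat_fusion(a, b):
    """Concatenation along the channel axis (axis 0 here)."""
    return np.concatenate([a, b], axis=0)

def adaptive_fusion(a, b, w):
    """Weighted sum with softmax-normalized weights.
    Fixed scalar weights stand in for the paper's learned weights."""
    e = np.exp(w - np.max(w))
    w = e / e.sum()
    return w[0] * a + w[1] * b

ir = np.ones((64, 14, 14))       # hypothetical infrared feature map
vis = 3 * np.ones((64, 14, 14))  # hypothetical visible-light feature map

print(eltwise_fusion(ir, vis).shape)   # (64, 14, 14)
print(concat_fusion(ir, vis).shape)    # (128, 14, 14), channels doubled
print(adaptive_fusion(ir, vis, np.array([0.0, 0.0]))[0, 0, 0])  # 2.0
```

Concatenation preserves both branches but doubles the channel count of every subsequent layer, while element-wise addition keeps the channel count but weights both modalities equally; the adaptive scheme keeps the cheap output shape of addition while letting the network decide, per scale, how much to trust each modality — consistent with its higher mAP at a similar FPS in Table 3.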