
Object Detection in Visible Light and Infrared Images Based on Adaptive Attention Mechanism

ZHAO Songpu, YANG Liping, ZHAO Xin, PENG Zhiyuan, LIANG Dongxing, LIANG Hongjun

Citation: ZHAO Songpu, YANG Liping, ZHAO Xin, PENG Zhiyuan, LIANG Dongxing, LIANG Hongjun. Object Detection in Visible Light and Infrared Images Based on Adaptive Attention Mechanism[J]. Infrared Technology, 2024, 46(4): 443-451.

Funding:

Shenzhen Science and Technology Program JSGG20210802153009029

Details
    Author biography:

    ZHAO Songpu (1973-), male, Han nationality, born in Xi'an, Shaanxi; M.S., engineer. Research interests: robotics, smart grid, pattern recognition. E-mail: 1419446206@qq.com

  • CLC number: TP391.41

Object Detection in Visible Light and Infrared Images Based on Adaptive Attention Mechanism

  • Abstract: To address the shortcomings of existing infrared and visible light object detection methods, this work combines deep learning with multi-source object detection and proposes a detection method based on an adaptive attention mechanism. First, a dual-source feature extraction structure built around depthwise separable convolutions extracts infrared and visible light target features separately. Second, to fully exploit complementary multimodal information, an adaptive attention mechanism is designed that fuses infrared and visible light features with data-driven weights, ensuring thorough feature fusion while suppressing noise. Finally, for multi-scale object detection, the adaptive attention mechanism is combined with multi-scale parameters to extract and fuse global and local target features, improving scale invariance. Experiments show that, compared with object detection algorithms of the same type, the proposed method recognizes and localizes targets accurately and efficiently in complex scenes; in practical substation equipment detection it also exhibits better generalization and robustness, effectively assisting robots in object detection tasks.
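The abstract describes the core of the AAM as data-driven weighted fusion of the infrared and visible branches. The paper's exact sub-network is given in Figure 3; a minimal sketch of the idea (per-channel descriptors from global average pooling drive softmax-normalized fusion weights; the projection matrices `w_ir`/`w_vis` are hypothetical stand-ins for the learned attention layers) might look like:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aam_fuse(f_ir, f_vis, w_ir, w_vis):
    """Data-driven weighted fusion of infrared/visible feature maps.

    f_ir, f_vis: (C, H, W) feature maps from the two branches.
    w_ir, w_vis: (C, C) learned projections (hypothetical stand-ins
                 for the attention sub-network in the paper's Fig. 3).
    Returns the fused (C, H, W) feature map.
    """
    # Global average pooling -> one descriptor per channel
    d_ir = f_ir.mean(axis=(1, 2))           # (C,)
    d_vis = f_vis.mean(axis=(1, 2))         # (C,)
    # Per-channel modality scores, normalized so the two weights sum to 1
    scores = np.stack([w_ir @ d_ir, w_vis @ d_vis], axis=-1)  # (C, 2)
    a = softmax(scores, axis=-1)                              # (C, 2)
    # Convex combination per channel: noisy channels can be down-weighted
    return a[:, 0, None, None] * f_ir + a[:, 1, None, None] * f_vis

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
fused = aam_fuse(rng.normal(size=(C, H, W)), rng.normal(size=(C, H, W)),
                 rng.normal(size=(C, C)), rng.normal(size=(C, C)))
print(fused.shape)  # (8, 4, 4)
```

Because the weights are a softmax over the two modalities, each fused channel is a convex combination of its infrared and visible counterparts, which is one way to realize the "sufficient fusion while limiting noise" behavior the abstract claims.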
  • Figure 1. Overall framework of infrared-visible light object detection

    Figure 2. Feature extraction modules

    Figure 3. AAM feature fusion

    Figure 4. AAM multiscale detection

    Figure 5. Comparison of single-source and dual-source network detection results

    Figure 6. Comparison of Block 3 feature fusion results

    Figure 7. Comparison of multiscale fusion results

    Figure 8. Comparison of infrared-visible network detection results (first three rows: RGBT210; last two rows: KAIST)

    Figure 9. Comparison of substation equipment detection results

    Table 1. Feature extraction branch

    Module   Layer               Repetitions  Output
    Input    RGB, 3              1            512×448
    Init     Conv 3×3, 10        1            256×224
             DWconv 3×3, 3
             Max pooling 2×2, 3
    Block 1  DWconv 3×3, 32      1            128×112
             Residual
    Block 2  DWconv 3×3, 64      2            64×56
             Residual
    Block 3  DWconv 3×3, 128     3            32×28
             Residual
    Block 4  DWconv 3×3, 256     3            16×14
             Residual
    Block 5  DWconv 3×3, 512     2            8×7
             Residual
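Table 1 builds each branch from DWconv (depthwise separable convolution) blocks. A minimal sketch of what one such layer computes, with stride 1, valid padding, and hypothetical shapes (the paper does not give padding or stride details at this granularity):

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise separable convolution (valid padding, stride 1).

    x:          (C_in, H, W) input feature map
    dw_kernels: (C_in, k, k) one spatial kernel per input channel
    pw_weights: (C_out, C_in) 1x1 pointwise channel-mixing weights
    """
    c_in, h, w = x.shape
    k = dw_kernels.shape[1]
    oh, ow = h - k + 1, w - k + 1
    # Depthwise step: each channel is filtered independently
    dw = np.empty((c_in, oh, ow))
    for c in range(c_in):
        for i in range(oh):
            for j in range(ow):
                dw[c, i, j] = np.sum(x[c, i:i + k, j:j + k] * dw_kernels[c])
    # Pointwise step: 1x1 convolution mixes channels
    return np.tensordot(pw_weights, dw, axes=([1], [0]))  # (C_out, oh, ow)

rng = np.random.default_rng(1)
y = depthwise_separable_conv(rng.normal(size=(3, 8, 8)),
                             rng.normal(size=(3, 3, 3)),
                             rng.normal(size=(32, 3)))
print(y.shape)  # (32, 6, 6)
```

The factorization is why the branch is lightweight: a k×k depthwise kernel plus a 1×1 pointwise kernel costs roughly k²·C_in + C_in·C_out parameters, versus k²·C_in·C_out for a standard convolution.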

    Table 2. Network training hyperparameters and strategy

    Parameter        Value
    Batch_Size       4
    Base_Lr          0.01
    Momentum         0.95
    Weight_Decay     0.0005
    Learning         Step
    Optimization     Adam
    Loss function    Cross entropy
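Table 2 specifies a step learning-rate schedule with Base_Lr = 0.01 but does not give the decay interval or factor; a minimal sketch with hypothetical `step_size` and `gamma` values:

```python
def step_lr(base_lr, epoch, step_size=30, gamma=0.1):
    """Step decay: multiply the base LR by gamma every step_size epochs.

    step_size and gamma are hypothetical; the paper only states a step
    schedule with Base_Lr = 0.01 (Table 2).
    """
    return base_lr * (gamma ** (epoch // step_size))

print(step_lr(0.01, 0))   # 0.01
print(step_lr(0.01, 30))  # one decay step later: 10x smaller
```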

    Table 3. Single-source network test comparison

    Network        FPS    Accuracy/(%)
                          mAP    mAPs   mAPm   mAPl
    YOLO[14]       67     70.6   49.8   73.8   81.1
    MobileNet[15]  107    69.7   48.7   72.2   79.5
    Ours           121    69.3   48.1   71.9   79.3

    Table 4. Comparison of dual-source feature fusion

    Network          FPS    Accuracy/(%)
                            mAP    mAPs   mAPm   mAPl
    Infrared branch  120    62.1   43.8   65.9   72.8
    Visible branch   121    69.3   48.1   71.9   79.3
    SE fusion        89     71.2   51.5   74.7   82.4
    CBAM fusion      87     72.0   52.4   75.2   83.1
    AAM fusion       86     72.6   53.8   75.9   83.6

    Table 5. Multiscale structure comparison

    Network             FPS    Accuracy/(%)
                               mAP    mAPs   mAPm   mAPl
    Pyramid multiscale  86     72.6   53.8   75.9   83.6
    AAM multiscale      84     73.5   54.9   76.4   84.0

    Table 6. Test comparison of similar methods

    Network         FPS    Accuracy/(%)
                           mAP    mAPs   mAPm   mAPl
    Reference [6]   105    67.4   48.3   70.5   77.8
    Reference [10]  79     71.3   51.0   73.6   81.2
    Reference [24]  73     72.9   53.3   76.1   83.9
    Ours            84     73.5   54.9   76.4   84.0

    Table 7. Test comparison on the KAIST dataset

    Network         FPS    Accuracy/(%)
                           mAP    mAPs   mAPm   mAPl
    Reference [6]   107    72.8   50.3   77.6   85.3
    Reference [10]  80     77.5   53.1   81.2   88.1
    Reference [24]  74     78.6   54.7   73.1   89.9
    Ours            85     79.0   55.9   73.4   90.3

    Table 8. Test comparison of substation equipment detection

    Network         FPS    Accuracy/(%)
                           mAP    mAPs   mAPm   mAPl
    Reference [6]   26     84.2   64.3   86.7   92.1
    Reference [10]  18     86.3   66.5   88.2   95.3
    Reference [24]  15     88.0   68.2   90.1   96.7
    Ours            20     88.5   69.1   90.3   97.0
  • [1] WANG Can, BU Leping. Overview of target detection algorithms based on convolutional neural networks[J]. Naval Electronic Engineering, 2021, 41(9): 161-169. https://www.cnki.com.cn/Article/CJFDTOTAL-JCGC202109036.htm
    [2] HAO Yongping, CAO Zhaorui, BAI Fan, et al. Research on infrared-visible image fusion and target recognition algorithm based on region of interest mask convolutional neural network[J]. Acta Photonica Sinica, 2021, 50(2): 84-98. https://www.cnki.com.cn/Article/CJFDTOTAL-GZXB202102010.htm
    [3] LIU Qi, WANG Maojun, GAO Qiang, et al. Electrical equipment fault detection based on infrared imaging technology[J]. Electric Measurement and Instrumentation, 2019, 56(10): 122-126. https://www.cnki.com.cn/Article/CJFDTOTAL-DCYQ201910020.htm
    [4] XIA J, LU Y, TAN L, et al. Intelligent fusion of infrared and visible image data based on convolutional sparse representation and improved pulse-coupled neural network[J]. Computers, Materials and Continua, 2021, 67(1): 613-624. doi: 10.32604/cmc.2021.013457
    [5] WANG Yong, ZHANG Ying, LIAO Ruchao, et al. UAV image fusion method based on visible light, thermal infrared and lidar sensing[J]. Laser Journal, 2020, 41(2): 141-145. https://www.cnki.com.cn/Article/CJFDTOTAL-JGZZ202002029.htm
    [6] ZHANG S, LI X, ZHANG X, et al. Infrared and visible image fusion based on saliency detection and two-scale transform decomposition[J]. Infrared Physics & Technology, 2021, 114(3): 103626.
    [7] WANG Chuanyang. Research on Power Equipment Recognition Based on Infrared and Visible Images[D]. Beijing: North China Electric Power University, 2017.
    [8] LI H, WU X J. Infrared and visible image fusion using latent low-rank representation[J]. arXiv preprint arXiv:1804.08992, 2018.
    [9] HUI L, WU X J. DenseFuse: a fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2018, 28(5): 2614-2623.
    [10] TANG Cong, LING Yongshun, YANG Hua, et al. Decision-level fusion tracking of infrared and visible light based on deep learning[J]. Advances in Lasers and Optoelectronics, 2019, 56(7): 209-216. https://www.cnki.com.cn/Article/CJFDTOTAL-JGDJ201907023.htm
    [11] MA J, TANG L, XU M, et al. STDFusionNet: an infrared and visible image fusion network based on salient object detection[J]. IEEE Transactions on Instrumentation and Measurement, 2021, 70: 1-13.
    [12] YANG Xuehe, LIU Huanxi, XIAO Jianli. A review of multimodal biometric feature extraction and correlation evaluation[J]. Chinese Journal of Image and Graphics, 2020, 25(8): 1529-1538. https://www.cnki.com.cn/Article/CJFDTOTAL-ZGTB202008002.htm
    [13] WANG Z, XIN Z, HUANG X, et al. Overview of SAR image feature extraction and object recognition[J]. Springer, 2021, 234(4): 69-75.
    [14] WEI Z. A summary of research and application of deep learning[J]. International Core Journal of Engineering, 2019, 5(9): 167-169.
    [15] Bochkovskiy A, WANG C Y, LIAO H. YOLOv4: optimal speed and accuracy of object detection[J]. arXiv preprint arXiv:2004.10934, 2020.
    [16] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778.
    [17] Howard A, Sandler M, Chen B, et al. Searching for MobileNetV3[C]// IEEE International Conference on Computer Vision (ICCV), 2019: 1314-1324.
    [18] CHEN H, WANG Y, XU C, et al. AdderNet: do we really need multiplications in deep learning?[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 1465-1474.
    [19] SONG Penghan, XIN Huaisheng, LIU Nannan. Multi-source feature fusion recognition of marine ship targets based on deep learning[J]. Journal of the Chinese Academy of Electronic Sciences, 2021, 16(2): 127-133. https://www.cnki.com.cn/Article/CJFDTOTAL-KJPL202102004.htm
    [20] Hassan E. Multiple object tracking using feature fusion in hierarchical LSTMs[J]. The Journal of Engineering, 2020(10): 893-899.
    [21] LIN T Y, Dollar P, Girshick R, et al. Feature pyramid networks for object detection[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 936-944.
    [22] LIU S, HUANG D, WANG Y. Learning spatial fusion for single-shot object detection[J]. arXiv preprint arXiv:1911.09516, 2019.
    [23] LI C, ZHAO N, LU Y, et al. Weighted sparse representation regularized graph learning for RGB-T object tracking[C]// ACM International Conference on Multimedia, 2017: 1856-1864.
    [24] XIAO X, WANG B, MIAO L, et al. Infrared and visible image object detection via focused feature enhancement and cascaded semantic extension[J]. Remote Sensing, 2021, 13(13): 2538. doi: 10.3390/rs13132538
Publication history
  • Received: 2022-08-30
  • Revised: 2022-09-28
  • Published: 2024-04-20