LI Qiuheng, DENG Hao, LIU Guihua, PANG Zhongxiang, TANG Xue, ZHAO Junqin, LU Mengyuan. Infrared and Visible Images Fusion Method Based on Multi-Scale Features and Multi-head Attention[J]. Infrared Technology, 2024, 46(7): 765-774.

Infrared and Visible Images Fusion Method Based on Multi-Scale Features and Multi-head Attention

More Information
  • Received Date: August 23, 2023
  • Revised Date: September 19, 2023
  • Available Online: July 24, 2024
  • To address detail loss and the imbalance between visible detail features and infrared (IR) target features in fused infrared and visible images, this study proposes a fusion method combining multiscale feature fusion with efficient multi-head self-attention (EMSA). The method comprises three key steps. 1) Multiscale coding network: a multiscale coding network extracts multilevel features, enhancing the descriptive capability of the scene. 2) Fusion strategy: transformer-based EMSA is combined with dense residual blocks to balance local details against the overall structure during fusion. 3) Nested-connection decoding network: the multilevel fusion maps are fed into a decoding network built on nested connections to reconstruct a fused result with prominent IR targets and rich scene details. Extensive experiments on the public TNO and M3FD datasets demonstrate the efficacy of the proposed method, which achieves superior results in both quantitative metrics and visual comparisons. In particular, it excels in downstream target detection tasks, demonstrating state-of-the-art performance. The approach enhances fusion quality by effectively preserving detailed information and balancing visible and IR features, establishing a strong benchmark for infrared and visible image fusion.
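The EMSA in step 2 follows the channel-wise ("transposed") attention popularized by Restormer [13]: the attention map has size C×C per head rather than (HW)×(HW), so its cost grows linearly with the number of pixels. Below is a minimal NumPy sketch of this idea; the identity Q/K/V projections stand in for the learned 1×1 and depth-wise convolutions of the real network, and all names are illustrative, not the authors' code:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def efficient_msa(feat, num_heads=4):
    """Channel-wise ("transposed") multi-head self-attention sketch.

    Standard transformer attention over H*W tokens costs O((HW)^2);
    here attention is computed *across channels*, so each head only
    builds a (C/heads x C/heads) map. feat: (C, H, W) feature map,
    with C divisible by num_heads.
    """
    C, H, W = feat.shape
    d = C // num_heads
    x = feat.reshape(num_heads, d, H * W)          # split channels into heads
    q = k = v = x                                  # identity projections (sketch only)
    q = q / np.linalg.norm(q, axis=-1, keepdims=True)  # L2-normalize, as in Restormer
    k = k / np.linalg.norm(k, axis=-1, keepdims=True)
    attn = softmax(q @ k.transpose(0, 2, 1), axis=-1)  # (heads, d, d) channel affinity
    out = attn @ v                                  # re-weight channel responses
    return out.reshape(C, H, W)

ir_feat = np.random.rand(16, 32, 32)   # toy single-branch feature map
fused = efficient_msa(ir_feat)
print(fused.shape)                     # → (16, 32, 32)
```

In the full method this block would operate on encoder features of both modalities before the dense residual blocks; the sketch only shows why the channel-wise formulation stays affordable at full image resolution.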

  • [1]
    王天元, 罗晓清, 张战成. 自注意力引导的红外与可见光图像融合算法[J]. 红外技术, 2023, 45(2): 171-177. http://hwjs.nvir.cn/cn/article/id/09b45ee5-6ebc-4222-a4ec-11b5142482fe

    WANG T Y, LUO X Q, ZHANG Z C. Self-attention guided fusion algorithm for infrared and visible images[J]. Infrared Technology, 2023, 45(2): 171-177. http://hwjs.nvir.cn/cn/article/id/09b45ee5-6ebc-4222-a4ec-11b5142482fe
    [2]
    KUMAR B K S. Multifocus and multispectral image fusion based on pixel significance using discrete cosine harmonic wavelet transform[J]. Signal, Image and Video Processing, 2013, 7(6): 1125-1143.
    [3]
    KUMAR B K S. Image fusion based on pixel significance using cross-bilateral filter[J]. Signal, Image and Video Processing, 2015, 9(5): 1193-1204.
    [4]
    LI H, QIU H, YU Z, et al. Infrared and visible image fusion scheme based on NSCT and low-level visual features[J]. Infrared Physics & Technology, 2016, 76: 174-184.
    [5]
    MA J Y, YU W, LIANG P W, et al. FusionGAN: a generative adversarial network for infrared and visible image fusion[J]. Information Fusion, 2019, 48: 11-26. DOI: 10.1016/j.inffus.2018.09.004
    [6]
    Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507. DOI: 10.1126/science.1127647
    [7]
    LI H, WU X J. DenseFuse: a fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2018, 28(5): 2614-2623.
    [8]
    HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 2261-2269.
    [9]
    LI H, WU X J, Kittler J. RFN-Nest: An end-to-end residual fusion network for infrared and visible images[J]. Information Fusion, 2021, 73: 72-86. DOI: 10.1016/j.inffus.2021.02.023
    [10]
    Vibashan V S, Valanarasu J, Oza P, et al. Image fusion transformer[J/OL]. arXiv preprint arXiv:2107.09011, 2021. https://ieeexplore.ieee.org/document/9897280.
    [11]
    LI H, WU X J, Durrani T. NestFuse: an infrared and visible image fusion architecture based on nest connection and spatial/channel attention models[J]. IEEE Transactions on Instrumentation and Measurement, 2020, 69(12): 9645-9656. DOI: 10.1109/TIM.2020.3005230
    [12]
    黄玲琳, 李强, 路锦正, 等. 基于多尺度和注意力模型的红外与可见光图像融合[J]. 红外技术, 2023, 45(2): 143-149. http://hwjs.nvir.cn/cn/article/id/10e9d4ea-fb05-43a5-817a-bcad09f693b8

    HUANG L L, LI Q, LU J Z, et al. Infrared and visible image fusion based on multi-scale and attention modeling[J]. Infrared Technology, 2023, 45(2): 143-149. http://hwjs.nvir.cn/cn/article/id/10e9d4ea-fb05-43a5-817a-bcad09f693b8
    [13]
    Zamir S W, Arora A, Khan S, et al. Restormer: efficient transformer for high-resolution image restoration[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 5718-5729.
    [14]
    WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 11531-11539.
    [15]
    LIN T Y, Maire M, Belongie S, et al. Microsoft COCO: common objects in context[C]//European Conference on Computer Vision (ECCV), 2014: 740-755.
    [16]
    HWANG S, Park J, Kim N, et al. Multispectral pedestrian detection: benchmark dataset and baseline[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015: 1037-1045.
    [17]
    TOET A. The TNO multiband image data collection[J]. Data in Brief, 2017, 15: 249-251. DOI: 10.1016/j.dib.2017.09.038
    [18]
    LIU J, FAN X, HUANG Z B, et al. Target-aware dual adversarial learning and a multi-scenario multi-modality benchmark to fuse infrared and visible for object detection[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 5792-5801.
    [19]
    XU H, MA J Y, JIANG J J, et al. U2Fusion: a unified unsupervised image fusion network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(1): 502-518. DOI: 10.1109/TPAMI.2020.3012548
    [20]
    WANG C Y, Bochkovskiy A, LIAO H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 7464-7475. DOI: 10.1109/CVPR52729.2023.00721