CHEN Yanlin, WANG Zhishe, SHAO Wenyu, YANG Fan, SUN Jing. Multi-scale Transformer Fusion Method for Infrared and Visible Images[J]. Infrared Technology, 2023, 45(3): 266-275.

Multi-scale Transformer Fusion Method for Infrared and Visible Images

More Information
  • Received Date: August 22, 2022
  • Revised Date: September 12, 2022
  • Abstract: Mainstream deep learning fusion methods use convolution to extract local image features. However, the interaction between an image and a convolution kernel is content-independent, so long-range dependencies cannot be modeled well. As a result, important contextual information may be lost, which limits the fusion performance for infrared and visible images. To address this, we present a simple and effective fusion network for infrared and visible images, the multi-scale Transformer fusion method (MsTFusion). First, we design a novel Conv Swin Transformer block to model long-range dependencies, in which a convolutional layer improves the representational ability of the global features. Second, we construct a multi-scale self-attention encoder-decoder network that extracts and reconstructs global features without the help of local features. Third, we design a learnable fusion layer for feature sequences that uses softmax operations to compute attention weights over the feature sequences and highlight the salient features of the source images. The proposed method is an end-to-end, fully attentional model in which image content interacts with the attention weights. We conducted a series of experiments on the TNO and RoadScene datasets; the results show that MsTFusion outperforms the compared methods in both subjective visual assessment and objective metrics. By integrating the self-attention mechanism, the method models long-range dependencies for global feature extraction and reconstruction, which overcomes this limitation of convolution-based deep learning models. Compared with other state-of-the-art traditional and deep learning methods, MsTFusion achieves strong fusion performance with good generalization ability and competitive computational efficiency.
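The softmax-weighted fusion of feature sequences described in the abstract can be illustrated with a minimal sketch. The abstract does not say how token activity is measured or which parameters the layer learns, so everything specific here is an assumption: the function names `softmax_pair` and `fuse_sequences` are hypothetical, the mean absolute value is used as a stand-in activity measure, and a scalar `temperature` stands in for a learnable parameter. It is not the authors' implementation.

```python
import math


def softmax_pair(a, b, temperature=1.0):
    """Numerically stable softmax over two scores; returns two weights summing to 1."""
    a, b = a / temperature, b / temperature
    m = max(a, b)  # subtract the max so exp() cannot overflow
    ea, eb = math.exp(a - m), math.exp(b - m)
    total = ea + eb
    return ea / total, eb / total


def fuse_sequences(ir_tokens, vis_tokens, temperature=1.0):
    """Fuse two equally shaped feature sequences token by token.

    Each sequence is a list of tokens, each token a list of floats.
    ASSUMPTION: token activity is the mean absolute value of its channels;
    `temperature` is a hypothetical stand-in for a learned parameter.
    """
    fused = []
    for ir, vis in zip(ir_tokens, vis_tokens):
        act_ir = sum(abs(x) for x in ir) / len(ir)
        act_vis = sum(abs(x) for x in vis) / len(vis)
        w_ir, w_vis = softmax_pair(act_ir, act_vis, temperature)
        # Weighted sum: the more active (salient) source dominates the fused token.
        fused.append([w_ir * x + w_vis * y for x, y in zip(ir, vis)])
    return fused


# Usage: two sequences of two tokens with two channels each (made-up values).
ir_seq = [[4.0, 4.0], [0.1, 0.2]]
vis_seq = [[0.0, 0.0], [0.3, 0.1]]
fused_seq = fuse_sequences(ir_seq, vis_seq)
```

Because the two weights always sum to 1, each fused value lies between the corresponding infrared and visible values, and the source with the stronger response at a given token contributes more.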
