
A Visible and Infrared Image Fusion Method based on Generative Adversarial Networks and Attention Mechanism

LUO Di, WANG Congqing, ZHOU Yongjun

Citation: 罗迪, 王从庆, 周勇军. 一种基于生成对抗网络与注意力机制的可见光和红外图像融合方法[J]. 红外技术, 2021, 43(6): 566-574.
Citation: LUO Di, WANG Congqing, ZHOU Yongjun. A Visible and Infrared Image Fusion Method based on Generative Adversarial Networks and Attention Mechanism[J]. Infrared Technology, 2021, 43(6): 566-574.

Funding:

Fund of the Key Laboratory of Near-Ground Detection Technology (TCGZ2019A006)

Details
    About the authors:

    LUO Di (1995-), male, M.S. candidate. Research interests: deep learning and UAV object detection. E-mail: 1366701808@qq.com

    Corresponding author:

    ZHOU Yongjun (1972-), male, senior engineer. Research interests: near-ground target detection technology. E-mail: 478992155@qq.com

  • CLC number: TN753


  • Abstract: To address the difficulty of recognizing targets in low-illumination visible images, this paper proposes a new visible and infrared image fusion method based on generative adversarial networks, which can be applied directly to three-channel RGB visible images and single-channel infrared images. In the GAN, the generator adopts a U-Net structure with encoder and decoder layers, the discriminator is a Markovian discriminator, and an attention module is introduced so that the fused image attends more strongly to the high-intensity information in the infrared image. Experimental results show that the method preserves the detail and texture of the visible image while injecting the salient target information of the infrared image, producing fused images with good visual quality and high target recognizability, and it performs well on several objective metrics such as information entropy and structural similarity.
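The abstract states that the method fuses an RGB visible image with a single-channel infrared image directly. The generator's 4-channel input size in Table 1 (480×640×4) is consistent with simple channel-wise concatenation; the sketch below illustrates that assumed pre-processing step (the function name is hypothetical, not from the paper):

```python
import numpy as np

def make_generator_input(visible_rgb, infrared):
    """Stack an HxWx3 visible image and an HxW(x1) infrared image into HxWx4."""
    if infrared.ndim == 2:                    # allow a plain HxW array
        infrared = infrared[..., np.newaxis]  # -> HxWx1
    assert visible_rgb.shape[:2] == infrared.shape[:2]
    return np.concatenate([visible_rgb, infrared], axis=-1)

vis = np.zeros((480, 640, 3), dtype=np.float32)
ir = np.zeros((480, 640), dtype=np.float32)
x = make_generator_input(vis, ir)
print(x.shape)  # (480, 640, 4)
```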
  • Figure 1. The network structure of the generator

    Figure 2. The network structure of the CBAM
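The CBAM module combines channel attention (a shared MLP over globally average- and max-pooled descriptors) with spatial attention, following the standard design of Woo et al. A minimal numpy sketch, with the spatial branch's 7×7 convolution replaced by a simple average purely for brevity (all weights here are random placeholders, not the paper's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    # x: HxWxC feature map; w1: (C, C/r) and w2: (C/r, C) shared MLP weights.
    avg = x.mean(axis=(0, 1))                            # (C,)
    mx = x.max(axis=(0, 1))                              # (C,)
    gate = sigmoid(np.maximum(avg @ w1, 0) @ w2 +
                   np.maximum(mx @ w1, 0) @ w2)          # (C,) channel gate
    return x * gate                                      # broadcast over H, W

def spatial_attention(x):
    avg = x.mean(axis=-1, keepdims=True)                 # HxWx1
    mx = x.max(axis=-1, keepdims=True)                   # HxWx1
    return x * sigmoid((avg + mx) / 2.0)                 # placeholder for 7x7 conv

rng = np.random.default_rng(0)
x = rng.standard_normal((240, 320, 32))                  # CBAM feature size, Table 1
w1 = rng.standard_normal((32, 4)) * 0.1                  # reduction ratio r = 8
w2 = rng.standard_normal((4, 32)) * 0.1
y = spatial_attention(channel_attention(x, w1, w2))
print(y.shape)  # (240, 320, 32)
```

Both gates only rescale features, so the output keeps the input's shape, which matches the CBAM row of Table 1 (240×320×32 in and out).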

    Figure 3. The network structure of the discriminator

    Figure 4. Comparison of fused images

    Table 1. The parameters of the generator

    Convolution layer Kernel size/stride Padding Input size Output size
    Conv1 4×4/2 (1, 1) 480×640×4 240×320×32
    CBAM 4×4/2 (1, 1) 240×320×32 240×320×32
    Conv2 4×4/2 (1, 1) 240×320×32 120×160×64
    Conv3 4×4/2 (1, 1) 120×160×64 60×80×128
    Conv4 4×4/2 (1, 1) 60×80×128 30×40×256
    Conv5 4×4/2 (2, 1) 30×40×256 16×20×512
    Conv6 4×4/2 (1, 1) 16×20×512 8×10×512
    Conv7 4×4/2 (1, 2) 8×10×512 4×6×512
    Conv8 4×4/2 (1, 1) 4×6×512 2×3×512
    ConvTrans8 4×4/2 (1, 1) 2×3×512 4×6×512
    ConvTrans7 4×4/2 (1, 2) 4×6×1024 8×10×512
    ConvTrans6 4×4/2 (1, 1) 8×10×1024 16×20×512
    ConvTrans5 4×4/2 (2, 1) 16×20×1024 30×40×256
    ConvTrans4 4×4/2 (1, 1) 30×40×512 60×80×128
    ConvTrans3 4×4/2 (1, 1) 60×80×256 120×160×64
    ConvTrans2 4×4/2 (1, 1) 120×160×128 240×320×32
    ConvTrans1 4×4/2 (1, 1) 240×320×64 480×640×3
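The spatial sizes in Table 1 follow the standard convolution output formula out = ⌊(n + 2·pad − kernel) / stride⌋ + 1, with each row's padding pair applying to (height, width). A quick spot-check of a few rows:

```python
def conv_out(n, kernel=4, stride=2, pad=1):
    """Output size of a 1-D convolution: floor((n + 2*pad - kernel) / stride) + 1."""
    return (n + 2 * pad - kernel) // stride + 1

assert conv_out(480) == 240 and conv_out(640) == 320            # Conv1
assert conv_out(30, pad=2) == 16 and conv_out(40, pad=1) == 20  # Conv5
assert conv_out(8, pad=1) == 4 and conv_out(10, pad=2) == 6     # Conv7
print("Table 1 spatial sizes check out")
```

The doubled input channels of the ConvTrans rows (e.g. 1024 = 512 + 512) reflect the U-Net skip connections concatenating encoder features with the decoder stream.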

    Table 2. The parameters of the discriminator

    Convolution layer Kernel size/stride Padding Output size
    Conv1 4×4/2 (1, 1) 240×320×64
    Conv2 4×4/2 (1, 1) 120×160×128
    Conv3 4×4/2 (1, 1) 60×80×256
    Conv4 4×4/2 (1, 1) 30×40×512
    Conv5 4×4/2 (1, 1) 15×20×512
    Conv6 1×1/1 (0, 0) 15×20×1
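The final 1×1 convolution in Table 2 leaves a 15×20×1 map of per-patch real/fake scores, the hallmark of a Markovian (PatchGAN) discriminator: each score judges only its own receptive field. A least-squares patch loss is one common pairing (an assumption here; the text only states that a Markovian discriminator is used):

```python
import numpy as np

def patch_d_loss(d_real, d_fake):
    """Push real-pair patch scores toward 1 and generated-pair scores toward 0."""
    return 0.5 * (np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2))

real_scores = np.ones((15, 20, 1))   # a perfect discriminator on real inputs
fake_scores = np.zeros((15, 20, 1))  # ...and on generated inputs
print(patch_d_loss(real_scores, fake_scores))  # 0.0
```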

    Table 3. Quantitative comparison of fused images

    Fusion methods EN MI FMI SSIM CC PSNR
    LP 5.918 11.836 0.944 0.681 0.646 68.496
    LP-SR 6.393 12.785 0.945 0.823 0.566 67.801
    NSCT 5.821 11.643 0.942 0.671 0.652 68.575
    NSCT-SR 6.224 12.447 0.940 0.859 0.575 67.472
    DTCWT 5.804 11.608 0.942 0.670 0.647 68.570
    DTCWT-SR 6.455 12.910 0.945 0.782 0.525 67.338
    DenseFuse 6.036 12.071 0.939 0.631 0.684 67.319
    CBAM-GAN 5.918 11.836 0.928 0.796 0.649 68.751
    Average 6.111 12.223 0.941 0.740 0.606 67.967
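Information entropy (EN), the first metric in Table 3, is conventionally the Shannon entropy of the fused image's gray-level histogram; a minimal sketch under that standard definition (the paper's exact implementation is not given):

```python
import numpy as np

def image_entropy(img, levels=256):
    """EN in bits for an 8-bit grayscale image: Shannon entropy of its histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]                        # drop empty bins (0 * log 0 := 0)
    return float(-(p * np.log2(p)).sum())

flat = np.full((64, 64), 128, dtype=np.uint8)       # constant image -> 0 bits
varied = np.arange(256, dtype=np.uint8).repeat(16)  # uniform histogram -> 8 bits
print(image_entropy(flat), image_entropy(varied))   # 0.0 8.0
```

Higher EN indicates more gray-level diversity, i.e. more information retained in the fused image, which is why the Table 3 values around 6 bits sit well below the 8-bit maximum.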
Publication history
  • Received: 2020-09-08
  • Revised: 2020-10-12
  • Published: 2021-06-20
