Infrared and Visible Image Fusion Using Double Attention Generative Adversarial Networks
Abstract: Most GAN-based infrared and visible image fusion methods apply an attention mechanism only in the generator, leaving the discrimination stage without attention-aware capability. To address this, an infrared and visible image fusion method based on double attention generative adversarial networks (DAGAN) is proposed. DAGAN introduces a multi-scale attention module that combines spatial attention and channel attention across different scale spaces and applies it in both the image generation and discrimination stages, so that the generator and the discriminator can each perceive the most discriminative regions of an image. In addition, an attention loss function is proposed that uses the attention maps from the discrimination stage to compute the attention loss, thereby preserving more target and background information. Tests on the public TNO dataset show that, compared with seven other fusion methods, DAGAN achieves the best visual quality and the highest fusion efficiency.
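To make the core idea concrete, the following PyTorch sketch shows one plausible form of a multi-scale attention module that combines channel attention and spatial attention across several scale spaces, as described above. The pooling scales (1, 2, 4), the reduction ratio, and the overall wiring are illustrative assumptions, not the authors' exact design.

```python
# A minimal sketch of multi-scale spatial + channel attention (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        # Squeeze spatial dimensions, then re-weight each channel.
        w = self.fc(x.mean(dim=(2, 3)))                 # (B, C)
        return x * w[:, :, None, None]

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        # Pool across channels, then predict a per-pixel weight map.
        s = torch.cat([x.mean(1, keepdim=True), x.max(1, keepdim=True).values], 1)
        return x * torch.sigmoid(self.conv(s))

class MultiScaleAttention(nn.Module):
    """Channel + spatial attention applied at several spatial scales, then merged."""
    def __init__(self, channels: int, scales=(1, 2, 4)):   # scales are an assumption
        super().__init__()
        self.scales = scales
        self.blocks = nn.ModuleList(
            nn.Sequential(ChannelAttention(channels), SpatialAttention())
            for _ in scales)

    def forward(self, x):
        h, w = x.shape[2:]
        out = 0
        for s, block in zip(self.scales, self.blocks):
            xs = F.avg_pool2d(x, s) if s > 1 else x     # downsample to scale s
            ys = block(xs)
            out = out + F.interpolate(ys, size=(h, w), mode="bilinear",
                                      align_corners=False)
        return out / len(self.scales)
```

Running the same attention block on progressively pooled copies of the feature map lets the coarse scales respond to large salient targets while the finest scale preserves edge detail.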
Keywords:
- infrared and visible image fusion
- public security
- GAN
- attention mechanism
- generator
- discriminator
Table 1 Generator network structure

Network layer | Multi-scale attention module | Fusion network
First layer | Conv(I1, O32, K3, S1, P1), PReLU | Conv(I4, O32, K3, S1, P1), PReLU
Second layer | Conv(I32, O32, K3, S1, P1), PReLU | Conv(I32, O64, K3, S1, P1), PReLU
Third layer | Conv(I32, O32, K3, S1, P1), PReLU | Conv(I64, O128, K3, S1, P1), PReLU
Fourth layer | Conv(I32, O32, K3, S1, P1), PReLU | Conv(I128, O1, K3, S1, P1), PReLU

(I: input channels, O: output channels, K: kernel size, S: stride, P: padding)
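Read as a network, Table 1 describes two four-layer branches. The PyTorch sketch below wires them together under the assumption, suggested by the 4-channel input of the fusion branch, that the fusion network receives the two source images concatenated with their attention maps; that wiring, the weight sharing between the two attention streams, and the tanh output are assumptions, not the authors' stated design.

```python
# A sketch of the Table 1 generator (branch wiring is an assumption).
import torch
import torch.nn as nn

def conv_prelu(cin, cout):
    # Conv(K3, S1, P1) followed by PReLU, exactly as listed in Table 1.
    return nn.Sequential(nn.Conv2d(cin, cout, kernel_size=3, stride=1, padding=1),
                         nn.PReLU())

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # Multi-scale attention branch: 1 -> 32 -> 32 -> 32 -> 32 channels.
        self.attention_branch = nn.Sequential(
            conv_prelu(1, 32), conv_prelu(32, 32),
            conv_prelu(32, 32), conv_prelu(32, 32))
        # Fusion branch: 4 -> 32 -> 64 -> 128 -> 1 channels.
        self.fusion = nn.Sequential(
            conv_prelu(4, 32), conv_prelu(32, 64),
            conv_prelu(64, 128), conv_prelu(128, 1))

    def forward(self, ir, vis):
        # One attention stream per modality (weights shared here for brevity);
        # each 32-channel output is collapsed to a 1-channel attention map.
        a_ir = self.attention_branch(ir).mean(dim=1, keepdim=True)
        a_vis = self.attention_branch(vis).mean(dim=1, keepdim=True)
        # Sources + attention maps give the 4 input channels of the fusion branch.
        x = torch.cat([ir, vis, a_ir, a_vis], dim=1)
        return torch.tanh(self.fusion(x))
```

For example, `Generator()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))` yields a single-channel fused patch of the same spatial size, since every layer uses stride 1 and padding 1.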
Table 2 Discriminator network structure

Network layer | Architecture
First layer | Conv(I1, O64, K3, S1, P0), LeakyReLU
Second layer | Conv(I64, O64, K3, S2, P0), LeakyReLU
Third layer | Conv(I64, O128, K3, S1, P0), LeakyReLU
Fourth layer | Conv(I128, O128, K3, S2, P0), LeakyReLU
Fifth layer | Conv(I128, O256, K3, S1, P0), LeakyReLU
Sixth layer | Conv(I256, O256, K3, S2, P0), LeakyReLU
Seventh layer | FC(1024)
Eighth layer | FC(1)
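A corresponding sketch of the Table 2 discriminator follows. The input size of the first fully connected layer depends on the training patch size, which this excerpt does not state, so the sketch uses nn.LazyLinear to infer it; returning the last feature maps alongside the logit is also an assumption, made so the attention loss sketched later can reuse them. The linear FC(1) output is consistent with the Wasserstein-style training of reference [8].

```python
# A sketch of the Table 2 discriminator (LazyLinear and the feature output are assumptions).
import torch.nn as nn

def conv_lrelu(cin, cout, stride):
    # Conv(K3, P0) followed by LeakyReLU, as listed in Table 2.
    return nn.Sequential(nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=0),
                         nn.LeakyReLU(0.2, inplace=True))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Six conv layers alternating stride 1 / stride 2: 1 -> 64 -> ... -> 256.
        self.features = nn.Sequential(
            conv_lrelu(1, 64, 1),    conv_lrelu(64, 64, 2),
            conv_lrelu(64, 128, 1),  conv_lrelu(128, 128, 2),
            conv_lrelu(128, 256, 1), conv_lrelu(256, 256, 2))
        # FC(1024) then FC(1); the flatten size is inferred on first use.
        self.head = nn.Sequential(nn.Flatten(),
                                  nn.LazyLinear(1024),
                                  nn.LeakyReLU(0.2, inplace=True),
                                  nn.Linear(1024, 1))

    def forward(self, x):
        f = self.features(x)      # last conv feature maps, reusable as
                                  # attention evidence for the attention loss
        return self.head(f), f
```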
Table 3 Comparison of computation time between DAGAN and other methods (unit: s)

Method | CVT | DTCWT | LP | RP | Wavelet | NSCT | FusionGAN | DAGAN
Computing time | 0.7586 | 0.8024 | 0.4599 | 0.4615 | 0.6332 | 0.9839 | 0.2658 | 0.1882
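Finally, the attention loss named in the abstract can be sketched as follows: attention maps are derived from discriminator-stage features in the spirit of the attention-transfer formulation of reference [7], and the fused image is pushed to reproduce the attended regions of both sources. The specific map construction and the equal weighting of the infrared and visible terms are assumptions for illustration.

```python
# A sketch of an attention loss over discriminator-stage features (construction is an assumption).
import torch
import torch.nn.functional as F

def attention_map(features: torch.Tensor) -> torch.Tensor:
    # Channel-wise sum of squared activations, L2-normalized per image,
    # following the attention-transfer recipe of reference [7].
    a = features.pow(2).sum(dim=1)                              # (B, H, W)
    return a / (a.flatten(1).norm(dim=1)[:, None, None] + 1e-8)

def attention_loss(f_fused, f_ir, f_vis):
    # Discriminator features of the fused result should match the attention
    # maps of both the infrared and the visible inputs, so target regions
    # and background detail are both retained.
    a_f = attention_map(f_fused)
    return (F.mse_loss(a_f, attention_map(f_ir)) +
            F.mse_loss(a_f, attention_map(f_vis)))
```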
[1] DONG Anyong, DU Qingzhi, SU Bin, et al. Infrared and visible image fusion based on convolutional neural network[J]. Infrared Technology, 2020, 42(7): 660-669. http://hwjs.nvir.cn/article/id/hwjs202007009
[2] LUO Di, WANG Congqing, ZHOU Yongjun. A visible and infrared image fusion method based on generative adversarial networks and attention mechanism[J]. Infrared Technology, 2021, 43(6): 566-574. http://hwjs.nvir.cn/article/id/3403109e-d8d7-45ed-904f-eb4bc246275a
[3] CHEN R, XIE Y, LUO X, et al. Joint-attention discriminator for accurate super-resolution via adversarial training[C]//Proceedings of the 27th ACM International Conference on Multimedia, 2019: 711-719.
[4] LIU N, HAN J, YANG M-H. PiCANet: pixel-wise contextual attention learning for accurate saliency detection[J]. IEEE Transactions on Image Processing, 2020, 29: 6438-6451. DOI: 10.1109/TIP.2020.2988568
[5] CHEN J, WAN L, ZHU J, et al. Multi-scale spatial and channel-wise attention for improving object detection in remote sensing imagery[J]. IEEE Geoscience and Remote Sensing Letters, 2019, 17(4): 681-685.
[6] ZHOU B, KHOSLA A, LAPEDRIZA A, et al. Learning deep features for discriminative localization[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 2921-2929.
[7] ZAGORUYKO S, KOMODAKIS N. Paying more attention to attention: improving the performance of convolutional neural networks via attention transfer[J/OL]. arXiv preprint arXiv:1612.03928, 2016. https://doi.org/10.48550/arXiv.1612.03928
[8] GULRAJANI I, AHMED F, ARJOVSKY M, et al. Improved training of Wasserstein GANs[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017: 5769-5779.
[9] TOET A. The TNO multiband image data collection[J]. Data in Brief, 2017, 15: 249-251. DOI: 10.1016/j.dib.2017.09.038
[10] MA J, YU W, LIANG P, et al. FusionGAN: a generative adversarial network for infrared and visible image fusion[J]. Information Fusion, 2019, 48: 11-26. DOI: 10.1016/j.inffus.2018.09.004
[11] BURT P, ADELSON E. The Laplacian pyramid as a compact image code[J]. IEEE Transactions on Communications, 1983, 31(4): 532-540. DOI: 10.1109/TCOM.1983.1095851
[12] TOET A. Image fusion by a ratio of low-pass pyramid[J]. Pattern Recognition Letters, 1989, 9(4): 245-253. DOI: 10.1016/0167-8655(89)90003-2
[13] LEWIS J J, O'CALLAGHAN R J, NIKOLOV S G, et al. Pixel- and region-based image fusion with complex wavelets[J]. Information Fusion, 2007, 8(2): 119-130. DOI: 10.1016/j.inffus.2005.09.006
[14] CHIPMAN L J, ORR T M, GRAHAM L N. Wavelets and image fusion[C]//International Conference on Image Processing, IEEE, 1995: 248-251.
[15] NENCINI F, GARZELLI A, BARONTI S, et al. Remote sensing image fusion using the curvelet transform[J]. Information Fusion, 2007, 8(2): 143-156. DOI: 10.1016/j.inffus.2006.02.001
[16] ADU J, GAN J, WANG Y, et al. Image fusion based on nonsubsampled contourlet transform for infrared and visible light image[J]. Infrared Physics & Technology, 2013, 61: 94-100.
[17] ROBERTS J W, VAN AARDT J A, AHMED F B. Assessment of image fusion procedures using entropy, image quality, and multispectral classification[J]. Journal of Applied Remote Sensing, 2008, 2(1): 023522. DOI: 10.1117/1.2945910
[18] SHI W, ZHU C, TIAN Y, et al. Wavelet-based image fusion and quality assessment[J]. International Journal of Applied Earth Observation and Geoinformation, 2005, 6(3-4): 241-251. DOI: 10.1016/j.jag.2004.10.010
[19] QU G, ZHANG D, YAN P. Information measure for performance of image fusion[J]. Electronics Letters, 2002, 38(7): 313-315.
[20] HE L I, LEI L, CHAO Y, et al. An improved fusion algorithm for infrared and visible images based on multi-scale transform[J]. Semiconductor Optoelectronics, 2016, 74: 28-37.