Infrared and Visible Image Fusion Using Attention-Based Generative Adversarial Networks
-
Abstract: At present, deep learning-based fusion methods rely on convolutional kernels to extract local features, and the limitations of single-scale networks, convolutional kernel size, and network depth prevent them from capturing the multi-scale and global characteristics of images. Therefore, this paper proposes an infrared and visible image fusion method using attention-based generative adversarial networks. The method employs a generator consisting of an encoder and a decoder, together with two discriminators. A multi-scale module and a channel self-attention mechanism are designed in the encoder; they effectively extract multi-scale features and establish long-range dependencies between feature channels, thereby enhancing the global characteristics of the multi-scale features. In addition, two discriminators are constructed to establish adversarial relationships between the fused image and the two source images, preserving more detailed information. Experimental results demonstrate that the proposed method outperforms other typical methods in both subjective and objective evaluations.
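The excerpt does not include an implementation, so the following is a minimal PyTorch-style sketch of the kind of channel self-attention block described above, which models long-range dependencies across feature channels in the spirit of the channel attention module of [22]. The class name, parameterisation, and residual scaling are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class ChannelSelfAttention(nn.Module):
    """Channel self-attention: re-weights each channel by its affinity to all
    other channels, modelling long-range dependencies across feature channels.
    (Illustrative sketch, not the authors' released code.)"""

    def __init__(self):
        super().__init__()
        # Learnable scale, initialised to zero so the block starts as an identity mapping
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):                                     # x: (B, C, H, W)
        b, c, h, w = x.size()
        query = x.view(b, c, -1)                              # (B, C, N), N = H*W
        key = x.view(b, c, -1).permute(0, 2, 1)               # (B, N, C)
        affinity = torch.softmax(torch.bmm(query, key), -1)   # (B, C, C) channel affinity
        value = x.view(b, c, -1)                              # (B, C, N)
        out = torch.bmm(affinity, value).view(b, c, h, w)     # re-weighted channels
        return self.gamma * out + x                           # residual connection
```

Initialising gamma to zero lets the network learn how much channel re-weighting to apply, starting from the plain multi-scale features.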
-
Table 1. Parameter settings of the generator
Parts     Layer      Kernel size/stride   Input/Output channels, activation
Encoder   C0         3×3/1                1/16, LeakyReLU
          Res2Net1   -                    16/32, LeakyReLU
          Res2Net2   -                    32/64, LeakyReLU
Decoder   C1         3×3/1                128/64, LeakyReLU
          C2         3×3/1                64/32, LeakyReLU
          C3         3×3/1                32/16, LeakyReLU
          C4         3×3/1                16/1, Tanh
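A minimal PyTorch sketch of a generator wired according to Table 1 may help make the layer settings concrete. The Res2NetBlock stand-in (a plain 3×3 convolution here, in place of the multi-scale block of [21]) and the assumption that the 128-channel input to C1 comes from concatenating the 64-channel infrared and visible encoder features are illustrative, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class Res2NetBlock(nn.Module):
    """Stand-in for the multi-scale Res2Net block of Gao et al. [21];
    a single 3x3 convolution is used here only to keep the sketch runnable."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                                  nn.LeakyReLU(0.2))

    def forward(self, x):
        return self.body(x)

class Generator(nn.Module):
    """Encoder-decoder generator following the layer settings of Table 1."""
    def __init__(self):
        super().__init__()
        # Encoder: C0 -> Res2Net1 -> Res2Net2
        self.c0 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.LeakyReLU(0.2))
        self.res2net1 = Res2NetBlock(16, 32)
        self.res2net2 = Res2NetBlock(32, 64)
        # Decoder: C1-C3 with LeakyReLU, C4 with Tanh (all 3x3, stride 1)
        self.c1 = nn.Sequential(nn.Conv2d(128, 64, 3, padding=1), nn.LeakyReLU(0.2))
        self.c2 = nn.Sequential(nn.Conv2d(64, 32, 3, padding=1), nn.LeakyReLU(0.2))
        self.c3 = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.LeakyReLU(0.2))
        self.c4 = nn.Sequential(nn.Conv2d(16, 1, 3, padding=1), nn.Tanh())

    def encode(self, x):
        return self.res2net2(self.res2net1(self.c0(x)))

    def forward(self, ir, vis):
        # Assumed fusion: concatenating the 64-channel infrared and visible
        # encoder features yields the 128 channels expected by C1.
        feat = torch.cat([self.encode(ir), self.encode(vis)], dim=1)
        return self.c4(self.c3(self.c2(self.c1(feat))))
```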
Table 2. Parameter settings of the discriminators
Layer        Kernel size/stride   Input/Output channels, activation
L1           3×3/2                1/16, LeakyReLU
L2           3×3/2                16/32, LeakyReLU
L3           3×3/2                32/64, LeakyReLU
L4           3×3/2                64/128, LeakyReLU
L5 (FC(1))   -                    128/1, Tanh
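Likewise, a hedged sketch of one discriminator following Table 2: four stride-2 3×3 convolutions with LeakyReLU followed by a fully connected layer FC(1) with Tanh. The global average pooling used to reach the fully connected layer is an assumption, since the excerpt does not state how the spatial dimensions are reduced. The two discriminators of the method would be two independent instances of this module, one judging the fused image against the infrared source and one against the visible source.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Discriminator following the layer settings of Table 2.
    The global average pooling before the fully connected layer is an
    assumption; the excerpt does not specify the spatial reduction."""
    def __init__(self):
        super().__init__()
        def block(in_ch, out_ch):
            return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                                 nn.LeakyReLU(0.2))
        # L1-L4: stride-2 convolutions that progressively widen the channels
        self.features = nn.Sequential(block(1, 16), block(16, 32),
                                      block(32, 64), block(64, 128))
        self.fc = nn.Linear(128, 1)   # L5: FC(1) with Tanh

    def forward(self, x):
        f = self.features(x)          # (B, 128, H/16, W/16)
        f = f.mean(dim=(2, 3))        # assumed global average pooling -> (B, 128)
        return torch.tanh(self.fc(f)) # one score per image
```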
Table 3. Quantitative comparison of the ablation experiments
Methods      EN       SD        CC       SCD      MS-SSIM   VIFF
No-CA        7.2439   42.6525   0.6305   1.7705   0.9221    0.5149
No-Res2Net   7.3372   46.2277   0.6295   1.8453   0.9227    0.5572
Ours         7.3596   46.9659   0.6290   1.8494   0.9278    0.5683
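For context, EN and SD in Table 3 are conventionally computed as the Shannon entropy of the grey-level histogram and the standard deviation of pixel intensities of the fused image. A short NumPy sketch under that assumption (8-bit greyscale input assumed):

```python
import numpy as np

def entropy(img):
    """EN: Shannon entropy of the 8-bit grey-level histogram of the fused image."""
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                              # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def standard_deviation(img):
    """SD: standard deviation of pixel intensities, an indicator of contrast."""
    return float(np.std(img.astype(np.float64)))
```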
Table 4. Comparison of time efficiency
Methods     TNO      Roadscene
CVT         1.33     0.92
ASR         332.21   165.23
WLS         2.23     1.17
DenseFuse   0.11     0.08
FusionGan   1.98     1.02
IFCNN       0.08     0.07
Ours        0.23     0.19
-
[1] MA J, MA Y, LI C. Infrared and visible image fusion methods and applications: a survey[J]. Information Fusion, 2019, 45: 153-178. doi: 10.1016/j.inffus.2018.02.004
[2] LI S, KANG X, FANG L, et al. Pixel-level image fusion: a survey of the state of the art[J]. Information Fusion, 2017, 33: 100-112. doi: 10.1016/j.inffus.2016.05.004
[3] LIU Y, CHEN X, WANG Z, et al. Deep learning for pixel-level image fusion: recent advances and future prospects[J]. Information Fusion, 2018, 42: 158-173. doi: 10.1016/j.inffus.2017.10.007
[4] LI S, YANG B, HU J. Performance comparison of different multi-resolution transforms for image fusion[J]. Information Fusion, 2011, 12(2): 74-84. doi: 10.1016/j.inffus.2010.03.002
[5] ZHANG Q, LIU Y, BLUM R S, et al. Sparse representation based multi-sensor image fusion for multi-focus and multi-modality images: a review[J]. Information Fusion, 2018, 40: 57-75. doi: 10.1016/j.inffus.2017.05.006
[6] ZHANG X, MA Y, ZHANG Y, et al. Infrared and visible image fusion via saliency analysis and local edge-preserving multi-scale decomposition[J]. Journal of the Optical Society of America A: Optics, Image Science, and Vision, 2017, 34(8): 1400-1410.
[7] LIU Y, LIU S, WANG Z. A general framework for image fusion based on multi-scale transform and sparse representation[J]. Information Fusion, 2015, 24: 147-164. doi: 10.1016/j.inffus.2014.09.004
[8] HAN J, PAUWELS E J, DE ZEEUW P. Fast saliency-aware multimodality image fusion[J]. Neurocomputing, 2013, 111: 70-80. doi: 10.1016/j.neucom.2012.12.015
[9] YIN H. Sparse representation with learned multiscale dictionary for image fusion[J]. Neurocomputing, 2015, 148: 600-610. doi: 10.1016/j.neucom.2014.07.003
[10] WANG Z, YANG F, PENG Z, et al. Multi-sensor image enhanced fusion algorithm based on NSST and top-hat transformation[J]. Optik - International Journal for Light and Electron Optics, 2015, 126(23): 4184-4190. doi: 10.1016/j.ijleo.2015.08.118
[11] CUI G, FENG H, XU Z, et al. Detail preserved fusion of visible and infrared images using regional saliency extraction and multi-scale image decomposition[J]. Optics Communications, 2015, 341: 199-209. doi: 10.1016/j.optcom.2014.12.032
[12] LI Q, LU L, LI Z, et al. Coupled GAN with relativistic discriminators for infrared and visible images fusion[J]. IEEE Sensors Journal, 2021, 21(6): 7458-7467. doi: 10.1109/JSEN.2019.2921803
[13] LIU Y, CHEN X, CHENG J, et al. Infrared and visible image fusion with convolutional neural networks[J]. International Journal of Wavelets, Multiresolution and Information Processing, 2018, 16(3): 1850018. doi: 10.1142/S0219691318500182
[14] LI H, WU X J. DenseFuse: a fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2019, 28(5): 2614-2623. doi: 10.1109/TIP.2018.2887342
[15] XU H, MA J, JIANG J, et al. U2Fusion: a unified unsupervised image fusion network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 44(1): 502-518.
[16] HOU R. VIF-Net: an unsupervised framework for infrared and visible image fusion[J]. IEEE Transactions on Computational Imaging, 2020, 6: 640-651. doi: 10.1109/TCI.2020.2965304
[17] LI H, WU X J, KITTLER J. RFN-Nest: an end-to-end residual fusion network for infrared and visible images[J]. Information Fusion, 2021, 73: 72-86. doi: 10.1016/j.inffus.2021.02.023
[18] MA J, YU W, LIANG P, et al. FusionGAN: a generative adversarial network for infrared and visible image fusion[J]. Information Fusion, 2019, 48: 11-26. doi: 10.1016/j.inffus.2018.09.004
[19] MA J, LIANG P, YU W, et al. Infrared and visible image fusion via detail preserving adversarial learning[J]. Information Fusion, 2020, 54: 85-98. doi: 10.1016/j.inffus.2019.07.005
[20] MA J, XU H, JIANG J, et al. DDcGAN: a dual-discriminator conditional generative adversarial network for multi-resolution image fusion[J]. IEEE Transactions on Image Processing, 2020, 29: 4980-4995. doi: 10.1109/TIP.2020.2977573
[21] GAO S, CHENG M M, ZHAO K, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(2): 652-662. doi: 10.1109/TPAMI.2019.2938758
[22] FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019. doi: 10.1109/CVPR.2019.00326
[23] NENCINI F, GARZELLI A, BARONTI S, et al. Remote sensing image fusion using the curvelet transform[J]. Information Fusion, 2007, 8(2): 143-156. doi: 10.1016/j.inffus.2006.02.001
[24] LIU Y, WANG Z. Simultaneous image fusion and denoising with adaptive sparse representation[J]. IET Image Processing, 2014, 9(5): 347-357.
[25] MA J, ZHOU Z, WANG B, et al. Infrared and visible image fusion based on visual saliency map and weighted least square optimization[J]. Infrared Physics & Technology, 2017, 82: 8-17.
[26] ZHANG Y, LIU Y, SUN P, et al. IFCNN: a general image fusion framework based on convolutional neural network[J]. Information Fusion, 2020, 54: 99-118. doi: 10.1016/j.inffus.2019.07.011