Infrared and Visible Image Fusion Based on Self-attention Learning
Abstract: To ensure that the salient regions of the source images remain salient in the fused image, a self-attention-guided infrared and visible image fusion method is proposed. First, the feature maps and self-attention maps of the source images are learned by a self-attention mechanism in the feature-learning layer. Because self-attention maps capture long-range dependencies within an image, they are used to design an average-weighted fusion strategy for the source feature maps. Finally, the fused feature maps are reconstructed to obtain the fused image. Feature encoding, self-attention learning, the fusion rule, and fused-feature decoding are all learned jointly within a generative adversarial network. Experiments on real-world TNO data show that the learned self-attention units highlight the salient regions of the images and effectively guide the design of the fusion rule. The proposed algorithm outperforms state-of-the-art infrared and visible image fusion algorithms in both objective and subjective evaluation, and it preserves the detail information of the visible images as well as the thermal-target information of the infrared images.
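The pipeline described above (self-attention maps learned over encoder feature maps, then an attention-guided average-weighted fusion) can be sketched as follows. This is a simplified, single-head NumPy illustration under assumed function names of my own choosing; it is not the paper's trained implementation, which learns these components inside a GAN.

```python
import numpy as np

def self_attention_map(feat):
    """Non-local self-attention over a feature map of shape (C, H, W).

    A minimal sketch: pairwise affinities between all spatial positions
    are softmax-normalized and used to aggregate long-range context.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                  # flatten spatial dimensions
    energy = x.T @ x                            # (HW, HW) pairwise affinities
    energy -= energy.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(energy)
    attn /= attn.sum(axis=1, keepdims=True)     # row-wise softmax
    out = x @ attn.T                            # attend over all positions
    return out.reshape(c, h, w)

def attention_weighted_fusion(feat_ir, feat_vis, attn_ir, attn_vis):
    """Average-weighted fusion of two feature maps, guided per pixel by
    the (non-negative) attention/saliency maps of the two sources."""
    eps = 1e-8                                  # avoid division by zero
    w_ir = attn_ir / (attn_ir + attn_vis + eps)
    return w_ir * feat_ir + (1.0 - w_ir) * feat_vis
```

In the actual method the fused feature maps would then be passed to a learned decoder to reconstruct the fused image; the sketch only shows the attention and weighting steps.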
Keywords:
- image fusion
- self-attention
- generative adversarial network
- infrared images
- deep learning
Table 1. Objective evaluation results of different fusion methods (EN: entropy; SD: standard deviation; CC: correlation coefficient; SF: spatial frequency; MG: mean gradient; EI: edge intensity)

Method      EN     SD      CC     SF      MG      EI
CSR         6.52   28.30   0.51   17.25    8.02   53.69
GTF         6.80   39.33   0.33   15.32    7.72   48.29
DenseFuse   6.93   37.50   0.53   16.24    8.29   53.48
IFCNN       6.84   35.65   0.48   20.38   10.43   67.09
FusionGAN   6.75   33.59   0.43   10.76    5.57   39.83
DDcGAN      7.45   51.98   0.26   17.02    8.72   62.09
SEDRFuse    6.99   42.49   0.51   14.83    7.21   56.37
Ours        7.29   45.20   0.52   25.11   13.02   93.30
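The column abbreviations are standard no-reference fusion-quality metrics. As a reference for how three of them (EN, SD, SF) are commonly computed, here is a minimal NumPy sketch; the helper names are my own and the exact formulations used in the paper may differ.

```python
import numpy as np

def entropy(img):
    """EN: Shannon entropy of the grey-level histogram of an 8-bit image."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                         # drop empty bins (0*log 0 := 0)
    return float(-(p * np.log2(p)).sum())

def std_dev(img):
    """SD: standard deviation of pixel intensities, a proxy for contrast."""
    return float(np.std(img.astype(np.float64)))

def spatial_frequency(img):
    """SF: root of the combined row- and column-wise gradient energy."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean((img[:, 1:] - img[:, :-1]) ** 2))  # row frequency
    cf = np.sqrt(np.mean((img[1:, :] - img[:-1, :]) ** 2))  # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))
```

Higher values of all three indicate richer information, higher contrast, and sharper detail respectively, which is why "Ours" leading on SF, MG, and EI in Table 1 supports the claim of better detail preservation.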
[1] MA J Y, MA Y, LI C. Infrared and visible image fusion methods and applications: a survey[J]. Information Fusion, 2019, 45: 153-178. doi: 10.1016/j.inffus.2018.02.004
[2] YU X C, GAO G Y, XU J D, et al. Remote sensing image fusion based on sparse representation[C]//2014 IEEE Geoscience and Remote Sensing Symposium, 2014: 2858-2861.
[3] ZHAO W D, LU H C. Medical image fusion and denoising with alternating sequential filter and adaptive fractional order total variation[J]. IEEE Transactions on Instrumentation and Measurement, 2017, 66(9): 2283-2294. doi: 10.1109/TIM.2017.2700198
[4] LI Y S, TAO C, et al. Unsupervised multilayer feature learning for satellite image scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(2): 157-161. doi: 10.1109/LGRS.2015.2503142
[5] JIN X, JIANG Q, et al. A survey of infrared and visual image fusion methods[J]. Infrared Physics & Technology, 2017, 85: 478-501.
[6] BAI X, ZHANG Y, ZHOU F, et al. Quadtree-based multi-focus image fusion using a weighted focus-measure[J]. Information Fusion, 2015, 22: 105-118. doi: 10.1016/j.inffus.2014.05.003
[7] BAI X Z. Infrared and visual image fusion through feature extraction by morphological sequential toggle operator[J]. Infrared Physics & Technology, 2015, 71: 77-86.
[8] LIU Y, CHEN X, PENG H, et al. Multi-focus image fusion with a deep convolutional neural network[J]. Information Fusion, 2017, 36: 191-207. doi: 10.1016/j.inffus.2016.12.001
[9] MA J Y, YU W, LIANG P W, et al. FusionGAN: a generative adversarial network for infrared and visible image fusion[J]. Information Fusion, 2019, 48: 11-26. doi: 10.1016/j.inffus.2018.09.004
[10] LI H, WU X J. DenseFuse: a fusion approach to infrared and visible images[J]. IEEE Transactions on Image Processing, 2019, 28(5): 2614-2623. doi: 10.1109/TIP.2018.2887342
[11] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Computer Vision and Pattern Recognition, 2018: 7794-7803.
[12] ZHANG H, GOODFELLOW I, METAXAS D, et al. Self-attention generative adversarial networks[C]//International Conference on Machine Learning, 2019: 7354-7363.
[13] YANG X L, LIN S Z. Method for multi-band image feature-level fusion based on attention mechanism[J]. Journal of Xidian University, 2020, 47(1): 123-130. https://www.cnki.com.cn/Article/CJFDTOTAL-XDKD202001018.htm
[14] JIAN L, YANG X, LIU Z, et al. A symmetric encoder-decoder with residual block for infrared and visible image fusion[J]. IEEE Transactions on Instrumentation and Measurement, 2021, 70: 1-15.
[15] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on Machine Learning, 2015, 37: 448-456.
[16] ZHANG Y, LIU Y, SUN P, et al. IFCNN: a general image fusion framework based on convolutional neural network[J]. Information Fusion, 2020, 54: 99-118. doi: 10.1016/j.inffus.2019.07.011
[17] YAN H, YU X, et al. Single image depth estimation with normal guided scale invariant deep convolutional fields[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(1): 80-92.
[18] LIU Y, CHEN X, WARD R, et al. Image fusion with convolutional sparse representation[J]. IEEE Signal Processing Letters, 2016, 23(12): 1882-1886. doi: 10.1109/LSP.2016.2618776
[19] MA J Y, CHEN C, LI C, et al. Infrared and visible image fusion via gradient transfer and total variation minimization[J]. Information Fusion, 2016, 31: 100-109. doi: 10.1016/j.inffus.2016.02.001
[20] MA J Y, XU H, JIANG J, et al. DDcGAN: a dual-discriminator conditional generative adversarial network for multi-resolution image fusion[J]. IEEE Transactions on Image Processing, 2020, 29: 4980-4995. doi: 10.1109/TIP.2020.2977573