Fusion of Hyperspectral and Multispectral Images Using a CNN Joint Multi-Scale Transformer
-
Abstract: Hyperspectral images carry rich spectral information, while multispectral images preserve fine geometric detail, so fusing a high-resolution multispectral image with a low-resolution hyperspectral image yields more comprehensive remote sensing data. However, most existing fusion networks are built on convolutional neural networks; for remote sensing images with complex structures, convolution operations limited by their kernel size tend to lose global context in the feature fusion stage. To ensure fusion quality, this study proposes a convolutional neural network (CNN) combined with a multi-scale transformer to fuse multispectral and hyperspectral images, uniting the feature extraction capability of the CNN with the global modeling strength of the transformer. The network divides the fusion task into two stages: feature extraction and fusion. In the feature extraction stage, separate CNN-based modules are designed for each input according to its characteristics. In the fusion stage, a multi-scale transformer module establishes long-range dependencies from local to global scales, and multilayer convolution layers then map the fused features to a high-resolution hyperspectral image. Experimental results on the CAVE and Harvard datasets show that the proposed algorithm produces higher-quality fused images than other classical algorithms.
-
Keywords:
- hyperspectral image
- multispectral image
- CNN
- transformer
- image fusion
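To make the two-stage design concrete, below is a minimal PyTorch sketch of the architecture the abstract describes: CNN branches extract features from the low-resolution hyperspectral image (LR-HSI) and the high-resolution multispectral image (HR-MSI), a multi-scale transformer applies window self-attention at increasing window sizes to build context from local to global, and convolution layers map the fused features to the high-resolution hyperspectral output. All module names, channel widths, window sizes, and the bicubic pre-upsampling are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvExtractor(nn.Module):
    """Plain CNN feature extractor; the paper designs one module per input modality."""

    def __init__(self, in_ch: int, feat_ch: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class WindowAttention(nn.Module):
    """Self-attention restricted to non-overlapping windows of size `window`."""

    def __init__(self, dim: int, window: int, heads: int = 4):
        super().__init__()
        self.window = window
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W)
        B, C, H, W = x.shape
        w = self.window
        assert H % w == 0 and W % w == 0, "spatial size must be divisible by window"
        # partition the feature map into (H/w)*(W/w) windows of w*w tokens each
        t = x.view(B, C, H // w, w, W // w, w)
        t = t.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        t = self.norm(t)
        out, _ = self.attn(t, t, t)
        # undo the window partition and add a residual connection
        out = out.reshape(B, H // w, W // w, w, w, C)
        out = out.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return x + out


class MultiScaleTransformer(nn.Module):
    """Window attention at growing window sizes: local context first, then global."""

    def __init__(self, dim: int, windows=(4, 8, 16)):
        super().__init__()
        self.stages = nn.ModuleList(WindowAttention(dim, w) for w in windows)

    def forward(self, x):
        for stage in self.stages:
            x = stage(x)
        return x


class FusionNet(nn.Module):
    """Two-stage fusion: CNN feature extraction, then transformer-based fusion."""

    def __init__(self, hsi_bands: int = 31, msi_bands: int = 3, feat_ch: int = 64):
        super().__init__()
        self.hsi_branch = ConvExtractor(hsi_bands, feat_ch)
        self.msi_branch = ConvExtractor(msi_bands, feat_ch)
        self.fusion = MultiScaleTransformer(2 * feat_ch)
        self.recon = nn.Sequential(  # map fused features to the HR-HSI
            nn.Conv2d(2 * feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, hsi_bands, 3, padding=1),
        )

    def forward(self, lr_hsi, hr_msi):
        # bicubic pre-upsampling of the LR-HSI is an assumption of this sketch
        up = F.interpolate(lr_hsi, size=hr_msi.shape[-2:], mode="bicubic",
                           align_corners=False)
        feats = torch.cat([self.hsi_branch(up), self.msi_branch(hr_msi)], dim=1)
        return self.recon(self.fusion(feats))


if __name__ == "__main__":
    net = FusionNet()
    lr_hsi = torch.randn(1, 31, 16, 16)  # 31-band LR-HSI (CAVE-like band count)
    hr_msi = torch.randn(1, 3, 64, 64)   # RGB HR-MSI
    print(net(lr_hsi, hr_msi).shape)     # torch.Size([1, 31, 64, 64])
```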
-
Table 1 Fusion results of different methods on the CAVE dataset
| Methods | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- |
| CSTF | 45.14 | 4.4 | 1.44 | 0.9873 |
| NLSTF | 47.69 | 3.47 | 1.22 | 0.9887 |
| NLSTF-SMBF | 42.82 | 6.12 | 2.06 | 0.9818 |
| SSRNET | 44.41 | 2.60 | 1.40 | 0.9928 |
| ResTFNet | 45.99 | 2.19 | 1.07 | 0.9947 |
| MHF-Net | 48.40 | 2.01 | 0.85 | 0.9973 |
| HSRnet | 48.82 | 1.86 | 0.79 | 0.9978 |
| Ours | 49.57 | 1.77 | 0.67 | 0.9986 |

Note: red font indicates the best result; blue font indicates the second best.
Table 2 Fusion results of different methods on the Harvard dataset
| Methods | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- |
| CSTF | 45.43 | 3.20 | 2.14 | 0.9814 |
| NLSTF | 46.30 | 3.05 | 1.91 | 0.9826 |
| NLSTF-SMBF | 45.52 | 3.43 | 2.04 | 0.9819 |
| SSRNET | 46.25 | 2.26 | 1.56 | 0.9939 |
| ResTFNet | 46.89 | 1.98 | 1.42 | 0.9946 |
| MHF-Net | 47.62 | 1.79 | 1.18 | 0.9965 |
| HSRnet | 48.28 | 1.69 | 1.09 | 0.9972 |
| Ours | 48.83 | 1.58 | 0.94 | 0.9978 |

Note: red font indicates the best result; blue font indicates the second best.
Table 3 Fusion results of different methods on the PU (Pavia University) dataset
| Methods | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- |
| CSTF | 45.87 | 2.10 | 1.25 | 0.9824 |
| NLSTF | 46.41 | 1.97 | 1.20 | 0.9831 |
| NLSTF-SMBF | 45.23 | 2.12 | 1.34 | 0.9809 |
| SSRNET | 47.25 | 1.43 | 1.16 | 0.9918 |
| ResTFNet | 48.44 | 1.98 | 0.90 | 0.9940 |
| MHF-Net | 48.83 | 1.40 | 0.885 | 0.9943 |
| HSRnet | 49.18 | 1.22 | 0.74 | 0.9954 |
| Ours | 50.08 | 0.989 | 0.56 | 0.9963 |

Note: red font indicates the best result; blue font indicates the second best.
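PSNR, SAM, ERGAS, and SSIM in Tables 1-3 are the standard quality metrics for hyperspectral fusion. As a reference point, here is a hedged NumPy sketch of one common way to compute the first three; conventions vary between papers (per-band versus global PSNR, the data range, and the ERGAS scale ratio), so this is not necessarily the authors' exact protocol. SSIM is usually taken from an off-the-shelf implementation such as skimage.metrics.structural_similarity.

```python
import numpy as np


def psnr(ref: np.ndarray, est: np.ndarray, data_range: float = 1.0) -> float:
    """Mean of per-band PSNR in dB; inputs are (H, W, bands) arrays."""
    mse = np.mean((ref - est) ** 2, axis=(0, 1))  # per-band MSE
    return float(np.mean(10.0 * np.log10(data_range ** 2 / mse)))


def sam(ref: np.ndarray, est: np.ndarray, eps: float = 1e-12) -> float:
    """Mean spectral angle in degrees between per-pixel spectra."""
    dot = np.sum(ref * est, axis=2)
    denom = np.linalg.norm(ref, axis=2) * np.linalg.norm(est, axis=2) + eps
    return float(np.degrees(np.mean(np.arccos(np.clip(dot / denom, -1.0, 1.0)))))


def ergas(ref: np.ndarray, est: np.ndarray, ratio: int = 8) -> float:
    """Relative dimensionless global error. `ratio` is the spatial scale factor
    between HR and LR images; the default 8 is an assumption, not the paper's value."""
    rmse = np.sqrt(np.mean((ref - est) ** 2, axis=(0, 1)))  # per-band RMSE
    mean = np.mean(ref, axis=(0, 1))                        # per-band mean
    return float(100.0 / ratio * np.sqrt(np.mean((rmse / mean) ** 2)))


if __name__ == "__main__":
    ref = np.random.rand(64, 64, 31)
    est = np.clip(ref + 0.01 * np.random.randn(64, 64, 31), 0.0, 1.0)
    print(psnr(ref, est), sam(ref, est), ergas(ref, est))
```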
Table 4 Analysis of the effectiveness of network modules
| Feature extractor | Fusion network | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- | --- |
| √ | × (multi-scale Transformer) | 46.48 | 2.13 | 1.05 | 0.9959 |
| × | √ (multi-scale Transformer) | 48.52 | 1.94 | 0.89 | 0.9978 |
| √ | √ (Fusformer) | 48.76 | 1.89 | 0.82 | 0.9981 |
| √ | √ (multi-scale Transformer) | 49.57 | 1.77 | 0.67 | 0.9986 |
-
[1] YANG Q, XU Y, WU Z, et al. Hyperspectral and multispectral image fusion based on deep attention network[C]//Proceedings of the 2019 10th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS), 2019: 1-5.
[2] TONG Qingxi, ZHANG Bing, ZHANG Lifu. Current progress of hyperspectral remote sensing in China[J]. Journal of Remote Sensing, 2016, 20(5): 689-707. (in Chinese)
[3] AKBARI H, KOSUGI Y, KOJIMA K, et al. Detection and analysis of the intestinal ischemia using visible and invisible hyperspectral imaging[J]. IEEE Transactions on Biomedical Engineering, 2010, 57(8): 2011-2017. DOI: 10.1109/TBME.2010.2049110
[4] PAN Z, HEALEY G, PRASAD M, et al. Face recognition in hyperspectral images[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(12): 1552-1560. DOI: 10.1109/TPAMI.2003.1251148
[5] DIAN R, LI S, SUN B, et al. Recent advances and new guidelines on hyperspectral and multispectral image fusion[J]. Information Fusion, 2021, 69: 40-51. DOI: 10.1016/j.inffus.2020.11.001
[6] FASBENDER D, RADOUX J, BOGAERT P. Bayesian data fusion for adaptable image pansharpening[J]. IEEE Transactions on Geoscience and Remote Sensing, 2008, 46(6): 1847-1857. DOI: 10.1109/TGRS.2008.917131
[7] CHEN Z, PU H, WANG B, et al. Fusion of hyperspectral and multispectral images: a novel framework based on generalization of pan-sharpening methods[J]. IEEE Geoscience and Remote Sensing Letters, 2014, 11(8): 1418-1422. DOI: 10.1109/LGRS.2013.2294476
[8] SELVA M, AIAZZI B, BUTERA F, et al. Hyper-sharpening of hyperspectral data: A first approach[C]//Proceedings of the 2014 6th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), 2014: 24-27.
[9] ZURITA-MILLA R, CLEVERS J G P W, SCHAEPMAN M E. Unmixing-based Landsat TM and MERIS FR data fusion[J]. IEEE Geoscience and Remote Sensing Letters, 2008, 5(3): 453-457. DOI: 10.1109/LGRS.2008.919685
[10] YOKOYA N, YAIRI T, IWASAKI A. Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion[J]. IEEE Transactions on Geoscience and Remote Sensing, 2012, 50(2): 528-537. DOI: 10.1109/TGRS.2011.2161320
[11] LANARAS C, BALTSAVIAS E, SCHINDLER K. Hyperspectral super-resolution by coupled spectral unmixing[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 3586-3594.
[12] WEI Q, DOBIGEON N, TOURNERET J Y. Fast fusion of multi-band images based on solving a Sylvester equation[J]. IEEE Transactions on Image Processing, 2015, 24(11): 4109-4121. DOI: 10.1109/TIP.2015.2458572
[13] AKHTAR N, SHAFAIT F, MIAN A. Bayesian sparse representation for hyperspectral image super resolution[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015: 7-12.
[14] SUN Jiamin, SONG Huihui. Hyperspectral and multispectral image fusion based on discrete wavelet transform and generative adversarial networks[J]. Radio Engineering, 2021, 51(12): 1434-1441. (in Chinese)
[15] DIAN R, FANG L, LI S. Hyperspectral image super-resolution via non-local sparse tensor factorization[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 21-26.
[16] DIAN R, LI S, FANG L, et al. Nonlocal sparse tensor factorization for semiblind hyperspectral and multispectral image fusion[J]. IEEE Transactions on Cybernetics, 2020, 50(10): 4469-4480. DOI: 10.1109/TCYB.2019.2951572
[17] LI S, DIAN R, FANG L, et al. Fusing hyperspectral and multispectral images via coupled sparse tensor factorization[J]. IEEE Transactions on Image Processing, 2018, 27(8): 4118-4130. DOI: 10.1109/TIP.2018.2836307
[18] KANATSOULIS C I, FU X, SIDIROPOULOS N D, et al. Hyperspectral super-resolution: a coupled tensor factorization approach[J]. IEEE Transactions on Signal Processing, 2018, 66(24): 6503-6517. DOI: 10.1109/TSP.2018.2876362
[19] LI J, ZHENG K, YAO J, et al. Deep unsupervised blind hyperspectral and multispectral data fusion[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.
[20] WANG X, WANG X, ZHAO K, et al. FSL-Unet: full-scale linked unet with spatial–spectral joint perceptual attention for hyperspectral and multispectral image fusion[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-14.
[21] DONG M, LI W, LIANG X, et al. MDCNN: multispectral pansharpening based on a multiscale dilated convolutional neural network[J]. Journal of Applied Remote Sensing, 2021, 15: 036516.
[22] BENZENATI T, KESSENTINI Y, KALLEL A. Pansharpening approach via two-stream detail injection based on relativistic generative adversarial networks[J]. Expert Systems with Applications, 2022, 188: 115996. DOI: 10.1016/j.eswa.2021.115996
[23] FU Y, XU T, WU X, et al. PPT Fusion: pyramid patch transformer for a case study in image fusion[J]. arXiv preprint arXiv:2107.13967, 2021.
[24] GAO S H, CHENG M M, ZHAO K, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(2): 652-662. DOI: 10.1109/TPAMI.2019.2938758
[25] LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 10-17.
[26] KINGMA D P, BA J. Adam: a method for stochastic optimization[J/OL]. arXiv preprint arXiv:1412.6980, 2014. https://doi.org/10.48550/arXiv.1412.6980
[27] ZHANG X, HUANG W, WANG Q, et al. SSR-NET: spatial–spectral reconstruction network for hyperspectral and multispectral image fusion[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(7): 5953-5965. DOI: 10.1109/TGRS.2020.3018732
[28] LIU X, LIU Q, WANG Y. Remote sensing image fusion based on two-stream fusion network[J]. Information Fusion, 2020, 55: 1-15. DOI: 10.1016/j.inffus.2019.07.010
[29] XIE Q, ZHOU M, ZHAO Q, et al. MHF-Net: an interpretable deep network for multispectral and hyperspectral image fusion[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(3): 1457-1473. DOI: 10.1109/TPAMI.2020.3015691
[30] HU J F, HUANG T Z, DENG L J, et al. Hyperspectral image super-resolution via deep spatiospectral attention convolutional neural networks [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(12): 7251-7265. DOI: 10.1109/TNNLS.2021.3084682
[31] HU J F, HUANG T Z, DENG L J, et al. Fusformer: a transformer-based fusion network for hyperspectral image super-resolution[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.