Fusion of Hyperspectral and Multispectral Images Using a CNN Joint Multi-Scale Transformer
-
Abstract: Hyperspectral images carry rich spectral information, while multispectral images preserve fine geometric detail, so fusing a high-resolution multispectral image with a low-resolution hyperspectral image yields more comprehensive remote sensing data. However, most existing fusion networks are built on convolutional neural networks; for remote sensing images with complex structures, convolution operations limited by their kernel size tend to lose global context in the feature fusion stage. To ensure fusion quality, this study proposes a convolutional neural network (CNN) combined with a multi-scale transformer to fuse multispectral and hyperspectral images, uniting the feature extraction capability of the CNN with the global modeling strength of the transformer. The network divides the fusion task into two stages: feature extraction and fusion. In the feature extraction stage, separate CNN-based modules are designed for each input according to its characteristics. In the fusion stage, a multi-scale transformer module establishes long-range dependencies from local to global scales, and multilayer convolution layers then map the fused features to a high-resolution hyperspectral image. Experimental results on the CAVE and Harvard datasets show that the proposed algorithm produces higher-quality fused images than other classical algorithms.
-
Keywords:
- hyperspectral image
- multispectral image
- CNN
- transformer
- image fusion
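To make the two-stage design concrete, below is a minimal PyTorch sketch of the architecture the abstract describes: CNN branches extract features from the low-resolution hyperspectral image (LR-HSI) and the high-resolution multispectral image (HR-MSI), a multi-scale transformer applies window self-attention at increasing window sizes to build context from local to global, and convolution layers map the fused features to the high-resolution hyperspectral output. All module names, channel widths, window sizes, and the bicubic pre-upsampling are illustrative assumptions, not the authors' published implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvExtractor(nn.Module):
    """Plain CNN feature extractor; the paper designs one module per input modality."""

    def __init__(self, in_ch: int, feat_ch: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class WindowAttention(nn.Module):
    """Self-attention restricted to non-overlapping windows of size `window`."""

    def __init__(self, dim: int, window: int, heads: int = 4):
        super().__init__()
        self.window = window
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W)
        B, C, H, W = x.shape
        w = self.window
        assert H % w == 0 and W % w == 0, "spatial size must be divisible by window"
        # partition the feature map into (H/w)*(W/w) windows of w*w tokens each
        t = x.view(B, C, H // w, w, W // w, w)
        t = t.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        t = self.norm(t)
        out, _ = self.attn(t, t, t)
        # undo the window partition and add a residual connection
        out = out.reshape(B, H // w, W // w, w, w, C)
        out = out.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return x + out


class MultiScaleTransformer(nn.Module):
    """Window attention at growing window sizes: local context first, then global."""

    def __init__(self, dim: int, windows=(4, 8, 16)):
        super().__init__()
        self.stages = nn.ModuleList(WindowAttention(dim, w) for w in windows)

    def forward(self, x):
        for stage in self.stages:
            x = stage(x)
        return x


class FusionNet(nn.Module):
    """Two-stage fusion: CNN feature extraction, then transformer-based fusion."""

    def __init__(self, hsi_bands: int = 31, msi_bands: int = 3, feat_ch: int = 64):
        super().__init__()
        self.hsi_branch = ConvExtractor(hsi_bands, feat_ch)
        self.msi_branch = ConvExtractor(msi_bands, feat_ch)
        self.fusion = MultiScaleTransformer(2 * feat_ch)
        self.recon = nn.Sequential(  # map fused features to the HR-HSI
            nn.Conv2d(2 * feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, hsi_bands, 3, padding=1),
        )

    def forward(self, lr_hsi, hr_msi):
        # bicubic pre-upsampling of the LR-HSI is an assumption of this sketch
        up = F.interpolate(lr_hsi, size=hr_msi.shape[-2:], mode="bicubic",
                           align_corners=False)
        feats = torch.cat([self.hsi_branch(up), self.msi_branch(hr_msi)], dim=1)
        return self.recon(self.fusion(feats))


if __name__ == "__main__":
    net = FusionNet()
    lr_hsi = torch.randn(1, 31, 16, 16)  # 31-band LR-HSI (CAVE-like band count)
    hr_msi = torch.randn(1, 3, 64, 64)   # RGB HR-MSI
    print(net(lr_hsi, hr_msi).shape)     # torch.Size([1, 31, 64, 64])
```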
-
Table 1 Fusion results of different methods on the CAVE dataset
| Methods | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- |
| CSTF | 45.14 | 4.4 | 1.44 | 0.9873 |
| NLSTF | 47.69 | 3.47 | 1.22 | 0.9887 |
| NLSTF-SMBF | 42.82 | 6.12 | 2.06 | 0.9818 |
| SSRNET | 44.41 | 2.60 | 1.40 | 0.9928 |
| ResTFNet | 45.99 | 2.19 | 1.07 | 0.9947 |
| MHF-Net | 48.40 | 2.01 | 0.85 | 0.9973 |
| HSRnet | 48.82 | 1.86 | 0.79 | 0.9978 |
| Ours | 49.57 | 1.77 | 0.67 | 0.9986 |

Note: red font indicates the best result; blue font indicates the second best.
Table 2 Fusion results of different methods on the Harvard dataset
| Methods | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- |
| CSTF | 45.43 | 3.20 | 2.14 | 0.9814 |
| NLSTF | 46.30 | 3.05 | 1.91 | 0.9826 |
| NLSTF-SMBF | 45.52 | 3.43 | 2.04 | 0.9819 |
| SSRNET | 46.25 | 2.26 | 1.56 | 0.9939 |
| ResTFNet | 46.89 | 1.98 | 1.42 | 0.9946 |
| MHF-Net | 47.62 | 1.79 | 1.18 | 0.9965 |
| HSRnet | 48.28 | 1.69 | 1.09 | 0.9972 |
| Ours | 48.83 | 1.58 | 0.94 | 0.9978 |

Note: red font indicates the best result; blue font indicates the second best.
Table 3 Fusion results of different methods on the PU (Pavia University) dataset
| Methods | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- |
| CSTF | 45.87 | 2.10 | 1.25 | 0.9824 |
| NLSTF | 46.41 | 1.97 | 1.20 | 0.9831 |
| NLSTF-SMBF | 45.23 | 2.12 | 1.34 | 0.9809 |
| SSRNET | 47.25 | 1.43 | 1.16 | 0.9918 |
| ResTFNet | 48.44 | 1.98 | 0.90 | 0.9940 |
| MHF-Net | 48.83 | 1.40 | 0.885 | 0.9943 |
| HSRnet | 49.18 | 1.22 | 0.74 | 0.9954 |
| Ours | 50.08 | 0.989 | 0.56 | 0.9963 |

Note: red font indicates the best result; blue font indicates the second best.
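PSNR, SAM, ERGAS, and SSIM in Tables 1-3 are the standard quality metrics for hyperspectral fusion. As a reference point, here is a hedged NumPy sketch of one common way to compute the first three; conventions vary between papers (per-band versus global PSNR, the data range, and the ERGAS scale ratio), so this is not necessarily the authors' exact protocol. SSIM is usually taken from an off-the-shelf implementation such as skimage.metrics.structural_similarity.

```python
import numpy as np


def psnr(ref: np.ndarray, est: np.ndarray, data_range: float = 1.0) -> float:
    """Mean of per-band PSNR in dB; inputs are (H, W, bands) arrays."""
    mse = np.mean((ref - est) ** 2, axis=(0, 1))  # per-band MSE
    return float(np.mean(10.0 * np.log10(data_range ** 2 / mse)))


def sam(ref: np.ndarray, est: np.ndarray, eps: float = 1e-12) -> float:
    """Mean spectral angle in degrees between per-pixel spectra."""
    dot = np.sum(ref * est, axis=2)
    denom = np.linalg.norm(ref, axis=2) * np.linalg.norm(est, axis=2) + eps
    return float(np.degrees(np.mean(np.arccos(np.clip(dot / denom, -1.0, 1.0)))))


def ergas(ref: np.ndarray, est: np.ndarray, ratio: int = 8) -> float:
    """Relative dimensionless global error. `ratio` is the spatial scale factor
    between HR and LR images; the default 8 is an assumption, not the paper's value."""
    rmse = np.sqrt(np.mean((ref - est) ** 2, axis=(0, 1)))  # per-band RMSE
    mean = np.mean(ref, axis=(0, 1))                        # per-band mean
    return float(100.0 / ratio * np.sqrt(np.mean((rmse / mean) ** 2)))


if __name__ == "__main__":
    ref = np.random.rand(64, 64, 31)
    est = np.clip(ref + 0.01 * np.random.randn(64, 64, 31), 0.0, 1.0)
    print(psnr(ref, est), sam(ref, est), ergas(ref, est))
```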
Table 4 Analysis of the effectiveness of network modules
| Feature extractor | Fusion network | PSNR/dB | SAM | ERGAS | SSIM |
| --- | --- | --- | --- | --- | --- |
| √ | × (multi-scale Transformer) | 46.48 | 2.13 | 1.05 | 0.9959 |
| × | √ (multi-scale Transformer) | 48.52 | 1.94 | 0.89 | 0.9978 |
| √ | √ (Fusformer) | 48.76 | 1.89 | 0.82 | 0.9981 |
| √ | √ (multi-scale Transformer) | 49.57 | 1.77 | 0.67 | 0.9986 |
-
[1] YANG Q, XU Y, WU Z, et al. Hyperspectral and multispectral image fusion based on deep attention network[C]//Proceedings of the 2019 10th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS), 2019: 1-5.
[2] TONG Qingxi, ZHANG Bing, ZHANG Lifu. Current progress of hyperspectral remote sensing in China[J]. Journal of Remote Sensing, 2016, 20(5): 689-707. (in Chinese)
[3] AKBARI H, KOSUGI Y, KOJIMA K, et al. Detection and analysis of the intestinal ischemia using visible and invisible hyperspectral imaging[J]. IEEE Transactions on Biomedical Engineering, 2010, 57(8): 2011-2017. DOI: 10.1109/TBME.2010.2049110
[4] PAN Z, HEALEY G, PRASAD M, et al. Face recognition in hyperspectral images[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(12): 1552-1560. DOI: 10.1109/TPAMI.2003.1251148
[5] DIAN R, LI S, SUN B, et al. Recent advances and new guidelines on hyperspectral and multispectral image fusion[J]. Information Fusion, 2021, 69: 40-51. DOI: 10.1016/j.inffus.2020.11.001
[6] FASBENDER D, RADOUX J, BOGAERT P. Bayesian data fusion for adaptable image pansharpening[J]. IEEE Transactions on Geoscience and Remote Sensing, 2008, 46(6): 1847-1857. DOI: 10.1109/TGRS.2008.917131
[7] CHEN Z, PU H, WANG B, et al. Fusion of hyperspectral and multispectral images: a novel framework based on generalization of pan-sharpening methods[J]. IEEE Geoscience and Remote Sensing Letters, 2014, 11(8): 1418-1422. DOI: 10.1109/LGRS.2013.2294476
[8] SELVA M, AIAZZI B, BUTERA F, et al. Hyper-sharpening of hyperspectral data: A first approach[C]//Proceedings of the 2014 6th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), 2014: 24-27.
[9] ZURITA-MILLA R, CLEVERS J G P W, SCHAEPMAN M E. Unmixing-based Landsat TM and MERIS FR data fusion[J]. IEEE Geoscience and Remote Sensing Letters, 2008, 5(3): 453-457. DOI: 10.1109/LGRS.2008.919685
[10] YOKOYA N, YAIRI T, IWASAKI A. Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion[J]. IEEE Transactions on Geoscience and Remote Sensing, 2012, 50(2): 528-537. DOI: 10.1109/TGRS.2011.2161320
[11] LANARAS C, BALTSAVIAS E, SCHINDLER K. Hyperspectral super-resolution by coupled spectral unmixing[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), 2015: 3586-3594.
[12] WEI Q, DOBIGEON N, TOURNERET J Y. Fast fusion of multi-band images based on solving a Sylvester equation[J]. IEEE Transactions on Image Processing, 2015, 24(11): 4109-4121. DOI: 10.1109/TIP.2015.2458572
[13] AKHTAR N, SHAFAIT F, MIAN A. Bayesian sparse representation for hyperspectral image super resolution[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015: 7-12.
[14] SUN Jiamin, SONG Huihui. Hyperspectral and multispectral image fusion based on discrete wavelet transform and generative adversarial networks[J]. Radio Engineering, 2021, 51(12): 1434-1441. (in Chinese)
[15] DIAN R, FANG L, LI S. Hyperspectral image super-resolution via non-local sparse tensor factorization[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 21-26.
[16] DIAN R, LI S, FANG L, et al. Nonlocal sparse tensor factorization for semiblind hyperspectral and multispectral image fusion[J]. IEEE Transactions on Cybernetics, 2020, 50(10): 4469-4480. DOI: 10.1109/TCYB.2019.2951572
[17] LI S, DIAN R, FANG L, et al. Fusing hyperspectral and multispectral images via coupled sparse tensor factorization[J]. IEEE Transactions on Image Processing, 2018, 27(8): 4118-4130. DOI: 10.1109/TIP.2018.2836307
[18] KANATSOULIS C I, FU X, SIDIROPOULOS N D, et al. Hyperspectral super-resolution: a coupled tensor factorization approach[J]. IEEE Transactions on Signal Processing, 2018, 66(24): 6503-6517. DOI: 10.1109/TSP.2018.2876362
[19] LI J, ZHENG K, YAO J, et al. Deep unsupervised blind hyperspectral and multispectral data fusion[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.
[20] WANG X, WANG X, ZHAO K, et al. FSL-Unet: full-scale linked unet with spatial–spectral joint perceptual attention for hyperspectral and multispectral image fusion[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-14.
[21] DONG M, LI W, LIANG X, et al. MDCNN: multispectral pansharpening based on a multiscale dilated convolutional neural network[J]. Journal of Applied Remote Sensing, 2021, 15: 036516.
[22] BENZENATI T, KESSENTINI Y, KALLEL A. Pansharpening approach via two-stream detail injection based on relativistic generative adversarial networks[J]. Expert Systems with Applications, 2022, 188: 115996. DOI: 10.1016/j.eswa.2021.115996
[23] FU Y, XU T, WU X, et al. PPT Fusion: pyramid patch transformer for a case study in image fusion[J]. arXiv preprint arXiv:2107.13967, 2021.
[24] GAO S H, CHENG M M, ZHAO K, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(2): 652-662. DOI: 10.1109/TPAMI.2019.2938758
[25] LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 10-17.
[26] KINGMA D P, BA J. Adam: a method for stochastic optimization[J/OL]. arXiv preprint arXiv:1412.6980, 2014. https://doi.org/10.48550/arXiv.1412.6980
[27] ZHANG X, HUANG W, WANG Q, et al. SSR-NET: spatial–spectral reconstruction network for hyperspectral and multispectral image fusion[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(7): 5953-5965. DOI: 10.1109/TGRS.2020.3018732
[28] LIU X, LIU Q, WANG Y. Remote sensing image fusion based on two-stream fusion network[J]. Information Fusion, 2020, 55: 1-15. DOI: 10.1016/j.inffus.2019.07.010
[29] XIE Q, ZHOU M, ZHAO Q, et al. MHF-Net: an interpretable deep network for multispectral and hyperspectral image fusion[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(3): 1457-1473. DOI: 10.1109/TPAMI.2020.3015691
[30] HU J F, HUANG T Z, DENG L J, et al. Hyperspectral image super-resolution via deep spatiospectral attention convolutional neural networks [J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(12): 7251-7265. DOI: 10.1109/TNNLS.2021.3084682
[31] HU J F, HUANG T Z, DENG L J, et al. Fusformer: a transformer-based fusion network for hyperspectral image super-resolution[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5.