Method for Infrared and Visible Image Fusion Combining CNN–Transformer Feature Interaction

Abstract: Fused images often suffer from unevenly distributed infrared features, indistinct contours, and loss of important background information because the interaction between CNN- and Transformer-extracted features is insufficiently exploited. To address these issues, this paper proposes a novel infrared and visible image fusion network built on CNN–Transformer feature interaction. First, a new spatial-channel hybrid attention mechanism is designed to improve the efficiency of extracting both global and local features, yielding hybrid feature blocks. Second, CNN–Transformer feature interaction is used to fuse the hybrid feature blocks, and a multiscale reconstruction network is constructed to reconstruct the fused image from these features. Finally, comparative fusion experiments on the TNO dataset are conducted between the proposed network and nine other fusion networks. The results show that the fused images produced by the proposed network offer excellent visual quality: they highlight infrared features and object contours while preserving rich background texture details. On the EN, SD, AG, SF, SCD, and VIF metrics, the proposed network improves on existing fusion networks by approximately 64.73%, 8.17%, 69.05%, 66.34%, 15.39%, and 25.66% on average, respectively. Ablation experiments further confirm the effectiveness of the model.
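The abstract does not detail the internal structure of the spatial-channel hybrid attention mechanism. As a rough illustration only, the following PyTorch sketch assumes a CBAM-style formulation (channel attention followed by spatial attention); the class name, reduction ratio, and 7×7 spatial kernel are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of a spatial-channel hybrid attention block, ASSUMING a
# CBAM-style design (channel attention, then spatial attention). Layer sizes,
# the reduction ratio, and the 7x7 kernel are illustrative assumptions.
import torch
import torch.nn as nn

class SpatialChannelHybridAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze spatial dims, reweight each channel.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )
        # Spatial attention: reweight each location from pooled channel maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention from average- and max-pooled descriptors.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention from per-location channel statistics.
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

# Usage: produce a "hybrid feature block" from a branch feature map.
feat = torch.randn(1, 64, 128, 128)          # e.g., infrared-branch features
hybrid = SpatialChannelHybridAttention(64)(feat)
```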
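Likewise, the CNN–Transformer feature interaction admits many designs. A minimal cross-attention sketch is given below, assuming CNN feature maps act as queries against Transformer-branch tokens as keys and values; all names, shapes, and the residual/normalization choices are illustrative assumptions rather than the paper's scheme.

```python
# Minimal sketch of CNN-Transformer feature interaction via cross-attention,
# ASSUMING CNN features provide queries and Transformer tokens provide
# keys/values. The paper's actual interaction scheme may differ.
import torch
import torch.nn as nn

class FeatureInteraction(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cnn_feat: torch.Tensor, trans_tokens: torch.Tensor) -> torch.Tensor:
        b, c, h, w = cnn_feat.shape
        q = cnn_feat.flatten(2).transpose(1, 2)            # (B, H*W, C) queries from CNN
        fused, _ = self.attn(q, trans_tokens, trans_tokens)
        fused = self.norm(fused + q)                       # residual connection
        return fused.transpose(1, 2).reshape(b, c, h, w)   # back to feature-map layout

# Usage with hypothetical shapes: 64-dim features, 256 Transformer tokens.
cnn_feat = torch.randn(1, 64, 32, 32)
trans_tokens = torch.randn(1, 256, 64)
fused = FeatureInteraction(64)(cnn_feat, trans_tokens)
```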
