Abstract:
A multiscale, convolutional-attention-based infrared and visible image fusion algorithm is proposed to address insufficient single-scale feature extraction and the loss of details, such as infrared targets and visible textures, when fusing infrared and visible images. First, an encoder network combining a multiscale feature extraction module and a deformable convolutional attention module is designed to extract important feature information from the infrared and visible images across multiple receptive fields. Subsequently, a fusion strategy based on spatial and channel dual-attention mechanisms is adopted to further fuse the typical features of the infrared and visible images. Finally, a decoder network composed of three convolutional layers reconstructs the fused image. Additionally, a hybrid loss function based on mean squared error, multiscale structural similarity, and color is designed to constrain network training and further improve the similarity between the fused and source images. The proposed algorithm is compared with seven image fusion algorithms on a public dataset. In both subjective and objective evaluations, it exhibits better edge preservation and source-image information retention, and produces higher-quality fused images than the other algorithms.
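To make the described pipeline concrete, the following is a minimal PyTorch-style sketch of the three-convolutional-layer decoder and the hybrid loss outlined above. It is illustrative only, not the authors' implementation: the layer widths, the three-channel output, the weights `w_mse`/`w_ssim`/`w_color`, and the channel-mean color term are assumptions, since the abstract does not give exact formulations; `ms_ssim` stands for any multiscale-SSIM callable returning values in [0, 1] (e.g., from the pytorch_msssim package).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Decoder(nn.Module):
    """Three-convolutional-layer decoder that reconstructs the fused image.

    Channel widths and the 3-channel output are illustrative assumptions.
    """
    def __init__(self, in_ch=64, out_ch=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, out_ch, 3, padding=1), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, feats):
        return self.body(feats)

def hybrid_loss(fused, ir, vis, ms_ssim, w_mse=1.0, w_ssim=1.0, w_color=0.5):
    """Hybrid loss combining MSE, multiscale structural similarity, and color.

    fused, vis: (B, 3, H, W); ir: (B, 1, H, W), broadcast to three channels.
    The weights and the color term below are stand-ins, not the paper's exact loss.
    """
    ir3 = ir.expand_as(fused)
    # MSE term: keeps fused pixel intensities close to both source images.
    mse = F.mse_loss(fused, ir3) + F.mse_loss(fused, vis)
    # MS-SSIM term: preserves structure across scales (1 - similarity).
    ssim = (1 - ms_ssim(fused, ir3)) + (1 - ms_ssim(fused, vis))
    # Color term: per-channel mean difference against the visible image.
    color = F.l1_loss(fused.mean(dim=(2, 3)), vis.mean(dim=(2, 3)))
    return w_mse * mse + w_ssim * ssim + w_color * color
```

Taking the MS-SSIM implementation as a parameter keeps the sketch self-contained; in practice one would pass, for example, `pytorch_msssim.ms_ssim` and call `hybrid_loss(decoder(features), ir, vis, ms_ssim_fn)` inside the training loop.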