Abstract:
To address inadequate feature extraction, insufficient saliency of target regions in the fused image, and loss of detailed information in infrared-visible image fusion, this paper proposes a fusion method based on multi-scale contrast enhancement and a cross-modal interactive attention mechanism. The main components of the proposed method are as follows. 1) Multi-scale contrast enhancement module: designed to strengthen the intensity information of target regions and facilitate the fusion of complementary information from the infrared and visible images. 2) Dense connection block: employed for feature extraction to minimize information loss and maximize information utilization. 3) Cross-modal interactive attention mechanism: developed to capture the crucial information of both modalities and improve network performance. 4) Decomposition network: designed to decompose the fused image back into the source images, which drives the fused image to retain more scene detail and richer texture information. The proposed fusion framework was experimentally evaluated on the TNO dataset. The results show that the fused images obtained by this method exhibit salient target regions and rich texture detail, and that the method achieves strong fusion performance and generalization ability. The proposed method also outperforms the compared algorithms in both subjective assessment and objective evaluation.
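To illustrate component 3), the following is a minimal PyTorch sketch of a cross-modal interactive attention block in which each modality produces a spatial attention map that re-weights the other modality's features. The module name, the 1x1-convolution design, and all tensor shapes are assumptions for illustration; the abstract does not specify the mechanism's exact formulation.

```python
# Hypothetical sketch of a cross-modal interactive attention block.
# Assumption: each branch derives a single-channel spatial attention map
# that modulates the other branch's features, so infrared and visible
# streams guide one another. Not the paper's exact design.
import torch
import torch.nn as nn


class CrossModalAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions reduce each feature map to one attention channel
        self.attn_ir = nn.Conv2d(channels, 1, kernel_size=1)
        self.attn_vis = nn.Conv2d(channels, 1, kernel_size=1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor):
        # Attention from the infrared branch re-weights the visible
        # features, and vice versa (cross-modal interaction).
        w_ir = self.sigmoid(self.attn_ir(feat_ir))     # (B, 1, H, W)
        w_vis = self.sigmoid(self.attn_vis(feat_vis))  # (B, 1, H, W)
        return feat_ir * w_vis, feat_vis * w_ir


# Usage: exchange attention between 64-channel encoder feature maps.
if __name__ == "__main__":
    block = CrossModalAttention(channels=64)
    ir = torch.randn(1, 64, 128, 128)
    vis = torch.randn(1, 64, 128, 128)
    ir_out, vis_out = block(ir, vis)
    print(ir_out.shape, vis_out.shape)  # both torch.Size([1, 64, 128, 128])
```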