Abstract:
In the field of infrared and visible image fusion, traditional fusion methods rely on a single type of convolution operation to extract local features, which can leave the fused result lacking semantic content and fine texture details. This paper therefore proposes a two-branch network for image fusion. First, to improve the model's ability to describe both the low-level details and the high-level semantics of the image, the source images are fed into a two-branch encoder that extracts detail information and semantic information in separate, parallel branches. Second, gradient residual dense blocks reinforce the encoding network, enhancing the model's ability to describe fine-grained information. The feature fusion network then adopts a bilateral guided aggregation strategy to fuse the deep features of the two branches. Finally, the proposed method was compared with seven other fusion methods on the public TNO dataset, and ablation experiments were conducted. The results show that the fused images obtained by the proposed method are rich in information, better suited to human visual perception, and hold significant advantages in objective indices such as peak signal-to-noise ratio, sum of difference correlation, and visual information fidelity.
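To make the described architecture concrete, the sketch below shows one plausible arrangement of the pieces named above: a detail branch built on a gradient residual dense block, a parallel semantic branch, and a bilateral aggregation layer that fuses the two. This is a minimal PyTorch sketch under assumed layer shapes and hyperparameters; the class names (GradientResidualDenseBlock, TwoBranchEncoder, BilateralAggregation), the Sobel-based gradient residual, and the sigmoid gating rule are all illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradientResidualDenseBlock(nn.Module):
    """Dense convolutions plus a gradient residual path (assumed Sobel form)."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(2 * ch, ch, 3, padding=1)
        # Fixed Sobel kernels approximate the per-channel image gradient.
        sobel = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        self.register_buffer("kx", sobel.view(1, 1, 3, 3).repeat(ch, 1, 1, 1))
        self.ch = ch

    def forward(self, x):
        d1 = F.leaky_relu(self.conv1(x))
        d2 = F.leaky_relu(self.conv2(torch.cat([x, d1], dim=1)))  # dense reuse
        grad = F.conv2d(x, self.kx, padding=1, groups=self.ch)    # gradient residual
        return d2 + grad

class TwoBranchEncoder(nn.Module):
    """Parallel detail and semantic branches over a shared stem (assumed layout)."""
    def __init__(self, ch=16):
        super().__init__()
        self.stem = nn.Conv2d(1, ch, 3, padding=1)
        self.detail = GradientResidualDenseBlock(ch)   # fine-grained branch
        self.semantic = nn.Sequential(                 # downsampled context branch
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.LeakyReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.LeakyReLU(),
        )

    def forward(self, x):
        f = F.leaky_relu(self.stem(x))
        det = self.detail(f)
        sem = F.interpolate(self.semantic(f), size=det.shape[-2:],
                            mode="bilinear", align_corners=False)
        return det, sem

class BilateralAggregation(nn.Module):
    """Each branch gates the other before fusion (assumed aggregation rule)."""
    def __init__(self, ch=16):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, det, sem):
        det_g = det * torch.sigmoid(sem)   # semantics guide the detail features
        sem_g = sem * torch.sigmoid(det)   # details guide the semantic features
        return self.fuse(torch.cat([det_g, sem_g], dim=1))

# Usage on a single-channel (e.g. infrared) image patch:
enc, agg = TwoBranchEncoder(), BilateralAggregation()
ir = torch.rand(1, 1, 128, 128)
fused = agg(*enc(ir))
print(fused.shape)  # torch.Size([1, 16, 128, 128])
```

In this reading, the gradient residual keeps high-frequency texture flowing through the detail branch, while the strided semantic branch trades resolution for context; the bilateral layer lets each branch modulate the other rather than simply concatenating them.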