Abstract:
The accuracy with which substation equipment is segmented in infrared images captured by a UAV directly affects the results of thermal fault diagnosis. To address the low segmentation accuracy of substation equipment against complex infrared backgrounds, we propose a multimodal path aggregation network (MPAN) that fuses visible and infrared images. First, we extract features from the images of both modalities; because the two modalities differ in feature space, we propose an adaptive feature fuse module (AFFM) to fuse their features fully. Second, we add a bottom-up pyramid network with lateral connections to the multi-scale features of the backbone as a path enhancement. Finally, we use the Dice coefficient to optimize the mask loss function. Experimental results show that fusing multimodal images enhances segmentation performance and verify the effectiveness of the proposed modules, which significantly improve the accuracy of instance segmentation of substation equipment in infrared images.
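As a minimal illustration of a Dice-based mask loss, the sketch below computes the standard soft Dice loss, 1 - 2|P∩T| / (|P| + |T|), for a single flattened mask. The function name `dice_loss`, the smoothing term `eps`, and the flattened-list input format are assumptions made for this example; the abstract does not specify how the Dice coefficient is combined with the other terms of the mask loss.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for one flattened mask (hypothetical helper).

    pred   -- predicted foreground probabilities in [0, 1]
    target -- ground-truth labels, 0 or 1, same length as pred
    eps    -- small smoothing term (an assumption) that avoids
              division by zero when both masks are empty
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Soft intersection: sum of element-wise products.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Sum of the two mask "areas".
    total = sum(pred) + sum(target)
    dice = (2.0 * intersection + eps) / (total + eps)
    return 1.0 - dice


if __name__ == "__main__":
    # Perfect overlap gives a loss near 0; no overlap gives a loss near 1.
    print(dice_loss([1.0, 1.0, 0.0, 0.0], [1, 1, 0, 0]))
    print(dice_loss([1.0, 1.0, 0.0, 0.0], [0, 0, 1, 1]))
```

Because the Dice coefficient is a ratio of overlap to total mask area, it is less sensitive than a per-pixel loss to the imbalance between small equipment regions and large backgrounds.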