
Cross-Modal Multilevel Feature Fusion-Based Algorithm for Power-Equipment Detection


    Abstract: A novel cross-modal multilevel feature fusion algorithm based on adaptive fusion and self-attention enhancement is proposed to address the low robustness of power-equipment detection algorithms and inaccurate small-target detection in complex environments. The algorithm first constructs a dual-stream feature-extraction network to extract multilevel target representations from visible-light and infrared images. An adaptive fusion module is introduced to capture complementary features from the visible-light and infrared branches, and a Transformer-based self-attention mechanism is then employed to enhance the semantic and spatial information of these complementary features. Finally, precise target localization is achieved using deep features at multiple scales. Experiments were conducted on a custom-built power-equipment dataset, and the results show that the proposed algorithm achieves a mean average precision (mAP50) of 91.7%, an improvement of 3.5 and 3.9 percentage points over the single visible-light branch and the single infrared branch, respectively, demonstrating effective cross-modal information fusion. Compared with current mainstream object-detection algorithms, it exhibits superior robustness.
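
    The two core operations described above — adaptively gating the two modality branches and then refining the fused map with self-attention — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the sigmoid gating design, the single-head attention, and all shapes and parameter names are illustrative assumptions standing in for the adaptive fusion module and the Transformer-based enhancement.

    ```python
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def adaptive_fusion(feat_vis, feat_ir, w):
        # Illustrative gating: a learned projection of both branches yields a
        # per-position weight in [0, 1] for the visible branch; the infrared
        # branch receives the complementary weight.
        gate = 1.0 / (1.0 + np.exp(-(feat_vis + feat_ir) @ w))
        return gate * feat_vis + (1.0 - gate) * feat_ir

    def self_attention(x, wq, wk, wv):
        # Single-head scaled dot-product attention over spatial tokens,
        # the basic building block of a Transformer encoder layer.
        q, k, v = x @ wq, x @ wk, x @ wv
        scores = q @ k.T / np.sqrt(k.shape[-1])
        return softmax(scores) @ v

    rng = np.random.default_rng(0)
    n_tokens, dim = 49, 64  # e.g. a 7x7 feature map flattened into tokens
    feat_vis = rng.standard_normal((n_tokens, dim))  # visible-light features
    feat_ir = rng.standard_normal((n_tokens, dim))   # infrared features
    w = rng.standard_normal((dim, 1))
    wq, wk, wv = (rng.standard_normal((dim, dim)) for _ in range(3))

    fused = adaptive_fusion(feat_vis, feat_ir, w)
    enhanced = self_attention(fused, wq, wk, wv)
    print(enhanced.shape)  # (49, 64): same token grid, attention-refined
    ```

    In a full detector, this fusion-plus-attention step would be applied at each level of the feature pyramid, and the resulting multi-scale maps passed to the detection head for localization.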

     
