Citation: DU Nini, SHAN Kaidong, WEI Shasha. LPformer: Laplacian Pyramid Multi-Level Transformer for Infrared Small Target Detection[J]. Infrared Technology, 2023, 45(6): 630-638.

LPformer: Laplacian Pyramid Multi-Level Transformer for Infrared Small Target Detection

  • Abstract: Infrared small target detection refers to segmenting small targets from infrared images, and it is important in applications such as fire detection systems and maritime surveillance and rescue systems. However, because targets are small, their features are inconspicuous, and backgrounds are complex, the detection performance of current infrared small target detection algorithms is generally limited. To address these problems, an infrared small target detection algorithm based on a Laplacian pyramid multi-level Transformer (LPformer) is designed in this study. First, because small infrared targets easily lose texture detail during network iteration, a Laplacian pyramid is used to extract high-frequency boundary information at different levels from the original input infrared image. This information is then fused with the features at different levels of the backbone network through a structural information conversion module, which compensates for the lost texture information. Next, to further improve the discriminative ability of the network, so that the false alarm rate is suppressed while detection accuracy improves, a channel-wise Transformer structure is adopted that treats each channel feature map as an image patch (token) and computes self-attention along the channel dimension. Experimental results show that the proposed algorithm achieves higher detection performance than current advanced detection algorithms.
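The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions: 2x2 average pooling and nearest-neighbour upsampling stand in for the Gaussian filtering of a standard Laplacian pyramid, the channel attention omits the learned query, key, and value projections a Transformer would use, and all function names are hypothetical. Image height and width are assumed to be divisible by 2 raised to the number of levels.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (stand-in for blur + decimation)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    """Return `levels` high-frequency residuals and the final low-frequency image."""
    residuals = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        # The residual holds the detail that downsampling would discard.
        residuals.append(current - upsample(low))
        current = low
    return residuals, current

def reconstruct(residuals, low):
    """Invert the pyramid: upsample and add back each residual, coarse to fine."""
    for residual in reversed(residuals):
        low = upsample(low) + residual
    return low

def channel_self_attention(feat):
    """Self-attention along the channel axis of a (C, H, W) feature tensor.

    Each (H, W) channel map is flattened into one token, so the attention
    matrix is (C, C). Assumption: Q = K = V = tokens, no learned weights.
    """
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w)
    scores = tokens @ tokens.T / np.sqrt(h * w)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return (weights @ tokens).reshape(c, h, w), weights
```

In this sketch the residuals play the role of the multi-level high-frequency information that would be fused into the backbone, and the attention cost scales with the number of channels squared rather than with the number of pixels squared.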

     
