Abstract:
Infrared small target detection (IRSTD) aims to detect and identify small, dim targets in infrared images with complex backgrounds. It is widely used in security, drones, military applications, and other fields. The main challenge is that infrared images usually have low resolution, low target contrast, and blurred textures, so small targets are easily lost in backgrounds containing noise and clutter. Accurately recovering the shape of small infrared targets therefore remains an important open problem. To address this problem, an IRSTD algorithm based on a frequency-domain information-guided transformer (FDGformer) network is proposed. First, the popular U-Net architecture is used to generate the target mask. Second, after exploring the frequency-domain information at different levels of infrared images, a Transformer-based frequency information extraction (FIE) module is constructed, in which self-attention computed on frequency-domain features enhances specific frequency components of the input features. Third, a spatial Transformer structure, guided by the frequency information and the frequency-domain enhanced features, is designed to integrate the infrared features. By combining global dependencies with salient frequency-domain information, the network can accurately identify the appearance characteristics of small targets. Experimental results on public datasets show that the algorithm achieves higher detection accuracy with fewer parameters than other advanced small-target detection algorithms, which favors its use in practical detection tasks.
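To make the idea of frequency-domain self-attention more concrete, the following is a minimal NumPy sketch. It is not the FIE module of FDGformer, whose exact design is not given in the abstract. The function name `frequency_self_attention`, the use of log-magnitude spectra as tokens, the channel-wise attention, the identity query/key projections, and the residual connection are all assumptions made for illustration only.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def frequency_self_attention(x):
    """Hypothetical sketch: re-weight feature spectra with channel-wise attention.

    x: real feature map of shape (C, H, W).
    Returns a real array of the same shape.
    """
    c, h, w = x.shape

    # 2-D FFT of every channel: complex spectrum of shape (C, H, W).
    spec = np.fft.fft2(x, axes=(-2, -1))

    # One token per channel, built from its log-magnitude spectrum (assumption).
    tokens = np.log1p(np.abs(spec)).reshape(c, -1)

    # Assumption: identity projections stand in for learned Q/K weights.
    q = k = tokens
    attn = softmax(q @ k.T / np.sqrt(tokens.shape[1]), axis=-1)  # (C, C)

    # Mix the complex spectra across channels with the attention weights.
    mixed = (attn @ spec.reshape(c, -1)).reshape(c, h, w)

    # Back to the spatial domain; real weights keep the result real up to
    # floating-point error, so the imaginary part is discarded.
    enhanced = np.fft.ifft2(mixed, axes=(-2, -1)).real

    # Residual connection (assumption) so the input features are preserved.
    return x + enhanced


rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 16, 16))  # toy (C, H, W) feature map
enhanced = frequency_self_attention(feats)
print(enhanced.shape)
```

In a trained network the query, key, and value projections would be learned, and the enhanced features would then guide a spatial attention stage; this sketch only shows how attention weights computed from spectra can re-weight frequency content before the inverse transform.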