Abstract:
Owing to the problems of high noise and poor contrast in infrared images, the accuracy of target detection is easily reduced. Here, an improved YOLOX model combined with YOLOX and a Swin Transformer is proposed. To improve the feature extraction ability, reduce the activation functions and standardization layers of the neck and head parts in YOLOX, and optimize the network structure, the Swin Transformer is used to replace the CSPDarknet backbone extraction network in YOLOX. This study tests the improved model on both the InfiRay and FILR datasets. The obtained experimental results indicate that the improved YOLOX network has significantly improved the average detection accuracy on both datasets and is more suitable for infrared image target detection.