Abstract:
This study addresses the low accuracy of infrared object detection, which is often caused by low contrast and scale discrepancies. To solve this challenge, a parallel-pooling and self-distillation YOLO (PPSD-YOLO) algorithm incorporating several key components was proposed. One of the main contributions of this study is the Fusion-P parallel pooling module, which smooths the surrounding pixels while preserving important details. In addition, a dedicated detection layer for small infrared objects was added to enhance detection accuracy for such targets, and its initial anchor boxes were optimized using the K-means++ algorithm. The proposed algorithm also includes a multiscale feature perception module (SA-RFE) in the neck, which fuses contextual information from multiple target scales for more accurate detection. During training, a modified self-distillation framework was used to rectify targets misdetected by the teacher model, improving the detection accuracy of the student model. The proposed algorithm was evaluated on the FLIR dataset, where PPSD-YOLO outperformed YOLOv7 by 2.7% in mean average precision (mAP). This improvement is attributable to the combined effect of the parallel pooling module, the small-object detection layer, the SA-RFE module, and the self-distillation framework. Overall, PPSD-YOLO offers a comprehensive solution to low detection accuracy in infrared object detection, and these findings should be useful for researchers and practitioners in computer vision.
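The anchor-box optimization mentioned above can be illustrated with a minimal sketch of K-means++ seeding followed by Lloyd iterations, using the common 1 − IoU distance on box (width, height) pairs. This is not the paper's implementation; the function names and the fixed iteration count are illustrative assumptions.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (w, h) pairs, assuming boxes share a common top-left corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    areas = boxes[:, 0] * boxes[:, 1]
    union = areas[:, None] + (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeanspp_anchors(boxes, k, n_iter=30, seed=0):
    """Cluster ground-truth (w, h) pairs into k anchors.

    K-means++ seeding: each new center is sampled with probability
    proportional to its 1 - IoU distance from the nearest existing center.
    """
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.integers(len(boxes))][None, :]
    while len(anchors) < k:
        d = (1.0 - iou_wh(boxes, anchors)).min(axis=1)   # distance to nearest center
        idx = rng.choice(len(boxes), p=d / d.sum())      # distance-weighted sampling
        anchors = np.vstack([anchors, boxes[idx]])
    for _ in range(n_iter):                              # standard Lloyd refinement
        assign = iou_wh(boxes, anchors).argmax(axis=1)   # max IoU = min distance
        for j in range(k):
            if (assign == j).any():
                anchors[j] = boxes[assign == j].mean(axis=0)
    return anchors
```

In practice, such clustering is run once on the training-set bounding boxes, and the resulting anchors are assigned to the small-object detection layer according to their scale.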