YOLO Infrared Target Detection Algorithm Combining Parallel Pooling and Self-Distillation

周华平; 吴劲; 李敬兆; 吴涛

Abstract

Abstract: To address the low detection accuracy caused by low contrast and large scale differences in infrared object detection, this paper proposes PPSD-YOLO (parallel pooling and self-distillation YOLO), an infrared object detection algorithm. First, a parallel pooling module, Fusion-P, is constructed to smooth the pixels around a target and prevent target pixels from being lost. Second, to improve accuracy on small infrared targets, a small-object detection layer is added to the network, and the K-means++ algorithm is used to optimize the sizes and aspect ratios of its initial anchor boxes. Third, a multiscale feature perception module (SA-RFE) is introduced in the neck layer; it fuses multiscale contextual information about the target through a multi-branch dilated convolution structure. Finally, a corrective self-distillation framework is built for training. It dynamically corrects falsely detected targets in the teacher model, which improves the detection accuracy of the student model. Experiments on the FLIR infrared dataset show that PPSD-YOLO improves mAP by 2.7% over YOLOv7. The improvement is attributed to the combination of the parallel pooling module, the small-object detection layer, the SA-RFE module, and the self-distillation framework.
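The anchor optimization step mentioned in the abstract can be illustrated with a minimal sketch of K-means++ clustering over bounding-box widths and heights. This is not the authors' implementation: the abstract does not state the distance metric, so the sketch assumes the `1 - IoU` distance commonly used for YOLO anchor clustering, and the function names (`iou_wh`, `kmeanspp_init`, `cluster_anchors`) and the sample boxes are hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeanspp_init(boxes, k, rng):
    """K-means++ seeding: sample each new centroid with probability
    proportional to its squared distance to the nearest chosen centroid.
    Distance is assumed to be 1 - IoU (not specified in the abstract)."""
    centroids = [rng.choice(boxes)]
    while len(centroids) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centroids) ** 2 for b in boxes]
        if sum(d2) == 0:
            # Every box coincides with an existing centroid; fall back to uniform.
            centroids.append(rng.choice(boxes))
        else:
            centroids.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centroids


def cluster_anchors(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    centroids = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        # Assignment step: each box joins the centroid with the highest IoU.
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            groups[best].append(b)
        # Update step: mean width and height; keep the old centroid if empty.
        new = []
        for i, g in enumerate(groups):
            if g:
                new.append((sum(w for w, _ in g) / len(g),
                            sum(h for _, h in g) / len(g)))
            else:
                new.append(centroids[i])
        if new == centroids:
            break
        centroids = new
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Hypothetical ground-truth box sizes in pixels, for illustration only.
boxes = [(8, 10), (9, 12), (10, 11), (30, 60), (32, 64),
         (28, 58), (100, 80), (110, 90), (95, 85)]
anchors = cluster_anchors(boxes, k=3)
```

In practice the box sizes would come from the training labels, and the number of clusters would match the total number of anchors across all detection layers, including the added small-object layer.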
