MSAF-YOLO: A High-Performance Aerial Image Small Object Detection Network

    Abstract: Small object detection in aerial images is difficult because of insufficient feature representation, background confusion, and dense object distribution. The difficulty grows when algorithms are deployed in environments with limited computational resources, where both accuracy and speed must be carefully optimized within that budget. To address these problems, this paper proposes a lightweight detection model, MSAF-YOLO, which integrates feature aggregation, feature enhancement, and spatial awareness to improve detection performance. The network structure is optimized by removing the P5 detection head, which serves large object detection, and adding a P2 detection head to strengthen the model's focus on small objects. In addition, three modules are proposed: the Multi-Scale Spatial Aggregation Module (MSAM), the Multi-Scale Edge Feature Enhancement Module (MEFEM), and the Channel-Spatial Attention Module (CSAM). They respectively enhance multi-scale feature fusion, local feature perception, and global cross-channel and cross-spatial correlation, without increasing model complexity. Experiments on the VisDrone2019 dataset show that, compared with the baseline model YOLOv8s, MSAF-YOLO improves mAP50 by 8.2% and mAP50-95 by 5.9% while reducing the number of parameters by 56.3%. These results demonstrate the effectiveness and practical value of the proposed method for small object detection.
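The effect of swapping the P5 head for a P2 head can be illustrated with simple arithmetic. The sketch below is not taken from the paper: it assumes the conventional YOLO pyramid strides (P2 = 4, P3 = 8, P4 = 16, P5 = 32) and a 640 × 640 input, and the function names are hypothetical. It reports the grid size of each head and how many grid cells a small object spans at each level.

```python
# Assumed strides of the conventional YOLO pyramid levels (not stated in the abstract).
STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32}

BASELINE_HEADS = ("P3", "P4", "P5")  # default YOLOv8 detection heads
MODIFIED_HEADS = ("P2", "P3", "P4")  # P5 removed, P2 added, as the abstract describes


def grid_size(level: str, image_size: int = 640) -> int:
    """Side length (in cells) of the feature map at a pyramid level."""
    return image_size // STRIDES[level]


def cells_spanned(object_px: float, level: str) -> float:
    """Approximate number of grid cells one side of an object covers."""
    return object_px / STRIDES[level]


def total_cells(heads, image_size: int = 640) -> int:
    """Total number of prediction locations across the given heads."""
    return sum(grid_size(h, image_size) ** 2 for h in heads)


if __name__ == "__main__":
    for name, heads in (("baseline", BASELINE_HEADS), ("modified", MODIFIED_HEADS)):
        sizes = [(h, grid_size(h)) for h in heads]
        print(name, sizes, "total cells:", total_cells(heads))
    # A hypothetical 16-pixel object, typical of small aerial targets.
    for level in STRIDES:
        print(level, cells_spanned(16, level))
```

Under these assumptions, a 16-pixel object covers 4 cells per side on P2 but only half a cell on P5, which is the usual motivation for adding a P2 head. The finer grid also means more prediction locations (33,600 versus 8,400 here), a compute cost to weigh against the accuracy gain.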

     
