FSTD-YOLO: A Focused Small Target Detection Algorithm for Aerial Imagery

季嘉豪; 袁程胜; 王一力

Abstract

To address the difficulty of recognizing tiny targets against complex backgrounds and the resulting low detection accuracy in UAV aerial imagery, this paper proposes Focused Small Target Detection-YOLO (FSTD-YOLO), an aerial image detection algorithm built on YOLOv8n. A Multi-Receptive Field Feature Extraction (MRFF) module is designed to strengthen the model's learning capability in complex backgrounds and to mitigate the loss of small-target information and the insufficient feature learning that arise as the network deepens. A micro-target detection head is introduced and the inefficient large-target detection head is removed, so that the model concentrates on small targets and spends less attention on targets of irrelevant sizes. A multi-attention mechanism module is also introduced to improve recognition of multi-scale targets. In the experiments, the Inner-MPDIoU loss function replaces the CIoU loss function; it measures the difference between predicted and ground-truth bounding boxes more precisely and speeds up convergence on low-IoU samples in aerial images. Experimental results show that, compared with YOLOv8n, FSTD-YOLO improves mAP@0.5 and mAP@0.5~0.95 on the VisDrone dataset by 9.1% and 6.1%, respectively. The number of parameters is reduced by 13.0% and the weight file size by 11.1%, which makes the model better suited to deployment on mobile devices.
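The abstract names the Inner-MPDIoU loss without giving its formula. The sketch below is one plausible reading, assuming the commonly published definitions of its two ingredients: Inner-IoU (IoU computed on auxiliary boxes scaled about each box center) and MPDIoU (IoU penalized by the squared distances between matching top-left and bottom-right corners, normalized by the squared image diagonal). The way the two are combined, the function names, and the default `ratio` are assumptions for illustration, not the authors' implementation.

```python
def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Auxiliary box: same center, width and height multiplied by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def inner_mpdiou_loss(pred, gt, img_w, img_h, ratio=0.75):
    """Assumed combination: 1 - inner IoU + normalized corner distances.

    ratio is a hypothetical default; the abstract does not state a value.
    """
    inner_iou = iou_xyxy(scale_box(pred, ratio), scale_box(gt, ratio))
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal
    d_tl = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2  # top-left corners
    d_br = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2  # bottom-right corners
    return 1.0 - inner_iou + d_tl / norm + d_br / norm


if __name__ == "__main__":
    # A slightly shifted prediction of a small box in a 640x640 image.
    print(inner_mpdiou_loss((10, 10, 20, 20), (12, 12, 22, 22), 640, 640))
```

Under this reading, the corner-distance terms stay nonzero even when the boxes do not overlap, so a low-IoU prediction still receives a signal that shrinks as it moves toward the ground truth, which is consistent with the convergence claim in the abstract.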
