Small Object Detection for UAVs Based on DBB-YOLOv10s

杨海涛; 赵俊羽; 王瑞; 王华朋; 向裕婧; 鄢喜爱; 孙展明; 熊英灼

Abstract

Abstract: Small object detection from unmanned aerial vehicles (UAVs) is hampered by complex background interference and by targets that appear at many scales. To address these challenges, this study proposes a network model, DBB-YOLOv10s, built on the YOLOv10s baseline. The model adds three components: a dilated-convolution module (dilated convolution-based cross stage partial with 2 convolutions and feature fusion, DC2f) to enlarge the receptive field, a bidirectional feature pyramid network (BiFPN) for multiscale feature fusion, and a bottleneck attention module (BAM) to strengthen the model's focus on target regions. On the VisDrone2021 dataset, the model reaches a mean average precision (mAP) of 41.8% and an inference speed of 148 FPS with 6.59 M parameters. Compared with existing YOLO models and their improved variants, it maintains high detection accuracy in complex scenes while balancing real-time performance against computational cost, which makes it suitable for embedded systems and real-time UAV monitoring. The results indicate that the proposed algorithm significantly improves detection capability and generalization in complex scenes.
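The receptive-field gain attributed to dilated convolution can be illustrated with a short calculation. The sketch below is not the DC2f module itself: the abstract does not state kernel sizes or dilation rates, so the 3x3 kernels and the rates 1, 2, 3 are assumptions chosen only for illustration, and the helper names are hypothetical. It relies on the standard relation that a kernel of size k with dilation d spans d(k - 1) + 1 input positions.

```python
# Illustrative only: kernel sizes and dilation rates below are assumed,
# not taken from the paper.

def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input positions) covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers) -> int:
    """Receptive field of a stack of conv layers.

    layers: iterable of (kernel_size, dilation, stride) tuples,
    ordered from the input side to the output side.
    """
    rf = 1      # a single output position sees one input position
    jump = 1    # distance in input pixels between adjacent feature positions
    for kernel_size, dilation, stride in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Three plain 3x3 convolutions versus three 3x3 convolutions
# with assumed dilation rates 1, 2, 3 (all stride 1).
plain = receptive_field([(3, 1, 1), (3, 1, 1), (3, 1, 1)])
dilated = receptive_field([(3, 1, 1), (3, 2, 1), (3, 3, 1)])

print(plain)    # 7
print(dilated)  # 13
```

Under these assumed settings the dilated stack covers a 13x13 window instead of 7x7 with the same number of weights, which is the kind of context enlargement the abstract credits to DC2f.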
