Infrared Small Object Detection for UAV Imagery Based on VSA-YOLOv11n

马滔; 熊英灼; 潘庆娜; 杨海涛; 王华朋; 鄢喜爱; 陈睿

Abstract

To address the weak thermal signatures, complex background interference, and large scale variation of small objects in infrared imagery collected by uncrewed aerial vehicles (UAVs), we propose VSA-YOLOv11n, an improved lightweight detection model. Built on YOLOv11n, the model adopts the streamlined VanillaNet as its backbone to improve feature extraction efficiency while significantly reducing inference latency. A structured multi-scale convolutional module is introduced in the feature fusion stage to strengthen the model's perception of targets of different scales against complex backgrounds. An adaptive spatial feature fusion (ASFF) detection head then aggregates cross-scale semantic information at fine granularity and selectively enhances it, improving detection accuracy and robustness for small targets. Systematic experiments on the HIT-UAV infrared small object dataset show that the proposed model improves accuracy, latency, and parameter count together: it attains an mAP50 of 81.3% with an inference latency of 1.79 ms, outperforming existing mainstream lightweight detectors. These results indicate good practicality and deployability, particularly in low-altitude infrared scenarios with strict real-time requirements.
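The adaptive spatial feature fusion idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-channel feature maps that have already been resized to a common resolution, and it takes the per-pixel weight logits as inputs, whereas an ASFF head would typically predict them with small learned convolutions. The names `softmax` and `asff_fuse` are hypothetical helpers introduced only for this illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def asff_fuse(features, weight_logits):
    """Fuse K same-sized feature maps with per-pixel softmax weights.

    features:      list of K maps, each an H x W nested list of floats
                   (assumed already resized to one resolution).
    weight_logits: list of K maps of the same shape; in a real head these
                   would be predicted by the network (assumption: given here).
    """
    num_levels = len(features)
    height, width = len(features[0]), len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Weights at each pixel are non-negative and sum to 1 across levels.
            weights = softmax([weight_logits[k][i][j] for k in range(num_levels)])
            fused[i][j] = sum(w * features[k][i][j] for k, w in enumerate(weights))
    return fused


# Three pyramid levels, each a 1 x 2 map.
levels = [[[1.0, 0.0]], [[2.0, 0.0]], [[3.0, 6.0]]]
uniform = [[[0.0, 0.0]] for _ in levels]        # equal logits: plain average
favor_last = [[[0.0, 0.0]], [[0.0, 0.0]], [[50.0, 50.0]]]  # strongly select level 3

averaged = asff_fuse(levels, uniform)
selected = asff_fuse(levels, favor_last)
```

With equal logits the fusion reduces to a per-pixel average of the levels, while a dominant logit makes the output follow a single level. That per-pixel choice is the sense in which the fusion is "selective".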
