Improved YOLOv10-n Dense Pedestrian Detection Model

  • Abstract: In dense pedestrian detection, the large number of targets, together with interference factors such as occlusion and complex backgrounds, readily leads to insufficient detection accuracy, missed detections, and false detections. To address these challenges, MSD-YOLO, an improved YOLOv10-n model (C2f-MCLU, STNet, DyHead-DCNv4), is developed. A C2f-MCLU module is proposed that strengthens feature representation by establishing close interdependence between the channel dimension and spatial position. A small-target-enhanced bidirectional fusion pyramid structure is designed to reconstruct the neck network, enabling the model to extract finer detail features. A DyHead-DCNv4 detection head is constructed to further improve recognition of severely occluded crowds. Experimental results show that, compared with YOLOv10-n, the improved model raises accuracy on the CrowdHuman and WiderPerson datasets by 3.3% and 1.5%, respectively, with only 3.0M parameters and 10.6 GFLOPs, meeting the requirements for high-accuracy deployment in low-compute environments.
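The abstract does not spell out the internal design of C2f-MCLU. As a concrete picture of "coupling channel weights with spatial position", the following PyTorch sketch shows one plausible arrangement: a C2f-style split/concat block whose fused output passes through a generic channel–spatial attention unit. The class names and the interaction design here are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: a C2f-style block followed by a generic channel–spatial
# interaction unit (CBAM-like), illustrating the idea of tying per-channel
# weights to spatial position. All names below are assumptions for illustration.
import torch
import torch.nn as nn


class ChannelSpatialUnit(nn.Module):
    """Illustrative channel–spatial interaction (an assumption, not the paper's MCLU)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        hidden = max(channels // reduction, 8)
        # Channel branch: global context -> per-channel weights.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, hidden, 1), nn.SiLU(),
            nn.Conv2d(hidden, channels, 1), nn.Sigmoid(),
        )
        # Spatial branch: per-position weights from pooled channel statistics.
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel(x)                      # reweight channels
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(pooled)              # reweight spatial positions


class C2fMCLUSketch(nn.Module):
    """C2f-style split/concat block with the interaction unit on the fused output."""
    def __init__(self, c_in, c_out, n=1):
        super().__init__()
        c_mid = c_out // 2
        self.cv1 = nn.Conv2d(c_in, 2 * c_mid, 1)
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c_mid, c_mid, 3, padding=1), nn.SiLU())
            for _ in range(n)
        )
        self.cv2 = nn.Conv2d((2 + n) * c_mid, c_out, 1)
        self.interact = ChannelSpatialUnit(c_out)

    def forward(self, x):
        y = list(self.cv1(x).chunk(2, dim=1))
        for m in self.blocks:
            y.append(m(y[-1]))
        return self.interact(self.cv2(torch.cat(y, dim=1)))


if __name__ == "__main__":
    out = C2fMCLUSketch(64, 64, n=2)(torch.randn(1, 64, 80, 80))
    print(out.shape)  # torch.Size([1, 64, 80, 80])
```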

     

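The abstract likewise gives no detail on DyHead-DCNv4 beyond combining a dynamic head with DCNv4 deformable convolution. The sketch below is a minimal stand-in that uses torchvision's DeformConv2d (a DCNv2-style modulated deformable convolution) to show how learned sampling offsets let a head convolution follow the visible parts of partially occluded pedestrians; the actual model substitutes DCNv4 inside a DyHead-style attention head.

```python
# Hedged sketch: deformable convolution in a detection-head branch. DeformConv2d
# here is a stand-in for DCNv4; the module name and offset/mask wiring are
# illustrative assumptions, not the authors' DyHead-DCNv4 implementation.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableHeadConvSketch(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        self.split = 2 * k * k
        # Predict 2*k*k sampling offsets and k*k modulation masks from the feature map.
        self.offset_mask = nn.Conv2d(channels, 3 * k * k, k, padding=k // 2)
        self.dcn = DeformConv2d(channels, channels, k, padding=k // 2)
        self.act = nn.SiLU()

    def forward(self, x):
        om = self.offset_mask(x)
        offset, mask = om[:, : self.split], om[:, self.split:].sigmoid()
        return self.act(self.dcn(x, offset, mask))


if __name__ == "__main__":
    feat = torch.randn(1, 128, 40, 40)
    print(DeformableHeadConvSketch(128)(feat).shape)  # torch.Size([1, 128, 40, 40])
```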