Vehicle and Pedestrian Detection Method for Fused Infrared and Visible Images Based on MDA-YOLOv10

Abstract: To address the challenges of large variations in object scale, difficulty in detecting small objects, and strong background interference in complex traffic scenarios, this study proposes a vehicle and pedestrian detection algorithm for fused infrared and visible images based on MDA-YOLOv10. First, an MPDA module is designed to capture multi-scale information through attention heads with different dilation rates, enhancing the model's ability to extract multi-scale features. Second, the DASI module combines multi-level feature fusion with an adaptive selection mechanism that selectively retains target feature information, improving detection of small objects. Finally, the ATFL loss function is introduced to optimize the original loss, effectively filtering background noise so that the model focuses on target features against cluttered backgrounds. Experimental results show that MDA-YOLOv10 achieves mAP@0.5 of 84.4% and 81.6% on the M3FD and MSRS datasets, respectively, improvements of 3.8% and 4.8% over the original YOLOv10 algorithm, demonstrating strong detection performance in complex traffic scenarios.
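As context for the MPDA design described above: attention heads with different dilation rates capture multi-scale information because a dilated kernel of size k with dilation d spans d·(k−1)+1 input positions, so larger dilations see wider context at the same parameter cost. A minimal sketch of this relationship (the helper name is ours, not from the paper):

```python
def dilated_span(kernel_size: int, dilation: int) -> int:
    """Effective span of a dilated kernel over its input.

    A kernel of size k with dilation d covers d * (k - 1) + 1 positions,
    so heads with larger dilation rates attend to wider context while
    keeping the same number of sampled positions.
    """
    return dilation * (kernel_size - 1) + 1
```

For example, a size-3 kernel spans 3, 5, and 7 input positions at dilations 1, 2, and 3, which is why mixing dilation rates across heads yields a multi-scale receptive field.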
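The exact ATFL formulation is not given in this abstract; it builds on the focal-loss idea of down-weighting easy (mostly background) examples so that training concentrates on hard targets. A generic focal-loss sketch under that assumption (the function name and gamma value are illustrative, not the paper's):

```python
import math

def focal_loss(p_t: float, gamma: float = 2.0) -> float:
    """Focal loss for a single prediction.

    p_t is the predicted probability of the true class. The
    (1 - p_t)**gamma factor shrinks the cross-entropy of confident
    (easy) predictions, suppressing the abundant easy background
    samples that would otherwise dominate the loss.
    """
    return -((1.0 - p_t) ** gamma) * math.log(max(p_t, 1e-12))
```

For instance, an easy sample predicted at p_t = 0.9 contributes far less loss than a hard sample at p_t = 0.1, which is the filtering effect the abstract attributes to ATFL in cluttered scenes.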

