Multi-scale Anchor Construction Method for Object Detection

    Abstract: Object detection is an active research topic and a fundamental task in computer vision, and anchor-based object detection has been widely applied in many fields. Current anchor selection methods face two main problems: prior sizes that are fixed from a specific dataset, and weak generalization across different scenarios. The unsupervised K-means algorithm used to compute anchor boxes is sensitive to its initial values. On datasets whose objects are of similar size, the anchors it produces differ little from one another and therefore cannot fully reflect the multi-scale outputs of the network. To address these issues, this paper proposes a multi-scale anchor (MSA) construction method for object detection, which scales and stretches the anchors produced by clustering according to the characteristics of the dataset itself. The optimized anchors both retain the characteristics of the original dataset and reflect the multi-scale advantage of the model. In addition, the method is applied in the preprocessing stage of training and therefore does not increase model inference time. Finally, the mainstream single-stage algorithm You Only Look Once (YOLO) is selected, and extensive experiments are carried out on multiple infrared and industrial datasets covering different scenes. The results show that the MSA method can significantly improve detection accuracy in small-sample scenarios.
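    The two steps named in the abstract, clustering ground-truth box sizes into anchors and then scaling and stretching those anchors, can be illustrated with a minimal sketch. The abstract does not specify the MSA scaling rule, so `stretch_anchors` and its `low` and `high` factors are hypothetical placeholders rather than the paper's formula. The clustering step uses the common 1 - IoU distance on (width, height) pairs, which is likewise an assumption about the setup.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs using 1 - IoU as distance; return k anchors sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # the result depends on this initial choice
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchors[i])  # keep the anchor of an empty cluster
        if new_anchors == anchors:
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


def stretch_anchors(anchors, low=0.5, high=2.0):
    """Hypothetical stretch (not the paper's rule): scale the smallest anchor
    by `low`, the largest by `high`, and interpolate linearly in between."""
    n = len(anchors)
    stretched = []
    for i, (w, h) in enumerate(anchors):
        scale = (low + (high - low) * i / (n - 1)) if n > 1 else 1.0
        stretched.append((w * scale, h * scale))
    return stretched


if __name__ == "__main__":
    # Toy dataset whose objects are all of similar size.
    boxes = [(10, 12), (11, 13), (12, 12), (13, 15), (14, 14),
             (15, 16), (12, 14), (13, 13), (11, 12)]
    anchors = kmeans_anchors(boxes, k=3)
    print("clustered:", anchors)
    print("stretched:", stretch_anchors(anchors))
```

    On a dataset like the toy one above, the clustered anchors fall in a narrow size range, while the stretched anchors span a wider range and keep each anchor's aspect ratio. Both functions run once before training, which is consistent with the abstract's statement that inference time is unaffected.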

     
