Abstract:
To tackle the over-dependence of convolutional neural network-based target detection algorithms on pre-training weights, especially for target detection of infrared scenarios under data-sparse conditions, the incorporation of attention modules is proposed to alleviate the degradation of detection performance owing to the absence of pre-training. This paper is based on the YOLO v3 algorithm, which incorporates SE and CBAM modules in a network that mimics human attentional mechanisms to recalibrate the extracted features at the channel and spatial levels. Different weights are adaptively assigned to the features according to their importance, which ultimately improves the detection accuracy. On the constructed infrared vehicle target dataset, the attention module significantly improved the detection accuracy of the non-pre-trained convolutional neural network. Furthermore, the detection accuracy of the network incorporating the CBAM module was 86.3 mAP, demonstrating that the attention module can improve the feature extraction ability of the network and free the network from over-reliance on the pretrained weights.