CHEN Gao, WANG Weihua, LIN Dandan. Infrared Vehicle Target Detection Based on Convolutional Neural Network without Pre-training[J]. Infrared Technology, 2021, 43(4): 342-348.

Infrared Vehicle Target Detection Based on Convolutional Neural Network without Pre-training

  • Received Date: July 16, 2020
  • Revised Date: July 29, 2020
  • Target detection algorithms based on convolutional neural networks depend heavily on pre-trained weights, which is a particular problem for infrared scenes, where data are scarce. To alleviate the loss of detection performance caused by the absence of pre-training, this paper proposes incorporating attention modules into the network. Building on the YOLO v3 algorithm, SE and CBAM modules, which mimic human attention mechanisms, are incorporated into the network to recalibrate the extracted features at the channel and spatial levels. Features are adaptively assigned different weights according to their importance, which ultimately improves detection accuracy. On the constructed infrared vehicle target dataset, the attention modules significantly improved the detection accuracy of the convolutional neural network trained without pre-training. The network incorporating the CBAM module reached 86.3 mAP, demonstrating that attention modules can improve the feature extraction ability of the network and free it from over-reliance on pre-trained weights.
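
The channel-level recalibration described in the abstract can be illustrated with a minimal sketch of an SE-style block: global average pooling squeezes each channel to one value, two small fully connected layers turn those values into per-channel weights between 0 and 1, and each channel is rescaled by its weight. This is not the authors' implementation. The function names, the toy feature map, and the weight matrices `w_reduce` and `w_expand` are hypothetical, biases are omitted for brevity, and in a real network the weights would be learned. The spatial branch of CBAM is not shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def se_recalibrate(features, w_reduce, w_expand):
    """SE-style channel attention (simplified, no biases).

    features: list of C channels, each an H x W nested list.
    w_reduce: (C/r x C) bottleneck weights (hypothetical values).
    w_expand: (C x C/r) expansion weights (hypothetical values).
    Returns the per-channel weights and the rescaled feature map.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excitation: bottleneck layer + ReLU, then expansion layer + sigmoid.
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]
    weights = [sigmoid(z) for z in matvec(w_expand, hidden)]
    # Scale: multiply every value in a channel by that channel's weight.
    scaled = [
        [[value * weight for value in row] for row in channel]
        for channel, weight in zip(features, weights)
    ]
    return weights, scaled


# Toy input: 2 channels of size 2 x 2, reduction ratio r = 2.
features = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0
    [[4.0, 4.0], [4.0, 4.0]],  # channel 1
]
w_reduce = [[0.5, 0.5]]     # shape 1 x 2
w_expand = [[-1.0], [1.0]]  # shape 2 x 1
weights, scaled = se_recalibrate(features, w_reduce, w_expand)
```

With these made-up weights, channel 0 is suppressed and channel 1 is largely preserved, which is the "different weights according to importance" behavior the abstract refers to.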