YE Baicheng, ZHU Youpan, ZHOU Yongkang, DUAN Chenhao, ZHANG Yudong, TAO Zhigang, FU Zhiyu. Review of Lightweight Target Detection Algorithms[J]. Infrared Technology, 2025, 47(3): 289-298.

Review of Lightweight Target Detection Algorithms

  • Received Date: December 05, 2023
  • Revised Date: January 23, 2024
  • Abstract: Conventional deep-learning-based target detection algorithms usually require extensive computing resources and lengthy training, which does not meet industry needs. Lightweight target detection networks trade some detection accuracy for faster inference and smaller models; they are well suited to edge-computing devices and have received widespread attention. This review introduces the lightweight techniques commonly used to compress and accelerate models, classifies and analyzes the structural principles of lightweight backbone networks, and evaluates their practical impact on YOLOv5s. Finally, it discusses the prospects and challenges of lightweight target detection algorithms.

