GAO Yongqi, YUAN Zhixiang. Improved YOLOv5-based Underwater Infrared Garbage Detection Algorithm[J]. Infrared Technology, 2024, 46(9): 994-1005.

Improved YOLOv5-based Underwater Infrared Garbage Detection Algorithm

More Information
  • Received Date: September 26, 2023
  • Revised Date: December 12, 2023
  • Abstract: An improved object detection method, EFDCD-YOLO (YOLOv5 with EffectiveSE, Focal-EIOU, DCNv2, CARAFE, and DyHead), is proposed to address the challenges of underwater waste infrared target detection, such as blurred boundary details, low image quality, and various irregular or damaged coverings. The InceptionNeXt network is selected as the backbone to enhance the model's expressive power and feature-extraction capability. The EffectiveSE attention mechanism is introduced into the feature-fusion layer to adaptively learn the importance of feature channels and weight them selectively. Deformable convolutions replace the C3 module of the original model, enabling it to better perceive target shapes and details. The CARAFE operator replaces the upsampling module, enhancing the representation of fine-grained features and avoiding information loss. For the loss function, Focal-EIOU is adopted to improve the accuracy of target localization and bounding-box regression. Finally, DyHead replaces the YOLOv5 detection head, improving accuracy through dynamic receptive-field mechanisms and multiscale feature fusion. Applied to underwater waste infrared target detection and compared with the original YOLOv5, the improved EFDCD-YOLO model achieves a 21.4% improvement in precision (P), a 9.7% improvement in recall (R), and a 13.6% improvement in mean average precision (mAP). The experimental results demonstrate that EFDCD-YOLO effectively enhances detection performance in underwater waste infrared scenarios and meets the requirements of underwater infrared target detection.
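The Focal-EIOU loss mentioned in the abstract extends IoU loss with penalties on center distance, width, and height (each normalized by the smallest enclosing box), then scales the result by IoU^γ so that higher-quality boxes receive more gradient. The sketch below is a minimal pure-Python illustration of the published formula, not the authors' implementation; the function name and the γ default are assumptions.

```python
def focal_eiou_loss(box_a, box_b, gamma=0.5):
    """Focal-EIoU loss for two boxes in (x1, y1, x2, y2) format.

    EIoU = 1 - IoU + center-distance penalty + width and height penalties,
    each normalized by the smallest enclosing box. Focal-EIoU then scales
    EIoU by IoU**gamma to focus regression on high-quality anchors.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection and union for the plain IoU term
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest enclosing box: its squared diagonal normalizes distances
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw * cw + ch * ch + 1e-9

    # Squared distance between the two box centers
    dx = (ax1 + ax2) / 2 - (bx1 + bx2) / 2
    dy = (ay1 + ay2) / 2 - (by1 + by2) / 2
    rho2 = dx * dx + dy * dy

    # Width/height differences, normalized by the enclosing box sides
    dw = (ax2 - ax1) - (bx2 - bx1)
    dh = (ay2 - ay1) - (by2 - by1)
    eiou = (1 - iou
            + rho2 / c2
            + dw * dw / (cw * cw + 1e-9)
            + dh * dh / (ch * ch + 1e-9))

    return (iou ** gamma) * eiou
```

For identical boxes every penalty vanishes and the loss is zero; as the predicted box drifts from the ground truth, the loss grows with both the overlap deficit and the geometric mismatch.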

  • [1]
    Schechner Y Y, Narasimhan S G, Nayar S K. Instant dehazing of images using polarization[C]//Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 2001, 1: I-I.
    [2]
    Bazeille S, Quidu I, Jaulin L. Identification of underwater man-made object using a colour criterion[J]. Proceedings of the Institute of Acoustics, 2007, 29(6): 25-52.
    [3]
    LI C Y, GUO J C, CONG R M, et al. Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior[J]. IEEE Transactions on Image Processing, 2016, 25(12): 5664-5677. DOI: 10.1109/TIP.2016.2612882
    [4]
    Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
    [5]
    Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7263-7271.
    [6]
    Redmon J, Farhadi A. YOLOv3: an incremental improvement[J]. arXiv preprint arXiv: 1804.02767, 2018.
    [7]
    Bochkovskiy A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[J]. arXiv preprint arXiv: 2004.10934, 2020.
    [8]
    LIU W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector[C]//Computer Vision–ECCV, 2016: 21-37.
    [9]
    CHEN Xinlin. Underwater Garbage Detection Based on Deep Learning[D]. Guiyang: Guizhou Normal University, 2022.
    [10]
    YUAN Hongchun, ZANG Tianqi. Underwater garbage target detection based on attention mechanism and Ghost-YOLOv5[J]. Environmental Engineering, 2023, 41(7): 214-221. DOI: 10.13205/j.hjgc.202307029
    [11]
    JIANG H, Learned-Miller E. Face detection with the faster R-CNN[C]//12th IEEE International Conference on Automatic Face & Gesture Recognition, 2017: 650-657.
    [12]
    CAI Z, Vasconcelos N. Cascade R-CNN: High quality object detection and instance segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 43(5): 1483-1498.
    [13]
    ZHOU X, WANG D, Krähenbühl P. Objects as points[J]. arXiv preprint arXiv: 1904.07850, 2019.
    [14]
    LYU Xiaoqian. Research and Implementation of Underwater Target Detection Method Based on Faster R-CNN[D]. Harbin: Harbin Institute of Technology, 2019.
    [15]
    WANG Rongrong, JIANG Zhongyun. Underwater target detection algorithm based on improved CenterNet[J]. Laser & Optoelectronics Progress, 2023, 60(2): 239-248.
    [16]
    YU W, ZHOU P, YAN S, et al. InceptionNeXt: when Inception meets ConvNeXt[J]. arXiv preprint arXiv: 2303.16900, 2023.
    [17]
    Lee Y, Park J. CenterMask: real-time anchor-free instance segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 13906-13915.
    [18]
    ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IOU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506: 146-157. DOI: 10.1016/j.neucom.2022.07.042
    [19]
    ZHU X, HU H, LIN S, et al. Deformable ConvNets v2: more deformable, better results[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 9308-9316.
    [20]
    WANG J, CHEN K, XU R, et al. CARAFE: content-aware reassembly of features[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 3007-3016.
    [21]
    DAI X, CHEN Y, XIAO B, et al. Dynamic head: Unifying object detection heads with attentions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 7373-7382.
    [22]
    Bochkovskiy A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[J]. arXiv preprint arXiv: 2004.10934, 2020.
    [23]
    Fulton M, HONG J, Islam M J, et al. Robotic detection of marine litter using deep visual detection models[C]//International Conference on Robotics and Automation (ICRA). IEEE, 2019: 5752-5758.
    [24]
    HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 13713-13722.
    [25]
    LIU Y, SHAO Z, Hoffmann N. Global attention mechanism: retain information to enhance channel-spatial interactions[J]. arXiv preprint arXiv: 2112.05561, 2021.
    [26]
    ZHU L, WANG X, KE Z, et al. BiFormer: vision transformer with bi-level routing attention[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 10323-10333.
    [27]
    LI X, HU X, YANG J. Spatial group-wise enhance: Improving semantic feature learning in convolutional networks[J]. arXiv preprint arXiv: 1905.09646, 2019.
    [28]
    ZHENG Z, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12993-13000.
    [29]
    Gevorgyan Z. SIoU loss: More powerful learning for bounding box regression[J]. arXiv preprint arXiv: 2205.12740, 2022.
    [30]
    TONG Z, CHEN Y, XU Z, et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism[J]. arXiv preprint arXiv: 2301.10051, 2023.
