Abstract:
An improved object detection method, EFDCD-YOLO (YOLOv5 enhanced with EffectiveSE, Focal-EIOU, DCNv2, CARAFE, and DyHead), is proposed to address the challenges of underwater waste infrared target detection, such as blurred boundary details, low image quality, and the presence of various irregular or damaged coverings. The InceptionNeXt network is selected as the backbone to strengthen the model's expressive power and feature extraction capability. The EffectiveSE attention mechanism is introduced into the feature fusion layer to adaptively learn the importance of feature channels and weight them selectively. Deformable convolutions (DCNv2) replace the C3 module of the original model, enabling the network to better perceive target shapes and details. The CARAFE operator replaces the original upsampling module, enhancing the representation of fine-grained features and avoiding information loss. For the loss function, Focal-EIOU is adopted to improve the accuracy of target localization and bounding box regression. Finally, DyHead replaces the YOLOv5 detection head, improving accuracy through dynamic receptive field mechanisms and multiscale feature fusion. Applied to underwater waste infrared target detection and compared with the baseline YOLOv5, the improved EFDCD-YOLO model achieves a 21.4% improvement in precision (P), a 9.7% improvement in recall (R), and a 13.6% improvement in mean average precision (mAP). The experimental results demonstrate that EFDCD-YOLO effectively enhances detection performance in underwater waste infrared scenarios and meets the requirements of underwater infrared target detection.
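As an illustration of the channel-weighting idea described above, the following is a minimal PyTorch sketch of an EffectiveSE (eSE) block, assuming the standard formulation (global average pooling, a single 1x1 convolution without channel reduction, and a hard-sigmoid gate); it is not necessarily the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EffectiveSE(nn.Module):
    """Effective Squeeze-and-Excitation (eSE) channel attention.

    Unlike the original SE block, eSE uses a single 1x1 convolution
    with no channel reduction, so no channel information is discarded."""

    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squeeze: global average pooling -> (N, C, 1, 1) channel descriptor
        w = F.adaptive_avg_pool2d(x, 1)
        # Excite: single 1x1 conv followed by a hard-sigmoid gate
        w = F.hardsigmoid(self.fc(w))
        # Re-weight each feature channel of the input
        return x * w


# Usage example on a hypothetical neck (feature fusion) tensor
if __name__ == "__main__":
    feat = torch.randn(1, 256, 40, 40)
    print(EffectiveSE(256)(feat).shape)  # torch.Size([1, 256, 40, 40])
```

In a YOLOv5-style architecture, such a block would typically be placed after the feature fusion (neck) layers so that each fused channel is re-weighted before being passed to the detection head.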