SHEN Lingyun, LANG Baihe, SONG Zhengxun, WEN Zhitao. Remote Sensing Image Target Detection Method Based on CSE-YOLOv5[J]. Infrared Technology, 2023, 45(11): 1187-1197.

Remote Sensing Image Target Detection Method Based on CSE-YOLOv5

More Information
  • Received Date: June 06, 2023
  • Revised Date: August 06, 2023
  • We proposed an object detection method based on the CSE-YOLOv5 (CBAM-SPPF-EIoU-YOLOv5) model to address two weaknesses of remote sensing image object detection algorithms in complex task scenarios: insufficient multi-scale feature learning and the difficulty of balancing detection accuracy against the number of model parameters. The method builds on the backbone network of the YOLOv5 model and introduces a convolutional attention mechanism layer into the shallow layers, which improves the extraction of fine-grained features and suppresses interference from redundant information. In the deep layers, we constructed a spatial pyramid pooling fast (SPPF) module with a tandem structure and improved its statistical pooling method so that key multi-scale feature information is fused from shallow to deep. We further strengthened multi-scale feature learning by optimizing the anchor box mechanism and improving the loss function. Experiments on the publicly available RSOD, DIOR, and DOTA datasets showed that the CSE-YOLOv5 series models reached a mean average precision at an IoU threshold of 0.5 (mAP@0.5) of 96.8%, 92.0%, and 71.0%, respectively. Over the wider range of IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95), the models achieved 87.0%, 78.5%, and 61.9% on the same datasets. The inference speed of the model satisfied real-time requirements. Compared with the YOLOv5 series models, the CSE-YOLOv5 models showed clear improvements on these metrics and outperformed other mainstream object detection models.
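The "EIoU" in the model name refers to the loss function used for bounding box regression. The abstract does not give the exact formulation the authors used, so the following is only a minimal sketch of the commonly cited EIoU definition: one minus IoU, plus penalties for center distance, width difference, and height difference, each normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration.

```python
def eiou_loss(box, gt, eps=1e-9):
    """Illustrative EIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Sketch of the standard EIoU definition, not the paper's own code.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between box centers, normalized by the
    # squared diagonal of the enclosing box.
    dx = (x1 + x2) / 2 - (gx1 + gx2) / 2
    dy = (y1 + y2) / 2 - (gy1 + gy2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # Two 2x2 boxes offset by one unit along each axis.
    print(eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Identical boxes give a loss close to zero, and the three penalty terms keep the loss informative even when the boxes do not overlap, which is the usual motivation for preferring this family of losses over plain IoU.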