Abstract:
To address the large variation in target scale and the complex backgrounds encountered in optical remote sensing image target detection, this paper presents a detection algorithm based on YOLOv7 that optimizes the feature extraction and feature fusion processes. First, a feature extraction module combining a CNN and a Transformer is introduced to better capture the global information of the image. Second, a bidirectional fusion structure is designed to enhance the fusion of shallow and deep features. Experimental results show that the proposed method achieves mAP@0.5 of 96.6% and 97.6% on the NWPU VHR-10 and RSOD datasets, respectively, surpassing the YOLOv7 algorithm by 3.2% and 4.2%, respectively, and thus effectively improving the accuracy of remote sensing image target detection.