Abstract:
To address the challenges of limited computational resources and multi-scale small-object detection in drone aerial imagery, a lightweight detection model, YOLOv10n-CIG, is proposed based on the YOLOv10n architecture. First, the C2f-CW (C2f with Convolutional Wise) module is designed to replace the conventional C2f module. By combining Partial Convolution (PConv) and Pointwise Convolution (PWConv), the new module reduces computational cost, accelerates inference, and enhances multi-scale feature fusion. Second, the last downsampling layer in the Backbone is removed, and the Spatial Pyramid Pooling Fast (SPPF) module is replaced with SPPF with Involution Parallel Structure (SPPF-IP), which retains the fine-grained spatial information of small targets and further improves multi-scale feature fusion. Finally, a lightweight detection head, GHead (GConv Head), based on Group Convolution (GConv), is introduced. By tuning the parameters of the group convolution, a balance among detection accuracy, model size, and inference speed is achieved. The experimental results indicate that, compared with the original YOLOv10n model, YOLOv10n-CIG improves mAP50 by 5.3%, reduces the model size by 1.12 MB, and increases the inference speed by 59 FPS on Ubuntu and 9 FPS on Jetson. Compared with current mainstream algorithms, YOLOv10n-CIG exhibits superior overall performance across various metrics.
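As a rough illustration of why PConv, PWConv, and GConv reduce model size, the following sketch counts convolution weights (bias omitted) for a single layer. It is not the C2f-CW or GHead implementation: the channel width of 64, the partial ratio of 4, the group count of 4, and the helper names `conv_params` and `pconv_pwconv_params` are all assumptions chosen for the example.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a 2-D convolution without bias.

    With `groups` > 1, each output channel only sees c_in / groups
    input channels, so the weight count shrinks by that factor.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def pconv_pwconv_params(channels: int, k: int = 3, ratio: int = 4) -> int:
    """Weights of a k x k PConv followed by a 1 x 1 PWConv.

    Assumption: PConv convolves only channels // ratio channels and
    passes the rest through unchanged; PWConv then mixes all channels.
    """
    c_part = channels // ratio
    partial = conv_params(c_part, c_part, k)    # k x k conv on a channel subset
    pointwise = conv_params(channels, channels, 1)  # 1 x 1 conv on all channels
    return partial + pointwise


if __name__ == "__main__":
    c = 64  # hypothetical channel width, not taken from the paper
    print("standard 3x3 conv:", conv_params(c, c, 3))
    print("group 3x3 conv (g=4):", conv_params(c, c, 3, groups=4))
    print("PConv + PWConv:", pconv_pwconv_params(c))
```

Under these assumed settings, the group convolution uses one quarter of the weights of the standard 3x3 convolution, and the PConv plus PWConv pair uses fewer still. The actual savings in YOLOv10n-CIG depend on the channel widths and group settings chosen in the paper.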