Abstract:
To address the difficulty of detecting small targets against complex backgrounds and the resulting low detection accuracy on UAV aerial imagery, an improved YOLOv8n-based algorithm for aerial image detection, named Focused Small Target Detection-YOLO (FSTD-YOLO), is proposed. A novel Multi-Receptive Field Feature Extraction Module (MRFF) is presented to strengthen feature learning in complex backgrounds and to mitigate issues such as the loss of small-target information and insufficient feature learning. Micro-target detection heads are added and inefficient large-target detection heads are removed, so that the model focuses on small targets and devotes less capacity to targets of irrelevant sizes. Additionally, a multi-attention mechanism module is introduced to improve the recognition of multi-scale targets. In the experiments, the Inner-MPDIoU loss function replaces the CIoU loss function, providing a more precise measure of the difference between predicted and ground-truth bounding boxes and facilitating faster convergence for low-IoU samples in aerial images. Experimental results demonstrate that, compared to YOLOv8n, FSTD-YOLO achieves improvements of 9.1% in mAP@0.5 and 6.1% in mAP@0.5:0.95 on the VisDrone dataset. Furthermore, the number of parameters is reduced by 13.0% and the weight file size is reduced by 11.1%, making the model more suitable for deployment on mobile devices.
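The abstract does not state the form of the Inner-MPDIoU loss, so the following is only a minimal sketch of one plausible formulation. It assumes the commonly published definitions of MPDIoU (IoU penalized by the squared distances between matching box corners, normalized by the squared image diagonal) and of Inner-IoU (IoU computed on auxiliary boxes scaled about each box center), combined as L_MPDIoU + IoU - IoU_inner. The function names, the `(x1, y1, x2, y2)` box layout, and the default `ratio=0.75` are illustrative assumptions, not values taken from the paper.

```python
def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Auxiliary (inner) box: same center, width and height scaled by ratio."""
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    half_w = (box[2] - box[0]) * ratio / 2.0
    half_h = (box[3] - box[1]) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def inner_mpdiou_loss(pred, gt, img_w, img_h, ratio=0.75):
    """Assumed form: L_MPDIoU + IoU - IoU_inner (ratio is a hypothetical default)."""
    # Squared distances between top-left and bottom-right corners.
    d1 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2
    d2 = (pred[2] - gt[2]) ** 2 + (pred[3] - gt[3]) ** 2
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal

    full_iou = iou(pred, gt)
    inner_iou = iou(scale_box(pred, ratio), scale_box(gt, ratio))

    mpdiou = full_iou - d1 / norm - d2 / norm
    loss_mpdiou = 1.0 - mpdiou
    return loss_mpdiou + full_iou - inner_iou
```

Under these assumptions the loss is zero for a perfect match and still carries a corner-distance gradient signal when the boxes do not overlap at all, which is the situation the abstract refers to as low-IoU samples.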