Abstract:
Infrared images captured by drones contain limited texture information and weak edge features, which makes it difficult for lightweight networks to detect small-scale targets accurately. Starting from the YOLOv8n network, the last layer of the original network was first trimmed to reduce the number of parameters and to mitigate the loss of small-target detail features caused by deep convolutional layers. Next, a feature-extraction module, C2f-EMA, was constructed on the basis of efficient multi-scale attention (EMA) with cross-spatial learning; through channel reshaping and dimensional grouping, it preserves and highlights small-target features as far as possible while suppressing interference from the background. Finally, the WIoU loss function was introduced in place of the original CIoU loss function to dynamically adjust the weight assigned to each target, further improving detection performance on small targets. Experimental results on the HIT-UAV dataset show that the resulting network, UIDNet, has a smaller model size and better detection performance than other advanced networks such as YOLOv8n and PiCoDet. Relative to the original YOLOv8n model, the average detection accuracy of UIDNet increased by 1.7%, the number of parameters was reduced by 67.4%, and the model size was compressed by 63.5% to only 2.3 MB.