FVIT-YOLO v8: Improved YOLO v8 Small Object Detection Based on Multi-scale Fusion Attention Mechanism
Abstract
This study addresses small-target detection in remote sensing and drone aerial images. Targets in these images are small and densely distributed against complex backgrounds, which makes feature extraction difficult. In pursuit of higher accuracy, most current small-target detection algorithms ignore the impact of parameter count and inference speed, which limits their practicality. To address these problems, this study proposes an improved YOLO v8 small-target detection algorithm based on a lightweight multiscale fusion attention mechanism. The algorithm first adds the F operator to the FPN structure of YOLO v8 and designs a weighted fusion of multiscale features. It then removes the P4 and P5 prediction layers from the network's prediction layer and adds a P2 layer for small-target prediction. Finally, it improves a lightweight attention mechanism by integrating grid segmentation of the input image and uses it to replace the C2f module in the improved FPN, which gives the algorithm better global perception ability and greatly reduces the parameter count. Compared with YOLO v8s, the proposed algorithm increases mAP on the DOTA dataset by 4.4%, reduces the number of network parameters by 52%, and reaches 46 FPS. On the VisDrone dataset, it improves accuracy by 8.3%.
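The weighted fusion of multiscale features can be illustrated with a minimal sketch. The abstract does not state the exact weighting rule, so the example below assumes a normalized non-negative weighting of the kind popularized by BiFPN; the function name `weighted_fusion`, the `eps` value, and the toy feature maps are hypothetical and are not taken from the proposed network.

```python
# Hypothetical sketch: normalized weighted fusion of same-sized feature maps.
# The weighting rule (clamp to non-negative, then normalize) is an assumption
# borrowed from BiFPN-style fusion, not the paper's confirmed formulation.

def weighted_fusion(features, weights, eps=1e-4):
    """Fuse equally sized 2-D feature maps using normalized non-negative weights."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")

    # Clamp each weight at zero, then normalize so the weights sum to about 1.
    clamped = [max(0.0, w) for w in weights]
    total = sum(clamped) + eps
    norm = [w / total for w in clamped]

    rows, cols = len(features[0]), len(features[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for fmap, w in zip(features, norm):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += w * fmap[r][c]
    return fused


# Two toy 2x2 maps standing in for feature levels already resized to one scale.
p2 = [[1.0, 2.0], [3.0, 4.0]]
p3 = [[3.0, 2.0], [1.0, 0.0]]
fused = weighted_fusion([p2, p3], weights=[1.0, 1.0])
```

In a trained network the weights would be learnable parameters, so the model itself decides how much each scale contributes to the fused feature; here they are fixed only to keep the sketch self-contained.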