TANG Yifan, HU Xuran, LUO Xi, HUANG Juanjuan, DAI Chaolan, LI Lanting. Receptive Field Fusion and Cross-Scale Global Modeling for Infrared and Visible Small Object Detection[J]. Infrared Technology.

Receptive Field Fusion and Cross-Scale Global Modeling for Infrared and Visible Small Object Detection

Small object detection in UAV-based visible and infrared imagery remains challenging because of scale variation, weak thermal signals, and complex background interference. This paper proposes a dual-modality detection model, built upon the YOLOv11 architecture, that integrates receptive field enhancement with global cross-scale semantic fusion. A reparameterized receptive field attention convolution (RFAConv) module expands shallow-layer receptive fields via a dual-branch structure to improve spatial sensitivity and modality adaptability. A Transformer-guided global fusion mechanism aligns multi-scale semantics non-locally, and a mixed local channel attention module sharpens focus on small-object regions while suppressing noise. Experiments on the VisDrone2021 and HIT-UAV datasets show that the proposed method achieves higher accuracy, structural efficiency, and robustness than existing lightweight and Transformer-based detectors.