Receptive Field Fusion and Cross-Scale Global Modeling for Infrared and Visible Small Object Detection

TANG Yifan; HU Xuran; LUO Xi; HUANG Juanjuan; DAI Chaolan; LI Lanting

TANG Yifan, HU Xuran, LUO Xi, HUANG Juanjuan, DAI Chaolan, LI Lanting. Receptive Field Fusion and Cross-Scale Global Modeling for Infrared and Visible Small Object Detection[J]. Infrared Technology.

Abstract

Small object detection in UAV-based visible and infrared imagery remains challenging because of scale variation, weak thermal signals, and complex background interference. This paper proposes a dual-modality detection model, built on the YOLOv11 architecture, that integrates receptive field enhancement with global cross-scale semantic fusion. A reparameterized receptive field attention convolution (RFAConv) module expands shallow-layer receptive fields through a dual-branch structure, improving spatial sensitivity and modality adaptability. A Transformer-guided global fusion mechanism aligns multi-scale semantics non-locally, and a mixed local channel attention module sharpens focus on small-object regions while suppressing noise. Experiments on the VisDrone2021 and HIT-UAV datasets show that the proposed method achieves superior accuracy, structural efficiency, and robustness compared with existing lightweight and Transformer-based detectors.
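To make the notion of expanding a shallow-layer receptive field concrete, the sketch below applies the standard receptive-field recurrence for stacked convolutions: each layer adds (kernel - 1) × dilation × accumulated stride input pixels. It is only an illustration. The function name `receptive_field` and the three layer stacks are hypothetical and are not the RFAConv configuration, which the abstract does not specify.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of a stack of convolution layers.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    rf = 1    # a single input pixel covers only itself
    jump = 1  # distance, in input pixels, between adjacent outputs so far
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical shallow stems (assumed for illustration, not from the paper).
plain = [(3, 2, 1), (3, 1, 1)]    # two ordinary 3x3 convolutions
wide = [(3, 2, 1), (5, 1, 1)]     # second layer uses a 5x5 kernel
dilated = [(3, 2, 1), (3, 1, 2)]  # second layer uses dilation 2

for name, stack in [("plain", plain), ("wide", wide), ("dilated", dilated)]:
    print(name, receptive_field(stack))
```

Under these assumed settings, the wider and the dilated second layer both cover more input pixels than the plain stack, which is the general effect a receptive-field-expanding branch aims for in shallow layers.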
