Abstract:
Infrared images suffer from low contrast, low signal-to-noise ratio, and low resolution. To address these challenges, this study proposes an infrared object detection network that combines traditional image processing with deep learning for feature enhancement and fusion. The approach has four main steps. 1) Preprocessing: image filtering, sharpening, and equalization are applied to highlight object features in the infrared image and enrich the input information. 2) Feature extraction: a multi-level information aggregation structure is designed to extract and integrate the spatial and semantic information of objects across both single-dimension and multi-dimension features. 3) Attention mechanism: a hybrid attention mechanism is introduced to increase the weight of key features in the extraction structure; it captures global context information in multiple ways, enhancing both spatial and channel information. 4) Feature fusion: an adaptive weighting method fuses features from adjacent dimensions for accurate and efficient detection of infrared objects. Experimental results on the KAIST, FLIR, and RGBT datasets show that the proposed method significantly improves infrared object detection performance over existing neural network-based methods. The method also adapts better to complex scenes than similar algorithms.
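As an illustration of the preprocessing step (step 1), the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single-channel infrared frame with intensities in [0, 255], and it makes several choices the abstract does not specify: a 3x3 mean filter for filtering, unsharp masking for sharpening, global histogram equalization for equalization, and stacking the three results as input channels as one possible way to "enrich the input information". All function names are hypothetical.

```python
import numpy as np


def box_filter(img, k=3):
    """Mean filter with edge padding (assumed choice of 'filtering')."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    # Sum the k*k shifted views of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def unsharp_mask(img, amount=1.0):
    """Sharpen by adding back the detail removed by the mean filter."""
    img = img.astype(np.float64)
    blurred = box_filter(img)
    return np.clip(img + amount * (img - blurred), 0.0, 255.0)


def equalize_hist(img):
    """Global histogram equalization for intensities in [0, 255]."""
    levels = np.clip(img, 0, 255).astype(np.uint8)
    hist = np.bincount(levels.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:
        # Constant image: nothing to stretch.
        return levels.astype(np.float64)
    # Look-up table mapping each gray level to its equalized value.
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255)
    return lut[levels]


def preprocess(ir):
    """Return a (3, H, W) array in [0, 1]: smoothed, sharpened, equalized.

    Stacking the variants as channels is an assumption of this sketch.
    """
    ir = ir.astype(np.float64)
    channels = [box_filter(ir), unsharp_mask(ir), equalize_hist(ir)]
    return np.stack(channels, axis=0) / 255.0
```

For example, calling `preprocess` on a low-contrast frame whose values lie in a narrow band yields a three-channel array whose equalized channel spans the full [0, 1] range, which is the kind of contrast enhancement the preprocessing step aims for before the frame reaches the detection network.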