Citation: LI Minglu, WANG Xiaoxia, HOU Maoxin, YANG Fengbao. An Object Detection Algorithm Based on Infrared-Visible Feature Enhancement and Fusion[J]. Infrared Technology, 2025, 47(3): 385-394.
To address the challenges of infrared-visible bimodal object detection in complex dynamic environments, a dual-branch feature enhancement and fusion backbone network (DBEF-Net) is proposed. DBEF-Net targets two problems that lead to missed and false detections: insufficient expression of object features, and the failure of bimodal fusion to fully exploit the complementary information in infrared and visible features. To address the model's insufficient attention to infrared and visible features, a feature interaction enhancement module is designed to focus on and enhance the useful information in the bimodal features. A transformer-based bimodal fusion network is then adopted, in which a cross-attention mechanism achieves deep fusion between the two modalities so that their complementary features are used more effectively. Experimental results on the SYUGV dataset show that the proposed method achieves higher average detection accuracy than existing bimodal object detection algorithms while meeting the processing-speed requirement for real-time detection.
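The abstract does not give the internals of the fusion network, so the following is only a minimal sketch of generic scaled dot-product cross-attention between two modalities, not the DBEF-Net implementation. All names (`cross_attention`, `fuse`, `random_matrix`, the `params` layout), the random projection weights, the token counts, and the residual addition are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context, w_q, w_k, w_v):
    """Attend from `queries` (one modality) to `context` (the other modality).

    queries: n_q tokens of dimension d; context: n_c tokens of dimension d.
    Returns the attended tokens (n_q rows) and the attention weights (n_q x n_c).
    """
    q = matmul(queries, w_q)   # queries come from the first modality
    k = matmul(context, w_k)   # keys and values come from the other modality
    v = matmul(context, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def fuse(ir_tokens, vis_tokens, params):
    """Assumed fusion rule: each modality queries the other, added residually."""
    ir_from_vis, _ = cross_attention(ir_tokens, vis_tokens, *params["ir"])
    vis_from_ir, _ = cross_attention(vis_tokens, ir_tokens, *params["vis"])
    fused_ir = [[a + b for a, b in zip(r1, r2)]
                for r1, r2 in zip(ir_tokens, ir_from_vis)]
    fused_vis = [[a + b for a, b in zip(r1, r2)]
                 for r1, r2 in zip(vis_tokens, vis_from_ir)]
    return fused_ir, fused_vis


# Toy usage: 3 infrared tokens and 5 visible tokens, each of dimension 4.
rng = random.Random(0)
d = 4
ir_tokens = random_matrix(3, d, rng)
vis_tokens = random_matrix(5, d, rng)
params = {
    "ir": [random_matrix(d, d, rng) for _ in range(3)],   # w_q, w_k, w_v
    "vis": [random_matrix(d, d, rng) for _ in range(3)],
}
fused_ir, fused_vis = fuse(ir_tokens, vis_tokens, params)
```

In this sketch each branch keeps its own token count, because the queries set the output length while the other modality supplies only keys and values. A real detector would typically use learned multi-head projections over flattened feature maps rather than the random single-head weights assumed here.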