Citation: WU Jintao, WANG Anzhi, REN Chunhong. RGB-T Salient Object Detection: A Survey[J]. Infrared Technology, 2025, 47(1): 1-9.
In addition to RGB images, thermal IR images can be used to extract salient information, which is crucial for salient object detection. With the development and popularization of IR sensing equipment, thermal IR images have become readily available, and RGB-T salient object detection has become a popular research topic. However, there is currently a lack of comprehensive surveys on the existing methods. First, we briefly introduce machine learning-based RGB-T salient object detection methods and then focus on two types of deep learning methods based on CNNs and vision transformers. Subsequently, relevant datasets and evaluation metrics are introduced, and both qualitative and quantitative comparative analyses are conducted on representative methods using these datasets. Finally, challenges and future development directions for RGB-T salient object detection are summarized and discussed.