Citation: MIN Feng, LIU Biao, KUANG Yonggang, MAO Yixin, LIU Yuhui. Spatially Adaptive and Content-Aware Infrared Small Target Detection[J]. Infrared Technology, 2024, 46(7): 735-742.
Infrared street-scene images have low pixel counts and limited color features, so missed detections, false detections, and poor detection performance are common. To address these problems, a spatially adaptive and content-aware infrared small object detection algorithm is proposed. Its key components are as follows. 1) Spatially adaptive transformer: local attention and deformable attention mechanisms are stacked to strengthen the modeling of long-range dependency features and to capture more spatial positional information. 2) Content-aware reassembly of features (CARAFE) operator: this operator performs feature upsampling by aggregating contextual information within a large receptive field and adaptively recombining features using shallow-level information. 3) High-resolution prediction head: a prediction head of size 160×160 is added to map the pixels of the input features to finer detection regions, further improving the detection of small objects. Experimental results on the FLIR dataset show that the proposed algorithm achieves a mean average precision (mAP) of 85.6%, which is 3.9% higher than that of the YOLOX-s algorithm. These results support the advantage of the proposed algorithm in detecting small objects in infrared images.