
Vehicle Infrared Target Detection Based on YOLOX and Swin Transformer

LOU Zhehang, LUO Suyun

Citation: LOU Zhehang, LUO Suyun. Vehicle Infrared Target Detection Based on YOLOX and Swin Transformer[J]. Infrared Technology, 2022, 44(11): 1167-1175.


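Details
    About the author:

    LOU Zhehang (1999-), male, master's student, mainly engaged in research on environment perception for driverless vehicles. E-mail: 15968194691@163.com

    Corresponding author:

    LUO Suyun (1975-), female, associate professor, mainly engaged in research on environment perception and control for driverless cars. E-mail: lsyluo@163.com

  • CLC number: TP391.4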


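  • Abstract: Infrared images suffer from heavy noise and poor contrast, which tends to reduce target detection accuracy. This paper combines YOLOX and Swin Transformer to propose an improved YOLOX model. The improved model replaces the CSPDarknet backbone feature extraction network in YOLOX with Swin Transformer, and reduces the number of activation functions and normalization layers in the Neck and Head of YOLOX, in order to improve feature extraction and streamline the network structure. The improved model was tested on both the InfiRay dataset and the FLIR dataset. The experimental results show that the improved YOLOX network achieves higher mean average precision on both datasets (on FLIR, mAP rises from 65.80% for YOLOX to 76.03%; see Table 2), which makes it better suited to target detection in infrared images.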
  • Figure 1. YOLOX's Backbone

    Figure 2. YOLOX's Head

    Figure 3. Transformer blocks

    Figure 4. Window sliding mechanism (left: before sliding, right: after sliding)

    Figure 5. Improved YOLOX model structure

    Figure 6. Optimized CSP layer and Head structure

    Figure 7. The augmented dataset

    Figure 8. Comparison of loss functions on the original InfiRay dataset

    Figure 9. Comparison of loss functions on the augmented InfiRay dataset

    Figure 10. Test results (top: original image, middle: YOLOX, bottom: the improved model)

    Figure 11. mAP and AP values
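
    The window sliding mechanism named in Figure 4 can be illustrated with a small sketch. It assumes the usual Swin Transformer formulation from reference [12], in which the feature map is cyclically shifted by half a window before being split into windows; the helper names `cyclic_shift` and `partition_windows` and the 8x8 toy feature map are hypothetical and are not taken from the paper.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D list up and to the left by `shift` cells, wrapping around."""
    rows = grid[shift:] + grid[:shift]
    return [row[shift:] + row[:shift] for row in rows]


def partition_windows(grid, window):
    """Split a 2D list into non-overlapping window x window tiles (row-major)."""
    height, width = len(grid), len(grid[0])
    tiles = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            tiles.append([row[left:left + window] for row in grid[top:top + window]])
    return tiles


# Toy 8x8 feature map whose cells hold their own flat index (assumed sizes).
size, window = 8, 4
feature_map = [[r * size + c for c in range(size)] for r in range(size)]

# Left panel of Figure 4: regular windows. Right panel: windows after the shift.
regular = partition_windows(feature_map, window)
shifted = partition_windows(cyclic_shift(feature_map, window // 2), window)

print(regular[0][0])  # first row of the first regular window
print(shifted[0][0])  # first row of the first shifted window
print(shifted[3][0])  # the last shifted window wraps around the map edges
```

    After the shift, the first window starts at cell (2, 2) of the original map, so it straddles four of the regular windows and lets information flow between them. The last window mixes cells from opposite edges of the map, which is why the Swin Transformer formulation masks attention between such cells.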

    Table 1. Parameters of the different Swin Transformer variants

    Model    Input dim  Number of heads    Number of blocks per layer
    Swin-T   96         (3, 6, 12, 24)     (2, 2, 6, 2)
    Swin-S   96         (3, 6, 12, 24)     (2, 2, 18, 2)
    Swin-B   128        (4, 8, 16, 32)     (2, 2, 18, 2)
    Swin-L   192        (6, 12, 24, 48)    (2, 2, 18, 2)
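
    The relationship between the columns of Table 1 can be checked with a short sketch. The values are copied from the table; the assumption that the channel width doubles at each of the four stages follows the Swin Transformer design in reference [12] rather than anything stated on this page, and the names `SWIN_VARIANTS`, `stage_channels`, and `summarize` are hypothetical.

```python
# Values copied from Table 1: input dim, heads per stage, blocks per stage.
SWIN_VARIANTS = {
    "Swin-T": {"dim": 96, "heads": (3, 6, 12, 24), "depths": (2, 2, 6, 2)},
    "Swin-S": {"dim": 96, "heads": (3, 6, 12, 24), "depths": (2, 2, 18, 2)},
    "Swin-B": {"dim": 128, "heads": (4, 8, 16, 32), "depths": (2, 2, 18, 2)},
    "Swin-L": {"dim": 192, "heads": (6, 12, 24, 48), "depths": (2, 2, 18, 2)},
}


def stage_channels(dim, num_stages=4):
    """Channel width per stage, assuming the width doubles at every stage."""
    return tuple(dim * 2 ** i for i in range(num_stages))


def summarize(name):
    """Derive per-stage channels, per-head width, and total block count."""
    cfg = SWIN_VARIANTS[name]
    channels = stage_channels(cfg["dim"])
    head_dims = tuple(c // h for c, h in zip(channels, cfg["heads"]))
    return {
        "channels": channels,
        "head_dims": head_dims,
        "total_blocks": sum(cfg["depths"]),
    }


for variant in SWIN_VARIANTS:
    print(variant, summarize(variant))
```

    Under that assumption every variant ends up with 32 channels per attention head in every stage, so the variants differ only in overall width and in the depth of the third stage.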

    Table 2. Comparison with mainstream target detection methods and ablation experiments

    Method                       Dataset            AP/% Car  AP/% Person  AP/% Bicycle/Bus  mAP/%
    Faster R-CNN (VGG16) [20]    FLIR               74.15     62.14        43.58             59.96
    YOLOv3 [20]                  FLIR               77.69     57.47        39.74             58.02
    Improved YOLOv3 [20]         FLIR               81.90     72.60        49.00             66.80
    YOLOv4-Tiny [21]             FLIR               78.65     61.84        32.85             57.78
    Improved YOLOv4-Tiny [21]    FLIR               81.89     69.56        42.04             64.50
    YOLOv5-s [22]                FLIR               87.10     46.80        41.00             58.30
    Improved YOLOv5-s [22]       FLIR               87.50     53.60        44.10             61.70
    YOLOX                        original InfiRay   55.38     11.57        23.04             29.51
    YOLOX                        augmented InfiRay  47.32     33.06        38.44             57.34
    YOLOX                        FLIR               84.70     69.45        43.25             65.80
    +Replaced Backbone           FLIR               87.64     79.23        47.46             71.54
    +Neck and Head optimization  FLIR               84.32     73.23        44.50             67.35
    Our model                    original InfiRay   59.16     19.12        23.43             33.74
    Our model                    augmented InfiRay  85.01     66.21        76.40             79.55
    Our model                    FLIR               87.20     82.06        58.83             76.03
    Note: The third AP column is Bicycle for the FLIR dataset, whose categories are Car, Person, and Bicycle, and Bus for the InfiRay dataset (both before and after augmentation), whose listed categories are Car, Person, and Bus. See Figure 11 for the results of the truck and cyclist categories in the InfiRay dataset.
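
    The mAP gains implied by Table 2 can be worked out with a few lines of arithmetic. The numbers below are copied from the mAP column of the table; the names `MAP_BY_DATASET`, `FLIR_ABLATION`, and `gain` are hypothetical, and the sketch only subtracts reported values without reproducing any experiment.

```python
# mAP values (%) copied from Table 2 for the YOLOX baseline and the improved model.
MAP_BY_DATASET = {
    "original InfiRay": {"YOLOX": 29.51, "Our model": 33.74},
    "augmented InfiRay": {"YOLOX": 57.34, "Our model": 79.55},
    "FLIR": {"YOLOX": 65.80, "Our model": 76.03},
}

# Ablation rows on the FLIR dataset, also copied from Table 2.
FLIR_ABLATION = {
    "+Replaced Backbone": 71.54,
    "+Neck and Head optimization": 67.35,
    "Our model": 76.03,
}


def gain(baseline, value):
    """Absolute mAP difference in percentage points, rounded to two decimals."""
    return round(value - baseline, 2)


for dataset, row in MAP_BY_DATASET.items():
    print(dataset, gain(row["YOLOX"], row["Our model"]))

flir_baseline = MAP_BY_DATASET["FLIR"]["YOLOX"]
for variant, value in FLIR_ABLATION.items():
    print(variant, gain(flir_baseline, value))
```

    On these figures the backbone replacement accounts for the larger share of the FLIR improvement (5.74 points against 1.55 for the Neck and Head optimization alone), and the combined model gains 10.23 points over the YOLOX baseline.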
  • [1] Caniou J. Passive Infrared Detection: Theory and Applications[M]. Springer Science & Business Media, 2013.
    [2] REN Zhang, LI Lu, JIANG Hong. Research on moving target detection algorithm based on infrared image sequence[J]. Infrared and Laser Engineering, 2007, 36(9): 136-140. https://www.cnki.com.cn/Article/CJFDTOTAL-HWYJ2007S2032.htm
    [3] WU Yanru, CHENG Yongmei, ZHAO Yongqiang. Adaboost infrared target detection using KPCA feature extraction[J]. Infrared and Laser Engineering, 2011, 40(2): 338-343. doi: 10.3969/j.issn.1007-2276.2011.02.032
    [4] CHEN Bingwen. Research on Key Technologies of Infrared Imaging Target Detection in Specific Field of View[D]. Wuhan: Wuhan University, 2013.
    [5] James W Davis, Vinay Sharma. Robust background-subtraction for person detection in thermal imagery[C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004: 1-8.
    [6] El Baf Fida, Bouwmans Thierry, Vachon Bertrand. Fuzzy foreground detection for infrared video[C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2008: 1-6.
    [7] YU Jie. Research and Implementation of Night Scene Monitoring Method Based on Infrared Camera[D]. Beijing: Beijing University of Posts and Telecommunications, 2013.
    [8] YI Shi, NIE Yan, ZHANG Yangyi, et al. Night target recognition method based on infrared thermal imaging and YOLOv3[J]. Infrared Technology, 2019, 41(10): 970-975. http://hwjs.nvir.cn/article/id/hwjs201910013
    [9] NIE Ting. Forward Vehicle Recognition and Distance Detection Based on Infrared Image[D]. Xi'an: Xi'an University of Electronic Science and Technology, 2015.
    [10] CHEN Mi. Research and Implementation of Infrared Target Detection Method Based on Deep Learning[D]. Chengdu: University of Electronic Science and Technology of China, 2021.
    [11] SHU Lang, ZHANG Zhijie, LEI Bo. Research on Dense-Yolov5 algorithm for infrared target detection[J]. Optics and Optoelectronics, 2021, 19(1): 69-75. https://www.cnki.com.cn/Article/CJFDTOTAL-GXGD202101010.htm
    [12] LIU Z, LIN Y T, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[J/OL]. arXiv preprint arXiv: 2103.14030.
    [13] GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021[J/OL]. arXiv preprint arXiv: 2107.08430.
    [14] Redmon J, Farhadi A. YOLOv3: an incremental improvement[J/OL]. arXiv preprint arXiv: 1804.02767.
    [15] Bochkovskiy A, WANG C Y, LIAO H M. YOLOv4: optimal speed and accuracy of object detection[J/OL]. arXiv preprint arXiv: 2004.10934.
    [16] LIU Z, MAO H, WU C Y, et al. A ConvNet for the 2020s[J/OL]. arXiv preprint arXiv: 2201.03545.
    [17] WANG Zhouchun, CUI Wennan, ZHANG Tao. Long wave infrared target classification and recognition algorithm based on support vector machine[J]. Infrared Technology, 2021, 43(2): 153-161. http://hwjs.nvir.cn/article/id/73b78f5d-26f5-4da8-8b5c-93aa7a7a40e2
    [18] InfiRay. Double light vehicle scene database[EB/OL]. [2022-04-02]. http://iray.iraytek.com:7813/apply/Double_light_vehicle.html/.
    [19] Flir. FLIR Thermal Data Set[EB/OL]. [2022-04-02]. https://www.flir.com/oem/adas/adas-dataset-form/.
    [20] ZHANG Ruzhen, ZHANG Jianlin, QI Xiaoping, et al. Infrared target detection in complex scenes[J]. Optoelectronic Engineering, 2020, 47(10): 128-137. https://www.cnki.com.cn/Article/CJFDTOTAL-GDGC202010010.htm
    [21] ZHANG Penghui, LIU Zhi, ZHENG Jianyong, et al. Real time infrared target detection algorithm for embedded systems in complex scenes[J]. Acta Photonica Sinica, 2022, 51(2): 203-212. https://www.cnki.com.cn/Article/CJFDTOTAL-GZXB202202021.htm
    [22] SONG Tian, LI Ying, WANG Jing. Improved vehicle infrared image target detection of YOLOv5s[J]. Modern Computer, 2022, 28(2): 21-28. https://www.cnki.com.cn/Article/CJFDTOTAL-XDJS202202003.htm
Publication history
  • Received: 2022-06-10
  • Revised: 2022-08-10
  • Published: 2022-11-20
