A Multi-resolution Feature Extraction Algorithm for Semantic Segmentation of Infrared Images

XU Huilin, ZHAO Xin, YU Bo, WEI Xiaoya, HU Peng

Citation: XU Huilin, ZHAO Xin, YU Bo, WEI Xiaoya, HU Peng. Multi-resolution Feature Extraction Algorithm for Semantic Segmentation of Infrared Images[J]. Infrared Technology, 2024, 46(5): 556-564.


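Funding:

Key Project of the Anhui Provincial Department of Education (KJ2020A0289)

Huainan Science and Technology Plan Project (2020186)

Key Project of the Anhui Provincial Department of Education (2022AH050801)

Young Teachers' Scientific Research Fund of Anhui University of Science and Technology (13200390)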

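Details
    About the author:

    XU Huilin (b. 1999), female, from Anqing, Anhui; master's student. Research interests: object segmentation and object detection. E-mail: 2021201730@aust.edu.cn

    Corresponding author:

    ZHAO Xin (b. 1991), male, from Yuncheng, Shanxi; lecturer, Ph.D. Research interest: machine vision. E-mail: zhaoxin@aust.edu.cn

  • CLC number: TP391.41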


  • Abstract:

    Existing image semantic segmentation algorithms achieve low accuracy on low-resolution infrared images. To address this problem, a multi-resolution feature extraction convolutional neural network is proposed. The network uses DeepLabv3+ as its baseline and adds a dual-resolution module, consisting of a low-resolution branch and a high-resolution branch, to further aggregate infrared image features. The low-resolution branch uses a GPU-friendly attention module to capture high-level global context, and a multi-axis gated multilayer perceptron module in the same branch extracts local and global information from the infrared image in parallel. The high-resolution branch uses a cross-resolution attention module to propagate the global features learned in the low-resolution branch to the high-resolution branch, giving it stronger semantic information. Experimental results show that the proposed algorithm achieves higher segmentation accuracy than existing semantic segmentation algorithms on the DNDS and MSRS datasets, demonstrating its effectiveness.
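The idea of propagating global features from the low-resolution branch to the high-resolution branch can be illustrated with a minimal sketch. This is not the paper's cross-resolution attention module: it assumes plain single-head scaled dot-product cross-attention with no learned projections, and the names `cross_resolution_attention`, `high_tokens`, and `low_tokens` are hypothetical. Each high-resolution token acts as a query over the low-resolution tokens, and the attended context is added back as a residual.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_resolution_attention(high_tokens, low_tokens):
    """Fuse low-resolution context into high-resolution tokens.

    Assumption (not from the paper): queries are the raw high-resolution
    tokens, keys and values are the raw low-resolution tokens, and the
    result is added to the query as a residual.
    """
    dim = len(high_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in high_tokens:
        # Similarity of this high-resolution token to every low-resolution token.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in low_tokens]
        weights = softmax(scores)
        # Weighted average of the low-resolution tokens (the global context).
        context = [
            sum(w * value[j] for w, value in zip(weights, low_tokens))
            for j in range(dim)
        ]
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy input: 4 high-resolution tokens, 2 low-resolution tokens, 2 channels each.
high = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
low = [[0.5, 0.5], [0.5, 0.5]]
fused = cross_resolution_attention(high, low)
```

The output keeps the high-resolution token count, so the spatial detail of that branch is preserved while every position receives a summary of the low-resolution features.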

  • Figure 1.   DeepLabv3+ network structure

    Figure 2.   ASPP module network structure

    Figure 3.   MRFE-CNN network structure

    Figure 4.   MRBlock structure

    Figure 5.   Multi-head attention modules: Multi-head EA (a) and GFA (b)

    Figure 6.   MAGBlock structure

    Figure 7.   Sample images from the DNDS dataset. (a) Infrared image; (b) ground-truth semantic label

    Figure 8.   Number of labels per class in the DNDS dataset

    Figure 9.   Comparison of training metrics. (a) Loss curves; (b) MIOU curves

    Figure 10.   Comparison of test-set results. (a) Original image; (b) DeepLabv3+; (c) MRFE-CNN; (d) ground-truth label

    Figure 11.   Comparison of training metrics. (a) Loss curves; (b) MIOU curves

    Figure 12.   Comparison of test-set results. (a) Original image; (b) DeepLabv3+; (c) MRFE-CNN; (d) ground-truth label

    Table 1   Experimental hardware configuration

    CPU                                  GPU               Memory   System
    2.60 GHz Intel Xeon Platinum 8350C   NVIDIA RTX 3090   24 GB    Linux

    Table 2   Ablation experiment of MRBlock

    Models Residual Block Basic Block MAGBlock ASPP Input_feature MPA% MIOU%
    Model_1 - - - x_l 90.8 81.07
    Model_2 - - x_l+x_h 91.13 81.7
    Model_3 *1 - x_l+x_h 91.91 82.6
    Model_4 *2 - x_l+x_h 92.36 83.43
    Model_6 *2 x_l+x_h 92.72 84.1
    Model_7 *2 x_l 92.31 82.1
    Model_8 *2 x_h 91.1 83.7

    Table 3   Performance comparison of popular algorithms

    Methods      MPA%            MIOU%
    FCN-8s       8.16            5.74
    FCN-16s      63.32           48.3
    FCN-32s      58.37           42.32
    U-Net        78.12           72.35
    DUC          78.35           69.61
    DeepLabv3+   90.4            82.3
    MRFE-CNN     92.72 (+2.32)   84.1 (+1.8)

    Table 4   Performance comparison on the public MSRS dataset

    Methods      MPA%            MIOU%
    DeepLabv3+   68.43           56.37
    MRFE-CNN     71.82 (+3.39)   58.14 (+1.77)
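The MPA and MIOU columns in the tables above are conventionally computed from a confusion matrix. The sketch below uses the standard definitions (mean per-class pixel accuracy and mean intersection over union) and assumes the paper follows them. The function name `mpa_and_miou` and the toy matrix are hypothetical, not data from the paper.

```python
def mpa_and_miou(confusion):
    """Return (MPA, MIoU) as fractions in [0, 1].

    confusion[i][j] is the number of pixels whose true class is i and
    whose predicted class is j. Classes that never occur are skipped.
    """
    n = len(confusion)
    accuracies, ious = [], []
    for c in range(n):
        true_positive = confusion[c][c]
        true_total = sum(confusion[c])                       # pixels labelled c
        pred_total = sum(confusion[r][c] for r in range(n))  # pixels predicted as c
        union = true_total + pred_total - true_positive
        if true_total > 0:
            accuracies.append(true_positive / true_total)
        if union > 0:
            ious.append(true_positive / union)
    return sum(accuracies) / len(accuracies), sum(ious) / len(ious)


# Toy two-class example (illustrative numbers only).
confusion = [[8, 2],
             [1, 9]]
mpa, miou = mpa_and_miou(confusion)
```

Multiplying both values by 100 gives percentages in the form reported in the tables.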

Publication history
  • Received:  2023-08-10
  • Revised:  2023-09-21
  • Published online:  2024-05-23
  • Issue date:  2024-05-19
