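Multi-resolution Feature Extraction Algorithm for Semantic Segmentation of Infrared Images

XU Huilin¹, ZHAO Xin¹,², YU Bo¹, WEI Xiaoya¹, HU Peng¹,²

1. School of Artificial Intelligence, Anhui University of Science and Technology, Huainan, Anhui 232000, China
2. State Key Laboratory of Mining Response and Disaster Prevention and Control in Deep Coal Mines, Anhui University of Science and Technology, Huainan, Anhui 232000, China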

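Funding:

Key Project of the Anhui Provincial Department of Education, KJ2020A0289

Huainan Science and Technology Plan Project, 2020186

Key Project of the Anhui Provincial Department of Education, 2022AH050801

Young Teachers' Scientific Research Fund of Anhui University of Science and Technology, 13200390

About the author:
XU Huilin (b. 1999), female, from Anqing, Anhui; master's student; research interests: object segmentation and object detection. E-mail: 2021201730@aust.edu.cn

Corresponding author:
ZHAO Xin (b. 1991), male, from Yuncheng, Shanxi; lecturer, Ph.D.; research interest: machine vision. E-mail: zhaoxin@aust.edu.cn

CLC number: TP391.41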
Publication history
- Received: 2023-08-10
- Revised: 2023-09-21
- Published online: 2024-05-23
- Issue date: 2024-05-19


Abstract

Existing image semantic segmentation algorithms achieve low accuracy, particularly along object edges, when applied to low-resolution infrared images. To address this problem, a multi-resolution feature extraction convolutional neural network (MRFE-CNN) is proposed. The network uses DeepLabv3+ as the baseline and adds a multi-resolution block (MRBlock), which contains a low-resolution branch and a high-resolution branch, to further aggregate infrared image features. In the low-resolution branch, a GPU-friendly attention module captures high-level global context information, and a multi-axis gated multilayer perceptron module extracts the local and global information of the infrared image in parallel. In the high-resolution branch, a cross-resolution attention module propagates the global features learned in the low-resolution branch to the high-resolution branch, so that the high-resolution branch obtains stronger semantic information. Experimental results show that the proposed algorithm achieves higher segmentation accuracy than existing semantic segmentation algorithms on the DNDS and MSRS datasets, which demonstrates its effectiveness.

Keywords: multi-resolution block, semantic segmentation, DeepLabv3+, infrared image, attention module
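The abstract describes a cross-resolution attention step that passes global features from the low-resolution branch to the high-resolution branch. As a rough illustration only, the sketch below uses generic scaled dot-product cross-attention, with queries taken from high-resolution tokens and keys and values taken from low-resolution tokens. It is not the authors' module: the function name `cross_resolution_attention`, the absence of learned projections, and the residual addition are all assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_resolution_attention(high_tokens, low_tokens):
    """Generic cross-attention (hypothetical helper, not the paper's code).

    high_tokens: list of feature vectors from the high-resolution branch (queries).
    low_tokens:  list of feature vectors from the low-resolution branch (keys and values).
    Returns one fused vector per high-resolution token.
    """
    dim = len(high_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in high_tokens:
        # Similarity between this high-resolution token and every low-resolution token.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in low_tokens]
        weights = softmax(scores)
        # Weighted sum of low-resolution features: the global context for this token.
        context = [sum(w * value[c] for w, value in zip(weights, low_tokens))
                   for c in range(dim)]
        # Assumption: a residual connection keeps the original high-resolution detail.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy input: three high-resolution tokens and two low-resolution tokens (made-up values).
high = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
low = [[0.5, 0.5], [2.0, -1.0]]
fused = cross_resolution_attention(high, low)
print(len(fused), len(fused[0]))
```

Each high-resolution token keeps its own features and adds a context vector that is a convex combination of the low-resolution features, which is the sense in which global information is "propagated" to the high-resolution branch in this simplified view.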

Figure 1. DeepLabv3+ network structure

Figure 2. ASPP module network structure

Figure 3. MRFE-CNN network structure

Figure 4. MRBlock structure

Figure 5. Multi-head attention modules: (a) Multi-head EA, (b) GFA

Figure 6. MAGBlock structure

Figure 7. Sample images from the DNDS dataset: (a) infrared images, (b) ground-truth semantic labels

Figure 8. Number of labels per class in the DNDS dataset

Figure 9. Comparison of training metrics: (a) loss curves, (b) MIoU curves

Figure 10. Comparison of test-set results: (a) original image, (b) DeepLabv3+, (c) MRFE-CNN, (d) ground-truth label

Figure 11. Comparison of training metrics: (a) loss curves, (b) MIoU curves

Figure 12. Comparison of test-set results: (a) original image, (b) DeepLabv3+, (c) MRFE-CNN, (d) ground-truth label

Table 1 Experimental hardware configuration

CPU	GPU	Memory	System
Intel Xeon Platinum 8350C, 2.60 GHz	NVIDIA RTX 3090	24 GB	Linux

Table 2 Ablation experiment of MRBlock

Models	Residual Block	Basic Block	MAGBlock	ASPP	Input feature	MPA (%)	MIoU (%)
Model_1	√	-	-	-	x_l	90.8	81.07
Model_2	√	√	-	-	x_l+x_h	91.13	81.7
Model_3	√	√	×1	-	x_l+x_h	91.91	82.6
Model_4	√	√	×2	-	x_l+x_h	92.36	83.43
Model_6	√	√	×2	√	x_l+x_h	92.72	84.1
Model_7	√	√	×2	√	x_l	92.31	82.1
Model_8	√	√	×2	√	x_h	91.1	83.7

Table 3 Performance comparison of popular algorithms

Methods	MPA (%)	MIoU (%)
FCN-8s	8.16	5.74
FCN-16s	63.32	48.3
FCN-32s	58.37	42.32
U-Net	78.12	72.35
DUC	78.35	69.61
DeepLabv3+	90.4	82.3
MRFE-CNN	92.72 (+2.32)	84.1 (+1.8)
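The tables report MPA and MIoU, but this excerpt does not give their formulas. The sketch below shows how the two metrics are commonly computed from a confusion matrix, under the assumption that the paper follows the standard definitions (per-class pixel accuracy and per-class intersection over union, each averaged over classes). The function names and the toy labels are illustrative and are not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1
    return conf


def mean_pixel_accuracy(conf):
    """Mean over classes of (correctly labelled pixels) / (ground-truth pixels)."""
    accuracies = []
    for i, row in enumerate(conf):
        total = sum(row)
        if total > 0:  # assumption: classes absent from the ground truth are skipped
            accuracies.append(row[i] / total)
    return sum(accuracies) / len(accuracies)


def mean_iou(conf):
    """Mean over classes of intersection / union."""
    n = len(conf)
    ious = []
    for i in range(n):
        true_positive = conf[i][i]
        ground_truth = sum(conf[i])
        predicted = sum(conf[r][i] for r in range(n))
        union = ground_truth + predicted - true_positive
        if union > 0:  # assumption: classes with an empty union are skipped
            ious.append(true_positive / union)
    return sum(ious) / len(ious)


# Toy example: two classes, ten pixels (made-up labels, not data from the paper).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]
conf = confusion_matrix(y_true, y_pred, num_classes=2)
mpa = mean_pixel_accuracy(conf)
miou = mean_iou(conf)
print(conf, round(100 * mpa, 2), round(100 * miou, 2))
```

MIoU penalises both missed pixels and false alarms for each class, which is one reason it is usually lower than MPA for the same predictions, as is the case for every method in the table.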

Table 4 Performance comparison on the public MSRS dataset

Methods	MPA (%)	MIoU (%)
DeepLabv3+	68.43	56.37
MRFE-CNN	71.82 (+3.39)	58.14 (+1.77)
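The bracketed values in Tables 3 and 4 appear to be percentage-point gains of MRFE-CNN over the DeepLabv3+ baseline. The short check below reproduces them from the tabulated numbers. The dictionary layout and the helper name `gains` are illustrative, and `Decimal` is used only to avoid binary floating-point rounding noise.

```python
from decimal import Decimal

# (MPA %, MIoU %) copied from Tables 3 and 4; DeepLabv3+ is the baseline.
results = {
    "Table 3": {"DeepLabv3+": ("90.4", "82.3"), "MRFE-CNN": ("92.72", "84.1")},
    "Table 4": {"DeepLabv3+": ("68.43", "56.37"), "MRFE-CNN": ("71.82", "58.14")},
}


def gains(table):
    """Percentage-point difference between MRFE-CNN and the baseline, per metric."""
    baseline = table["DeepLabv3+"]
    proposed = table["MRFE-CNN"]
    return tuple(Decimal(p) - Decimal(b) for p, b in zip(proposed, baseline))


for name, table in results.items():
    mpa_gain, miou_gain = gains(table)
    print(f"{name}: MPA +{mpa_gain}, MIoU +{miou_gain}")
```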
