Multi-resolution Feature Extraction Algorithm for Semantic Segmentation of Infrared Images
-
摘要:
针对现有图像语义分割算法在对低分辨率红外图像进行分割时存在准确率不高的问题,提出了一种多分辨率特征提取算法。该算法以DeepLabv3+为基准网络,添加了一组对偶分辨率模块,该模块包含低分辨率分支和高分辨率分支,以进一步聚合红外图像特征。低分辨率分支采用GPU友好的注意力模块捕获高层全局上下文信息,同时引入一个多轴门控感知机模块并行提取红外图像局部信息和全局信息;高分辨率分支采用跨分辨率注意力模块将低分辨率分支上学习到的全局特征传播扩散到高分辨率分支上以获取更强的语义信息。实验结果表明,该算法在数据集DNDS和MSRS上的分割精度优于现有语义分割算法,证明了提出算法的有效性。
-
关键词:
- 对偶分辨率模块 /
- 语义分割 /
- DeepLabv3+ /
- 红外图像 /
- 注意力模块
Abstract:A multi-resolution feature extraction convolution neural network is proposed for the problem of inaccurate edge segmentation when existing image semantic segmentation algorithms process low-resolution infrared images. DeepLabv3+ is used as the baseline network and adds a multi-resolution block, which contains both high and low resolution branches, to further aggregate the features in infrared images. In the low-resolution branch, a GPU friendly attention module is used to capture high-level global context information, and a multi-axis-gated multilayer perceptron module is added in this branch to extract the local and global information of infrared images in parallel. In the high resolution branch, the cross-attention module is used to propagate the global features learned on the low resolution branch to the high resolution branch, hence the high resolution branch can obtain stronger semantic information. The experimental results indicate that the segmentation accuracy of the algorithm on the dataset DNDS is better than that of the existing semantic segmentation algorithm, demonstrating the superiority of the proposed method.
-
Keywords:
- multi resolution block /
- semantic segmentation /
- deepLabv3+ /
- infrared image /
- attention module
-
-
表 1 实验硬件配置
Table 1 Experimental hardware configuration
CPU GPU Memory System 2.60GHz Intel Xeon Platinum 8350C CPU NVIDIA RTX 3090 24GB Linux 表 2 MRBlock模块消融实验
Table 2 Ablation experiment of MRBlock
Models Residual Block Basic Block MAGBlock ASPP Input_feature MPA% MIOU% Model_1 √ - - - x_l 90.8 81.07 Model_2 √ √ - - x_l+x_h 91.13 81.7 Model_3 √ √ *1 - x_l+x_h 91.91 82.6 Model_4 √ √ *2 - x_l+x_h 92.36 83.43 Model_6 √ √ *2 √ x_l+x_h 92.72 84.1 Model_7 √ √ *2 √ x_l 92.31 82.1 Model_8 √ √ *2 √ x_h 91.1 83.7 表 3 流行算法性能比较
Table 3 Performance comparison of popular algorithms
Methods MPA% MIOU% FCN-8s 8.16 5.74 FCN-16s 63.32 48.3 FCN-32s 58.37 42.32 U-Net 78.12 72.35 DUC 78.35 69.61 DeepLabv3+ 90.4 82.3 MRFE-CNN 92.72(+2.32) 84.1(+1.8) 表 4 公共数据集MSRS性能比较
Table 4 Performance comparison of MSRS
Methods MPA% MIOU% DeepLabv3+ 68.43 56.37 MRFE-CNN 71.82(+3.39) 58.14(+1.77) -
[1] 刘致驿, 孙韶媛, 任正云, 等. 基于改进DeepLabv3+的无人车夜间红外图像语义分割[J]. 应用光学, 2020, 41(1): 180-185. https://www.cnki.com.cn/Article/CJFDTOTAL-YYGX202001031.htm LIU Zhiyi, SUN Shaoyuan, REN Zhengyun, et al. Semantic segmentation of nocturnal infrared images of unmanned vehicles based on improved DeepLabv3+[J]. Journal of Applied Optics, 2020, 41(1): 180-185. https://www.cnki.com.cn/Article/CJFDTOTAL-YYGX202001031.htm
[2] 夏威. 基于卷积神经网络的热红外图像语义分割研究[D]. 合肥: 安徽大学, 2020. XIA Wei. Thermal Image Semantic Segmentation Based on Convolutional Neural Networks[D]. Hefei: Anhui University, 2020.
[3] 景庄伟, 管海燕, 彭代峰, 等. 基于深度神经网络的图像语义分割研究综述[J]. 计算机工程, 2020, 46(10): 1-17. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG202208003.htm JING Zhuangwei, GUAN Haiyan, PENG Daifeng, et al. Survey of research in image semantic segmentation based on deep neural network[J] Computer Engineering, 2020, 46(10): 1-17. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG202208003.htm
[4] ZHAO L, WANG M, YUE Y. Sem-aug: improving camera-lidar feature fusion with semantic augmentation for 3d vehicle detection[J]. IEEE Robotics and Automation Letters, 2022, 7(4): 9358-9365. DOI: 10.1109/LRA.2022.3191208
[5] WANG J, LIU L, LU M, et al. The estimation of broiler respiration rate based on the semantic segmentation and video amplification[J]. Frontiers in Physics, 2022, 10: 1-13.
[6] XUE Z, MAO W, ZHENG L. Learning to simulate complex scenes for street scene segmentation[J]. IEEE Transactions on Multimedia, 2021, 24: 1253-1265.
[7] WANG Y, TIAN S, YU L, et al. FSOU-Net: Feature supplement and optimization U-Net for 2D medical image segmentation[J]. Technology and Health Care, 2023, 31(1): 181-195. DOI: 10.3233/THC-220174
[8] 郭尹. 基于深度学习的电力设备热红外图像语义分割方法研究[D]. 合肥: 安徽大学, 2022. GUO Yin. Research on Electrical Thermal Image Semantic Segmentation Method Based on Deep Learning[D]. Hefei: Anhui University, 2022.
[9] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
[10] Adrinarayanan V, Kendall A, Cipolla R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495. DOI: 10.1109/TPAMI.2016.2644615
[11] OLAF R, PHILIPP F, THOMAS B. U-Net: Convolutional networks for biomedical image segmentation[J]. CoRR, 2015, abs/1505.04597.
[12] ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2017: 2881-2890.
[13] LIN G, MILAN A, SHEN C, et al. Refinenet: Multi-path refinement networks for high-resolution semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1925-1934.
[14] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFS[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834-848.
[15] CHEN L C, ZHU Y, Papandreou G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation [C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 801-818.
[16] Chollet F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1251-1258.
[17] 练琤, 张宝辉, 江云峰, 等. 基于语义分割的红外图像增强方法[J]. 红外技术, 2023, 45(4): 394-401. http://hwjs.nvir.cn/cn/article/id/012a14e0-e0f5-4854-94fa-7b0392f63498?viewType=HTML LIAN Zheng, ZHANG Baohui, JIANG Yunfeng, et al. An infrared image enhancement method based on semantic segmentation[J]. Infrared Technology, 2023, 45(4): 394-401. http://hwjs.nvir.cn/cn/article/id/012a14e0-e0f5-4854-94fa-7b0392f63498?viewType=HTML
[18] WANG J, GOU C, WU Q, et al. RTFormer: efficient design for real-time semantic segmentation with transformer[J]. arXiv e-prints, 2022: arXiv: 2210.07124.
[19] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. arXiv, 2017. DOI: 10.48550/arXiv.1706.03762.
[20] TU Z, TALEBI H, ZHANG H, et al. Maxim: Multi-axis MLP for image processing[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 5769-5780.
[21] JADON S. A survey of loss functions for semantic segmentation[C]//IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). IEEE, 2020: 1-7.
[22] Sandler M, Howard A, ZHU M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 4510-4520.
[23] 于营, 王春平, 付强, 等. 语义分割评价指标和评价方法综述[J]. 计算机工程与应用, 2023, 59(6): 13. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG202306005.htm YU Ying, WANG Chunping, FU Qiang, et al. Survey of evaluation metrics and methods for semantic segmentation[J]. Journal of Computer Engineering & Applications, 2023, 59(6): 13. https://www.cnki.com.cn/Article/CJFDTOTAL-JSGG202306005.htm
[24] WANG P, CHEN P, YUAN Y, et al. Understanding convolution for semantic segmentation[C]//IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018: 1451-1460.
-
期刊类型引用(1)
1. 褚萌. 累计降水量质控方法设计及应用. 中国新技术新产品. 2024(17): 136-138+142 . 百度学术
其他类型引用(0)