Underwater Image Enhancement Algorithm Combining Transformer and Generative Adversarial Network

Yuan Hongchun¹, Zhang Bo¹ (corresponding author), Cheng Xin²

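1. School of Information, Shanghai Ocean University, Shanghai 201306, China
2. School of Marine Living Resource Sciences and Management, Shanghai Ocean University, Shanghai 201306, China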

Funding: National Natural Science Foundation of China (41776142)

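About the author:
Yuan Hongchun (b. 1971), male, Ph.D., professor. His main research interest is intelligent information processing.

Corresponding author:
Zhang Bo (b. 1996), male, master's student. His main research interest is underwater image enhancement. E-mail: 806232525@qq.com

CLC number: TP391.4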
Publication history
- Received: 2023-04-08
- Revised: 2023-05-16
- Published: 2024-09-19


Abstract:
Owing to the diversity of underwater environments and the scattering and selective absorption of light in water, captured underwater images usually suffer from severe quality degradation, such as color deviation, low clarity, and low brightness. To address these problems, this paper proposes an underwater image enhancement algorithm based on a Transformer and a generative adversarial network. With a generative adversarial network as the base architecture, the TGAN (generative adversarial network with transformer) enhancement model is built by combining an encoder-decoder structure, a global feature modeling Transformer module based on the spatial self-attention mechanism, and a channel-level multi-scale feature fusion Transformer module. The model focuses on the color channels and spatial regions in which underwater images are more severely attenuated, which effectively enhances image details and corrects color deviation. In addition, a multi-term loss function combining the RGB and LAB color spaces is designed to constrain the adversarial training of the enhancement model. Experimental results show that, compared with typical underwater image enhancement algorithms such as contrast limited adaptive histogram equalization (CLAHE), underwater dark channel prior (UDCP), the underwater convolutional neural network (UWCNN), and fast underwater image enhancement for improved visual perception (FUnIE-GAN), the proposed algorithm improves the clarity, detail texture, and color rendition of the enhanced images. The average values of the objective evaluation metrics, namely the peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and underwater image quality measure (UIQM), improve by 5.8%, 1.8%, and 3.6%, respectively, so the algorithm effectively improves the visual perception of underwater images.

Keywords: image processing; underwater image enhancement; Transformer; generative adversarial network; multi-term loss function
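The abstract refers to a Transformer module built on spatial self-attention. As a rough illustration of that general mechanism only, the sketch below applies standard scaled dot-product self-attention across the pixel positions of a single feature map. It is not the paper's module: the function name `spatial_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the single-head, NumPy-only formulation are all assumptions made for this example.

```python
import numpy as np

def spatial_self_attention(feat, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention over spatial positions.

    feat: feature map of shape (C, H, W).
    w_q, w_k, w_v: hypothetical projection matrices of shape (C, D).
    Returns a feature map of shape (D, H, W).
    """
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T            # (N, C): one token per pixel
    q = tokens @ w_q                             # (N, D) queries
    k = tokens @ w_k                             # (N, D) keys
    v = tokens @ w_v                             # (N, D) values
    scores = q @ k.T / np.sqrt(q.shape[1])       # (N, N) pairwise similarities
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # each row sums to 1
    out = weights @ v                            # every pixel mixes all pixels
    return out.T.reshape(-1, h, w)               # back to (D, H, W)

# Usage on a small random feature map (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(4, 3, 5))
proj = rng.normal(size=(4, 2))
attended = spatial_self_attention(feature_map, proj, proj, proj)
print(attended.shape)
```

Because every output position is a weighted average over all input positions, a layer of this kind can relate distant image regions in one step, which is the usual motivation for using attention for global feature modeling.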

Figure 1. Generative adversarial network structure

Figure 2. A single-layer Transformer model

Figure 3. TGAN network structure

Figure 4. GFMT module structure

Figure 5. Detailed structure of the MSFFT module

Figure 6. Qualitative comparison of ablation experiments. (a) Underwater images; (b) BL; (c) BL+GFMT; (d) BL+MSFFT; (e) BL+Loss_LAB; (f) TGAN; (g) Reference images

Figure 7. Qualitative comparison of different methods on Test-1. (a) Underwater images; (b) CLAHE; (c) RGHS; (d) UDCP; (e) IBLA; (f) UWCNN; (g) FUnIE-GAN; (h) DGD-cGAN; (i) Our method; (j) Reference images

Figure 8. Qualitative comparison of different methods on Test-2. (a) Underwater images; (b) CLAHE; (c) RGHS; (d) UDCP; (e) IBLA; (f) UWCNN; (g) FUnIE-GAN; (h) DGD-cGAN; (i) Our method

Table 1. Results of the ablation study on Test-1

Model	PSNR (dB)	SSIM
BL	19.2556	0.7014
BL+GFMT	21.6849	0.7635
BL+MSFFT	22.3719	0.7813
BL+Loss_LAB	21.4161	0.7281
TGAN	24.0546	0.8257

Table 2. Quantitative comparison of different methods on Test-1

Method	PSNR (dB)	SSIM
CLAHE	18.4342	0.7653
RGHS	18.2053	0.7672
UDCP	14.0555	0.5650
IBLA	19.9222	0.7487
UWCNN	18.1209	0.7420
FUnIE-GAN	22.7413	0.8112
DGD-cGAN	17.3954	0.6955
TGAN	24.0546	0.8257
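The PSNR column in Tables 1 and 2 compares each enhanced image with its reference image. As a reminder of what that number measures, here is a minimal sketch of the standard PSNR definition, assuming 8-bit images (peak value 255) held in NumPy arrays. The function name `psnr` and the `max_value` parameter are illustrative and are not taken from the paper's evaluation code.

```python
import numpy as np

def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    enhanced = np.asarray(enhanced, dtype=np.float64)
    mse = np.mean((reference - enhanced) ** 2)   # mean squared error
    if mse == 0:
        return float("inf")                      # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Usage: an image that differs from its reference by 1 gray level everywhere.
ref = np.zeros((4, 4), dtype=np.uint8)
out = np.ones((4, 4), dtype=np.uint8)
value = psnr(ref, out)
print(round(value, 2))
```

Since PSNR is logarithmic in the mean squared error, a gain of about 1.3 dB, such as the gap between TGAN and FUnIE-GAN in Table 2, corresponds to roughly a 26% reduction in mean squared error.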

Table 3. Quantitative comparison of different methods on Test-2 (higher UCIQE and UIQM are better; lower NIQE is better)

Method	UCIQE	UIQM	NIQE
CLAHE	0.4516	3.1570	6.5814
RGHS	0.4673	2.4674	6.4705
UDCP	0.4216	2.0992	5.7852
IBLA	0.4731	2.3331	5.7619
UWCNN	0.3508	3.0378	6.7935
FUnIE-GAN	0.4314	3.0997	6.2796
DGD-cGAN	0.3689	3.1810	7.2689
TGAN	0.4846	3.2963	5.7743
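The abstract reports average gains of 5.8%, 1.8%, and 3.6% in PSNR, SSIM, and UIQM without naming the baseline. The short check below assumes the baseline is the best competing method for each metric in Tables 2 and 3 (FUnIE-GAN for PSNR and SSIM, DGD-cGAN for UIQM). That reading is an inference from the tables, not something the page states. The helper `relative_gain` is a hypothetical name.

```python
# TGAN scores copied from Tables 2 and 3.
tgan = {"PSNR": 24.0546, "SSIM": 0.8257, "UIQM": 3.2963}

# Assumed baselines: the strongest competing method per metric.
best_baseline = {"PSNR": 22.7413, "SSIM": 0.8112, "UIQM": 3.1810}

def relative_gain(new, old):
    """Percentage improvement of `new` over `old`."""
    return 100.0 * (new - old) / old

gains = {m: round(relative_gain(tgan[m], best_baseline[m]), 1) for m in tgan}
print(gains)  # {'PSNR': 5.8, 'SSIM': 1.8, 'UIQM': 3.6}
```

Under this assumption the computed gains match the percentages quoted in the abstract.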

