Underwater Image Enhancement Algorithm Combining Transformer and Generative Adversarial Network
-
-
Keywords:
- image processing
- underwater image enhancement
- Transformer
- generative adversarial network
- multinomial loss function
Abstract: Owing to the diversity of underwater environments and the scattering and selective absorption of light in water, acquired underwater images usually suffer from severe quality degradation, such as color deviation, low clarity, and low brightness. To address these problems, an underwater image enhancement algorithm combining a Transformer and a generative adversarial network is proposed. Using a generative adversarial network as the base architecture, a generative adversarial network with Transformer (TGAN) enhancement model is constructed by combining an encoder-decoder structure, a global feature modeling Transformer module based on spatial self-attention, and a channel-level multi-scale feature fusion Transformer module. The model focuses on the color channels and spatial regions where underwater attenuation is most severe, effectively enhancing image details and correcting color deviation. Additionally, a multinomial loss function combining the RGB and LAB color spaces is designed to constrain the adversarial training of the enhancement model. Experimental results demonstrate that, compared with typical underwater image enhancement algorithms such as contrast-limited adaptive histogram equalization (CLAHE), underwater dark channel prior (UDCP), the underwater convolutional neural network (UWCNN), and fast underwater image enhancement for improved visual perception (FUnIE-GAN), the proposed algorithm improves the clarity, detail texture, and color rendition of underwater images. Specifically, the average values of the objective evaluation metrics, including peak signal-to-noise ratio, structural similarity, and the underwater image quality measure, improve by 5.8%, 1.8%, and 3.6%, respectively. The proposed algorithm effectively improves the visual perception of underwater images.
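The abstract describes the RGB+LAB multinomial loss only at a high level. The following is a minimal sketch of how such a combined color-space loss could be computed, assuming an L1 term in each space, a standard sRGB-to-CIELAB conversion, and hypothetical weights `w_rgb` and `w_lab`; the paper's actual loss also includes adversarial and other terms not shown here:

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert sRGB values in [0, 1], shape (H, W, 3), to CIELAB (D65)."""
    # sRGB -> linear RGB (inverse gamma)
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # linear RGB -> XYZ (standard sRGB/D65 matrix)
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = lin @ m.T
    xyz = xyz / np.array([0.95047, 1.0, 1.08883])  # normalize by white point
    # XYZ -> LAB
    eps = (6 / 29) ** 3
    f = np.where(xyz > eps, np.cbrt(xyz), xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def color_space_loss(pred, ref, w_rgb=1.0, w_lab=0.1):
    """L1 loss in RGB plus L1 loss in LAB; weights are illustrative only."""
    loss_rgb = np.abs(pred - ref).mean()  # pixel-wise intensity error
    loss_lab = np.abs(rgb_to_lab(pred) - rgb_to_lab(ref)).mean()  # color error
    return w_rgb * loss_rgb + w_lab * loss_lab
```

The LAB term penalizes perceptual color deviation directly, which is why combining the two spaces can help with the color casts typical of underwater scenes.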
-
Figure 7. Qualitative comparison of different methods on Test-1. (a) Underwater images; (b) CLAHE; (c) RGHS; (d) UDCP; (e) IBLA; (f) UWCNN; (g) FUnIE-GAN; (h) DGD-cGAN; (i) our method; (j) reference images
Table 1 Experimental results of ablation study on Test-1
Models       PSNR     SSIM
BL           19.2556  0.7014
BL+GFMT      21.6849  0.7635
BL+MSFFT     22.3719  0.7813
BL+LossLAB   21.4161  0.7281
TGAN         24.0546  0.8257
Table 2 Quantitative comparison of different methods on Test-1
Methods      PSNR     SSIM
CLAHE        18.4342  0.7653
RGHS         18.2053  0.7672
UDCP         14.0555  0.5650
IBLA         19.9222  0.7487
UWCNN        18.1209  0.7420
FUnIE-GAN    22.7413  0.8112
DGD-cGAN     17.3954  0.6955
TGAN         24.0546  0.8257
Table 3 Quantitative comparison of different methods on Test-2
Methods      UCIQE    UIQM     NIQE
CLAHE        0.4516   3.1570   6.5814
RGHS         0.4673   2.4674   6.4705
UDCP         0.4216   2.0992   5.7852
IBLA         0.4731   2.3331   5.7619
UWCNN        0.3508   3.0378   6.7935
FUnIE-GAN    0.4314   3.0997   6.2796
DGD-cGAN     0.3689   3.1810   7.2689
TGAN         0.4846   3.2963   5.7743
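The full-reference metrics in Tables 1 and 2 follow their standard definitions. As a minimal sketch, PSNR for images normalized to [0, 1] can be computed as follows (SSIM, UCIQE, UIQM, and NIQE involve more elaborate perceptual models and are omitted):

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, peak]."""
    mse = np.mean((np.asarray(ref, dtype=np.float64)
                   - np.asarray(test, dtype=np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

For example, a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and hence a PSNR of 20 dB.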
[1] YANG M, HU J T, LI C Y, et al. An in-depth survey of underwater image enhancement and restoration[J]. IEEE Access, 2019, 7: 123638-123657. DOI: 10.1109/ACCESS.2019.2932611
[2] ANWAR S, LI C Y. Diving deeper into underwater image enhancement: a survey[J]. Signal Processing: Image Communication, 2020, 89: 115978. DOI: 10.1016/j.image.2020.115978
[3] Islam M J, XIA Y, Sattar J. Fast underwater image enhancement for improved visual perception[J]. IEEE Robotics and Automation Letters, 2020, 5(2): 3227-3234. DOI: 10.1109/LRA.2020.2974710
[4] JIN W P, GUO J C, QI Q. Underwater image enhancement based on conditional generative adversarial network[J]. Laser & Optoelectronics Progress, 2020, 57(14): 141002. (in Chinese)
[5] Hitam M S, Awalludin E A, Yussof W N J H W, et al. Mixture contrast limited adaptive histogram equalization for underwater image enhancement[C]//International Conference on Computer Applications Technology (ICCAT), 2013: 1-5.
[6] HUANG D M, WANG Y, SONG W, et al. Shallow-water image enhancement using relative global histogram stretching based on adaptive parameter acquisition[C]//24th International Conference on MultiMedia Modeling (MMM), 2018: 453-465.
[7] Drews P Jr, Nascimento E, Moraes F, et al. Transmission estimation in underwater single images[C]//IEEE International Conference on Computer Vision Workshops (ICCVW), 2013: 825-830.
[8] PENG Y T, Cosman P C. Underwater image restoration based on image blurriness and light absorption[J]. IEEE Transactions on Image Processing, 2017, 26(4): 1579-1594. DOI: 10.1109/TIP.2017.2663846
[9] LI C Y, Anwar S, Porikli F. Underwater scene prior inspired deep underwater image and video enhancement[J]. Pattern Recognition, 2020, 98: 107038. DOI: 10.1016/j.patcog.2019.107038
[10] Gonzalez-Sabbagh S, Robles-Kelly A, Gao S. DGD-cGAN: a dual generator for image dewatering and restoration[J]. arXiv preprint arXiv:2211.10026, 2022.
[11] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems, 2017, 30: 5998-6008.
[12] LI C, Wand M. Precomputed real-time texture synthesis with Markovian generative adversarial networks[C]//Computer Vision–ECCV, 2016: 702-716.
[13] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778.
[14] WANG H, CAO P, WANG J, et al. UCTransNet: rethinking the skip connections in U-Net from a channel-wise perspective with transformer[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(3): 2441-2449.
[15] Ulyanov D, Vedaldi A, Lempitsky V. Instance normalization: the missing ingredient for fast stylization[J]. arXiv preprint arXiv:1607.08022, 2016.
[16] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An image is worth 16×16 words: transformers for image recognition at scale[J]. arXiv preprint arXiv:2010.11929, 2020.
[17] Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution[C]//Computer Vision–ECCV, 2016: 694-711.
[18] PENG L T, ZHU C L, BIAN L H. U-shape transformer for underwater image enhancement[C]//Computer Vision–ECCV, 2023: 290-307.
[19] LI C Y, GUO C L, REN W Q, et al. An underwater image enhancement benchmark dataset and beyond[J]. IEEE Transactions on Image Processing, 2020, 29: 4376-4389. DOI: 10.1109/TIP.2019.2955241
[20] Korhonen J, You J. Peak signal-to-noise ratio revisited: is simple beautiful?[C]//Fourth International Workshop on Quality of Multimedia Experience, 2012: 37-38.
[21] Horé A, Ziou D. Image quality metrics: PSNR vs. SSIM[C]//20th International Conference on Pattern Recognition, 2010: 2366-2369.
[22] YANG M, Sowmya A. An underwater color image quality evaluation metric[J]. IEEE Transactions on Image Processing, 2015, 24(12): 6062-6071. DOI: 10.1109/TIP.2015.2491020
[23] Panetta K, GAO C, Agaian S. Human-visual-system-inspired underwater image quality measures[J]. IEEE Journal of Oceanic Engineering, 2016, 41(3): 541-551. DOI: 10.1109/JOE.2015.2469915
[24] Mittal A, Soundararajan R, Bovik A. Making a "Completely Blind" image quality analyzer[J]. IEEE Signal Processing Letters, 2013, 20(3): 209-212. DOI: 10.1109/LSP.2012.2227726