Underwater Image Enhancement Algorithm Combining Transformer and Generative Adversarial Network

YUAN Hongchun, ZHANG Bo, CHENG Xin

Citation: YUAN Hongchun, ZHANG Bo, CHENG Xin. Underwater Image Enhancement Algorithm Combining Transformer and Generative Adversarial Network[J]. Infrared Technology, 2024, 46(9): 975-983.


Funding: National Natural Science Foundation of China (41776142)

    Author biography:

    YUAN Hongchun (1971-), male, Ph.D., professor. Research interest: intelligent information processing.

    Corresponding author:

    ZHANG Bo (1996-), male, M.S. candidate. Research interest: underwater image enhancement. E-mail: 806232525@qq.com

  • CLC number: TP391.4



    Abstract:

    Owing to the diversity of underwater environments and the scattering and selective absorption of light in water, acquired underwater images usually suffer from severe quality degradation, such as color deviation, low clarity, and low brightness. To solve these problems, an underwater image enhancement algorithm that combines a Transformer and a generative adversarial network is proposed. Based on the generative adversarial network architecture, a generative adversarial network with Transformer (TGAN) enhancement model is constructed by combining an encoder-decoder structure, a global feature modeling Transformer module based on spatial self-attention, and a channel-level multi-scale feature fusion Transformer module. The model focuses on the color channels and spatial regions that suffer the most severe attenuation in underwater images, effectively enhancing image details and correcting color deviation. Additionally, a multi-term loss function combining the RGB and LAB color spaces is designed to constrain the adversarial training of the enhancement model. Experimental results demonstrate that, compared with typical underwater image enhancement algorithms such as contrast-limited adaptive histogram equalization (CLAHE), underwater dark channel prior (UDCP), underwater convolutional neural network (UWCNN), and fast underwater image enhancement for improved visual perception (FUnIE-GAN), the proposed algorithm improves the clarity, detail texture, and color rendition of underwater images. Specifically, the average values of the objective evaluation metrics, including peak signal-to-noise ratio, structural similarity, and underwater image quality measure, improve by 5.8%, 1.8%, and 3.6%, respectively. The proposed algorithm effectively improves the visual perception of underwater images.
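The multi-term loss combining the RGB and LAB color spaces can be sketched in NumPy. The paper's exact formulation is not reproduced here: the weight `lam` and the plain L1 distance terms are illustrative assumptions, and a real training loop would also include the adversarial and perceptual terms.

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert an sRGB image in [0, 1] to CIELAB (D65 white point)."""
    # Linearize the sRGB gamma curve
    rgb = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # Linear sRGB -> XYZ (D65 primaries)
    m = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = rgb @ m.T
    # Normalize by the D65 reference white
    xyz = xyz / np.array([0.95047, 1.0, 1.08883])
    f = np.where(xyz > 0.008856, np.cbrt(xyz), 7.787 * xyz + 16.0 / 116.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def rgb_lab_loss(pred, target, lam=0.5):
    """L1 loss in RGB plus a weighted L1 loss in LAB (lam is a hypothetical weight)."""
    loss_rgb = np.abs(pred - target).mean()
    loss_lab = np.abs(rgb_to_lab(pred) - rgb_to_lab(target)).mean()
    return loss_rgb + lam * loss_lab
```

Constraining the network in LAB as well as RGB penalizes luminance (L) and chromaticity (a, b) errors explicitly, which is the stated motivation for the multi-term loss.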

  • Figure 1.  Generative adversarial network structure

    Figure 2.  Single-layer Transformer model

    Figure 3.  TGAN network structure

    Figure 4.  GFMT module structure

    Figure 5.  Detailed structure of the MSFFT module

    Figure 6.  Qualitative comparison of ablation experiments: (a) underwater images; (b) BL; (c) BL+GFMT; (d) BL+MSFFT; (e) BL+LossLAB; (f) TGAN; (g) reference images

    Figure 7.  Qualitative comparison of different methods on Test-1: (a) underwater images; (b) CLAHE; (c) RGHS; (d) UDCP; (e) IBLA; (f) UWCNN; (g) FUnIE-GAN; (h) DGD-cGAN; (i) ours; (j) reference images

    Figure 8.  Qualitative comparison of different methods on Test-2: (a) underwater images; (b) CLAHE; (c) RGHS; (d) UDCP; (e) IBLA; (f) UWCNN; (g) FUnIE-GAN; (h) DGD-cGAN; (i) ours

    Table 1   Experimental results of the ablation study on Test-1

    Models       PSNR/dB   SSIM
    BL           19.2556   0.7014
    BL+GFMT      21.6849   0.7635
    BL+MSFFT     22.3719   0.7813
    BL+LossLAB   21.4161   0.7281
    TGAN         24.0546   0.8257
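The PSNR column above follows the standard definition of peak signal-to-noise ratio; a minimal reference sketch, assuming images scaled to [0, 1]:

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```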

    Table 2   Quantitative comparison of different methods on Test-1

    Methods      PSNR/dB   SSIM
    CLAHE        18.4342   0.7653
    RGHS         18.2053   0.7672
    UDCP         14.0555   0.5650
    IBLA         19.9222   0.7487
    UWCNN        18.1209   0.7420
    FUnIE-GAN    22.7413   0.8112
    DGD-cGAN     17.3954   0.6955
    TGAN         24.0546   0.8257

    Table 3   Quantitative comparison of different methods on Test-2 (higher UCIQE/UIQM and lower NIQE indicate better quality)

    Methods      UCIQE    UIQM     NIQE
    CLAHE        0.4516   3.1570   6.5814
    RGHS         0.4673   2.4674   6.4705
    UDCP         0.4216   2.0992   5.7852
    IBLA         0.4731   2.3331   5.7619
    UWCNN        0.3508   3.0378   6.7935
    FUnIE-GAN    0.4314   3.0997   6.2796
    DGD-cGAN     0.3689   3.1810   7.2689
    TGAN         0.4846   3.2963   5.7743
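The relative gains quoted in the abstract (5.8%, 1.8%, 3.6%) are consistent with comparing TGAN against the strongest baseline per metric (FUnIE-GAN for PSNR and SSIM in Table 2, DGD-cGAN for UIQM in Table 3); a quick arithmetic check:

```python
# TGAN scores and the best competing score for each metric, taken from Tables 2 and 3.
tgan = {"PSNR": 24.0546, "SSIM": 0.8257, "UIQM": 3.2963}
best_baseline = {"PSNR": 22.7413, "SSIM": 0.8112, "UIQM": 3.1810}

for metric in tgan:
    gain = (tgan[metric] / best_baseline[metric] - 1.0) * 100.0
    print(f"{metric}: +{gain:.1f}%")  # prints +5.8%, +1.8%, +3.6%
```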
    References
    [1] YANG M, HU J T, LI C Y, et al. An in-depth survey of underwater image enhancement and restoration[J]. IEEE Access, 2019, 7: 123638-123657. DOI: 10.1109/ACCESS.2019.2932611
    [2] ANWAR S, LI C Y. Diving deeper into underwater image enhancement: a survey[J]. Signal Processing: Image Communication, 2020, 89: 115978. DOI: 10.1016/j.image.2020.115978
    [3] ISLAM M J, XIA Y, SATTAR J. Fast underwater image enhancement for improved visual perception[J]. IEEE Robotics and Automation Letters, 2020, 5: 3227-3234. DOI: 10.1109/LRA.2020.2974710
    [4] JIN W P, GUO J C, QI Q. Underwater image enhancement based on conditional generative adversarial network[J]. Laser & Optoelectronics Progress, 2020, 57(14): 141002.
    [5] HITAM M S, AWALLUDIN E A, YUSSOF W N J H W, et al. Mixture contrast limited adaptive histogram equalization for underwater image enhancement[C]//International Conference on Computer Applications Technology (ICCAT), 2013: 1-5.
    [6] HUANG D M, WANG Y, SONG W, et al. Shallow-water image enhancement using relative global histogram stretching based on adaptive parameter acquisition[C]//24th International Conference on MultiMedia Modeling (MMM), 2018: 453-465.
    [7] DREWS P, NASCIMENTO E, MORAES F, et al. Transmission estimation in underwater single images[C]//IEEE International Conference on Computer Vision Workshops (ICCVW), 2013: 825-830.
    [8] PENG Y T, COSMAN P C. Underwater image restoration based on image blurriness and light absorption[J]. IEEE Transactions on Image Processing, 2017, 26(4): 1579-1594. DOI: 10.1109/TIP.2017.2663846
    [9] LI C Y, ANWAR S, PORIKLI F. Underwater scene prior inspired deep underwater image and video enhancement[J]. Pattern Recognition, 2020, 98: 107038. DOI: 10.1016/j.patcog.2019.107038
    [10] GONZALEZ-SABBAGH S, ROBLES-KELLY A, GAO S. DGD-cGAN: a dual generator for image dewatering and restoration[J]. arXiv preprint arXiv:2211.10026, 2022.
    [11] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems, 2017, 30: 5998-6008.
    [12] LI C, WAND M. Precomputed real-time texture synthesis with Markovian generative adversarial networks[C]//Computer Vision - ECCV, 2016: 702-716.
    [13] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778.
    [14] WANG H, CAO P, WANG J, et al. UCTransNet: rethinking the skip connections in U-Net from a channel-wise perspective with transformer[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(3): 2441-2449.
    [15] ULYANOV D, VEDALDI A, LEMPITSKY V. Instance normalization: the missing ingredient for fast stylization[J]. arXiv preprint arXiv:1607.08022, 2016.
    [16] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[J]. arXiv preprint arXiv:2010.11929, 2020.
    [17] JOHNSON J, ALAHI A, FEI-FEI L. Perceptual losses for real-time style transfer and super-resolution[C]//Computer Vision - ECCV, 2016: 694-711.
    [18] PENG L T, ZHU C L, BIAN L H. U-shape transformer for underwater image enhancement[C]//Computer Vision - ECCV, 2023: 290-307.
    [19] LI C Y, GUO C L, REN W Q, et al. An underwater image enhancement benchmark dataset and beyond[J]. IEEE Transactions on Image Processing, 2020, 29: 4376-4389. DOI: 10.1109/TIP.2019.2955241
    [20] KORHONEN J, YOU J. Peak signal-to-noise ratio revisited: is simple beautiful?[C]//Fourth International Workshop on Quality of Multimedia Experience, 2012: 37-38.
    [21] HORÉ A, ZIOU D. Image quality metrics: PSNR vs. SSIM[C]//20th International Conference on Pattern Recognition, 2010: 2366-2369.
    [22] YANG M, SOWMYA A. An underwater color image quality evaluation metric[J]. IEEE Transactions on Image Processing, 2015, 24(12): 6062-6071. DOI: 10.1109/TIP.2015.2491020
    [23] PANETTA K, GAO C, AGAIAN S. Human-visual-system-inspired underwater image quality measures[J]. IEEE Journal of Oceanic Engineering, 2016, 41(3): 541-551. DOI: 10.1109/JOE.2015.2469915
    [24] MITTAL A, SOUNDARARAJAN R, BOVIK A. Making a "completely blind" image quality analyzer[J]. IEEE Signal Processing Letters, 2013, 20(3): 209-212. DOI: 10.1109/LSP.2012.2227726

Publication history
  • Received: 2023-04-08
  • Revised: 2023-05-16
  • Published: 2024-09-19
