Citation: YUAN Hongchun, ZHANG Bo, CHENG Xin. Underwater Image Enhancement Algorithm Combining Transformer and Generative Adversarial Network[J]. Infrared Technology, 2024, 46(9): 975-983.
Owing to the diversity of underwater environments and the scattering and selective absorption of light in water, underwater images usually suffer from severe quality degradation, such as color deviation, low clarity, and low brightness. To address these problems, an underwater image enhancement algorithm that combines a transformer with a generative adversarial network is proposed. Building on the generative adversarial network, an enhancement model named generative adversarial network with transformer (TGAN) is constructed by combining an encoder-decoder structure, a global feature modeling transformer module based on the spatial self-attention mechanism, and a channel-level multi-scale feature fusion transformer module. The model focuses on the color channels and spatial regions in which underwater images attenuate most severely, which enhances image details and corrects the color deviation. Additionally, a multi-term loss function combining the RGB and LAB color spaces is designed to constrain the adversarial training of the enhancement model. Experimental results show that, compared with typical underwater image enhancement algorithms such as contrast-limited adaptive histogram equalization (CLAHE), underwater dark channel prior (UDCP), the underwater convolutional neural network (UWCNN), and fast underwater image enhancement for improved visual perception (FUnIE-GAN), the proposed algorithm significantly improves the clarity, detail texture, and color rendition of underwater images. Specifically, the average values of the objective evaluation metrics, namely the peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and underwater image quality measure (UIQM), improve by 5.8%, 1.8%, and 3.6%, respectively. The proposed algorithm thus effectively improves the visual perception of underwater images.
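For readers unfamiliar with the first of the three metrics, the following is a minimal sketch of how the peak signal-to-noise ratio is conventionally computed between a reference image and an enhanced image. It is not the authors' evaluation code. The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Return the peak signal-to-noise ratio in dB.

    Both images are flat sequences of pixel intensities in [0, max_value].
    The flat-list layout and the 8-bit default peak are illustrative
    assumptions, not details taken from the paper.
    """
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")

    # Mean squared error over all pixel positions.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)

    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = [10, 20, 30, 40]
    out = [11, 19, 31, 39]  # every pixel is off by one, so MSE = 1
    print(round(psnr(ref, out), 2))
```

A higher PSNR means the enhanced image is closer to the reference. The metric therefore requires a paired ground-truth image for every test image.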