LU2Net: A Lightweight Network for Real-time
Underwater Image Enhancement

Haodong Yang, Jisheng Xu, Zhiliang Lin and Jian** He Haodong Yang is with the Department of Computer Science, Shanghai Jiao Tong University. Jisheng Xu, Zhiliang Lin and Jian** He are with the Department of Automation, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, and Shanghai Engineering Research Center of Intelligent Control and Management, Shanghai 200240, China. Zhiliang Lin is with the School of Ocean and Civil Engineering, Shanghai Jiao Tong University, State Key Laboratory of Ocean Engineering, Shanghai 200240, China. Emails: {yanghaodong, Jimmy_xu, linzhiliang, jphe}@sjtu.edu.cn.

Abstract

Computer vision techniques have empowered underwater robots to effectively undertake a multitude of tasks, including object tracking and path planning. However, underwater optical factors like light refraction and absorption present challenges to underwater vision, which cause degradation of underwater images. A variety of underwater image enhancement methods have been proposed to improve the effectiveness of underwater vision perception. Nevertheless, for real-time vision tasks on underwater robots, it is necessary to overcome the challenges associated with algorithmic efficiency and real-time capabilities. In this paper, we introduce Lightweight Underwater Unet (LU2Net), a novel U-shape network designed specifically for real-time enhancement of underwater images. The proposed model incorporates axial depthwise convolution and the channel attention module, enabling it to significantly reduce computational demands and model parameters, thereby improving processing speed. The extensive experiments conducted on the dataset and real-world underwater robots demonstrate the exceptional performance and speed of proposed model. It is capable of providing well-enhanced underwater images at a speed 8 times faster than the current state-of-the-art underwater image enhancement method. Moreover, LU2Net is able to handle real-time underwater video enhancement.

I INTRODUCTION

With the remarkable progress of computer vision in recent years, underwater vision is emerging as a prevalent and vital means of gathering information for underwater robots. A variety of vision models have been integrated into underwater robots for marine tasks, like object detection [1], monocular depth estimation [2] and visual odometry [3]. Real-time and high-quality underwater images provide essential information for robotic decision making and improve the safety and stability of underwater robots.

Though computer vision technologies strengthen the perception abilities of underwater robots, low-quality underwater images degrade the performance of marine robotic tasks. In contrast to the relatively stable onshore environment, the underwater environment is characterized by complexity and variability. Greater absorption of red light and scattering of suspended particles in water lead to imbalanced color channels and inconsistent blurriness in images [4]. Additionally, light sources may be insufficient in deep water, limiting the visibility of target objects and surroundings. The changeable underwater environment hugely affects the quality of underwater images. Thus, the computer vision models designed for onshore clear images and videos may suffer from decrease in performance when applied in underwater robots for marine vision tasks.

To restore degraded underwater images, researchers have proposed a variety of underwater image enhancement (UIE) methods. Traditional UIE Methods that are based on visual and physical prior [5, 6, 7, 8, 9, 10, 11, 12] are usually unable to adapt to various underwater environments and result in inappropriate enhancement. Deep neural networks for underwater image enhancement has been intensively investigated. UIEC^2-Net [13] integrates RGB and HSV color spaces in one single CNN, utilizing different image properties. PUGAN [14] estimates physical parameters to guide image enhancement in neural networks. Recently, [15] introduced transformer into UIE. However, the above-mentioned learning-based methods often focus on visual effects rather than processing speed. For underwater robots performing practical marine tasks, it is necessary to provide fast image enhancement.

To address the aforementioned issue, in this paper, we propose Lightweight Underwater UNet (LU2Net) for real-time underwater images enhancement. Axial depthwise convolution is embedded into LU2Net, which contributes to larger receptive fields. Thus more details are perceived with less convolution layers. Besides, channel attention module is integrated to adaptively adjust channel weights and mitigate inconsistent attenuation among channels. The experiments on the dataset and real-world robots demonstrate that our model outperforms state-of-the-art methods in both quantitative metrics and processing speed. Our main contributions can be summarized as:

Refer to caption — Figure 1: The illustration of LU2Net structure. Specially designed encoder and decoder blocks enable larger receptive fields and the adaptive adjustment of channel weights. Skip connection ensures the full utilization of multi-stage information.

$\bullet$

We propose a novel U-shape network, LU2Net, that integrates a lightweight convolution and adaptive channel weights for real-time underwater image enhancement. This lightweight U-shape model enables underwater robots to obtain high-quality and real-time images for marine vision tasks.
$\bullet$

Our specially designed block structure is embedded with channel-wise attention and axial convolution. This simple and fast block structure mitigates inconsistent attenuation in color channels and achieves large receptive fields, consuming a small amount of time and resources.
$\bullet$

Our model demonstrates significant advantages through experiments on a dataset and practical testing on underwater robots. Maintaining better performance, our model achieves 8 times faster speed than state-of-the-art model, showing LU2Net’s ability on videos.

II Related Works

II-A UIE Methods

Various methods have been proposed to enhance recorded single underwater images, which can be divided into the following three categories.

i) Visual prior methods: These approaches focus on adjusting pixel values to improve image quality. Early methods include histogram equalization [16], which transforms the image histogram from a narrow unimodal histogram to a balanced distribution histogram. Image fusion [17, 18] takes enhanced images by different algorithms as input and generate results by fusing them. Retinex-based methods [8] obtain the actual appearance of the object by estimating and removing the illumination light in the environment. More recently, morphology-based techniques [6, 7] have been introduced to underwater image enhancement. Visual prior methods inherently improve contrast and saturation in underwater images. However, they may lead to over-enhancement when applied to real-world underwater images.

ii) Physics-based methods: These methods utilize prior knowledge of the physics of underwater imaging to guide image enhancement. For example, UDCP algorithm [11] is based on the fact that most non-sky pixels in haze-free images possess at least one color channel with significantly low intensity. By estimating the weights of different channels through less attenuated blue and green channels, UDCP demonstrates effectiveness in processing deep-sea images. Another approach, RDCP [12], leverages the information of fastly decayed red light and considers both natural light and artificial light to achieve good enhancement results. However, as these methods focus on physical models and do not consider human visual perception, the visual quality of the enhanced images may be compromised.

iii) Deep learning methods: The emergence of deep learning has sparked interest in data-driven underwater image enhancement. CNN-based methods learn the map** from original underwater images to high-quality images. UWCNN [19] introduced a simple CNN architecture with stacked convolution layers for UIE tasks. UIEC^2-Net [13] utilizes image properties in various color spaces. Besides, a variety of GAN-based models have been proposed, which usually take non-image information for enhancement. WaterGAN [20] estimates and utilizes depth information. PUGAN [14] combines the physical model of underwater imaging and the generative adversarial network. Recently, transformer-based methods [15] are also introduced into UIE. Though pleasing visual effect is achieved, the importance of speed and net size is frequently ignored. Current deep learning methods often suffer from relatively slow processing speed and a reliance on high-end graphics processing hardware. This hinders their applicability for real-time vision tasks on robotic platforms with limited computing capabilities.

II-B Light Weight U-Net

U-Net employs a U-shape network with encoder-decoder architecture for biomedical image segmentation [21]. After the emergence of U-Net, the U-shape architecture has been widely adopted in various vision tasks. For instance, it has proven effective for image enhancement. Peng et al. [15] introduced U-shape structure into UIE task.

After the success of U-Net, researchers have explored methods to streamline U-shape networks, aiming to reduce computational complexity and improve efficiency. For example, Unext proposed in [22] combines a multi-layer perceptron with U-Net, resulting in less convolution layers and network parameters. In [23], U-Lite leverages depth separable convolution, significantly reducing the parameter count while maintaining high performance. Motivated from these works, in this paper, we utilize the idea from U-Lite for reduction of network size and complexity.

III METHODOLOGY

In this section, we introduce the structure of LU2Net. As illustrated in Fig. 1, U-shape network structure is utilized for multi-scale information extraction. A lightweight convolution architecture, axial depthwise convolution [23], is leveraged to achieve larger receptive fields with less layers and parameters. Besides, the channel attention module proposed in [24] is embedded in LU2Net, which adaptively adjust channel weights and effectively correct color distortion.

With these modules, novel encoder and decoder block structures are designed for improved performance and less parameters and shortened processing time. The novel block structures allow for reduced size and reinforced speed of our model while maintaining significant enhancement performance, which is illustrated in Section \@slowromancapiv@.

III-A U-Shape Network Architecture

As is shown in Fig. 1, LU2Net proposed in this paper adopts a streamlined U-shape network design, achieving high performance without excessive block stacking. Our U-shape network architecture consists of an classic encoder-decoder structure. Stacked encoder and decoder blocks provide a holistic view of multi-scale features. The skip connections in the network facilitate the transfer of information from earlier encoder stages to later decoder stages, improving the performance by incorporating multi-stage information.

Different from traditional U-Net [21], whose blocks are constructed by stacked convolution layers, the proposed model introduces novel encoder and decoder block structures. The blocks incorporate axial depthwise convolution, enabling larger receptive fields. Then, less convolution layers are required in our blocks. Channel attention module in blocks provide adaptive adjustment of channel weights, which provides a light and fast solution for underwater color distortion. Utilizing this block structures, LU2Net achieves state-of-the-art enhancement performance and outstanding processing speed with a small number of blocks.

III-B Axial Depthwise Convolution Module

Axial depthwise convolution is integrated in our model for higher performance while reducing parameters and computational complexity. The axial depthwise convolution combines depthwise separable convolution [25] and axial feature extraction, resulting in a novel convolution variant[23]. By replacing common convolution layers with axial depthwise convolution, the amount of parameters in blocks largely decreases while the performance is improved.

By transforming traditional square convolution kernels into axial variants, the receptive fields of axial depthwise convolution expand while reducing the number of parameters. Fig. 2 showcases the larger receptive fields of axial depthwise convolution compared to traditional convolution. This modification allows for leveraging more precise details from the input image, leading to superior performance with fewer parameters.

Axial depthwise convolution consists of two stages, as is shown in Fig. 3. Depthwise convolution keeps the depth of the output unchanged. Each pair of horizontal and vertical convolution kernels is individually applied to each input channel. Then, pointwise convolution is carried out, convolving across all input channels simultaneously. By dividing single convolution operation into two parts, axial depthwise convolution enables a massive reduction of parameters.

Input: A 3-dimension tensor

In

l

channels

Output: A 3-dimension tensor

Out

l

channels

2for $i\leftarrow 0$ to $l$ do

Hconv\left[i\right]

= Conv(

In\left[i\right]

horizontal=True

);

Vconv\left[i\right]

= Conv(

In\left[i\right]

vertical=True

);

DWconv[i]

Hconv\left[i\right]

Lconv\left[i\right]

In[i]

;

Out

= Conv(

DWconv

all=True

)

Algorithm 1 Axial Depthwise Convolution

III-C Channel Attention Module

The channel attention module [24] focuses on dynamically adjusting the weights of each channel based on its features. This module offers an effective solution for channel-specific degradation in underwater images. By incorporating the channel attention module, our model can learn the significance of different channels and assign higher weights to the most informative ones. Thus, we can reduce the inconsistent attenuation by addressing the varying degradation levels among channels.

The channel attention module comprises three main components. Firstly, global average pooling is performed on each channel, reducing the two-dimensional features to a single scalar weight value for each channel. Next, through an multi-layer perceptron (MLP) or a similar MLP-like architecture, the excitation process modifies the original weights to generate channel-wise weights. Finally, element-wise multiplication is applied between each channel and its corresponding weight, resulting in the final output. The process is represented in Fig. 4.

Input: A 3-dimension tensor

In

l

channels

Output: A 3-dimension adjusted tensor

Out

l

channels

2for $i\leftarrow 0$ to $l$ do

w_{0}\left[i\right]

= avgPool(

In\left[i\right]

);

w

= MLP(

w_{0}

);

6 for $i\leftarrow 0$ to $l$ do

Out\left[i\right]

w\left[i\right]\times In[i]

;

Algorithm 2 CALayer

III-D Loss Function Design

To attain a more comprehensive sense of image information and achieve a more pleasing visual effect, we design a loss function expressed as follows:

L_{total}=l_{RGB}+l_{LAB}+l_{LCH}+l_{SSIM}+l_{VGG}.

(1)

$l_{RGB}$ , $l_{LAB}$ , $l_{LCH}$ are mean square error in RGB, LAB and LCH color spaces. $l_{SSIM}$ is the SSIM loss. $l_{VGG}$ is the VGG loss. SSIM (Structural Similarity Index) evaluates the similarity of luminance, contrast and structure between output image and the ground truth. VGG loss compares the high-level features of two images after the process of the pretrained VGG network.

In the case of underwater image enhancement, where datasets are relatively small, it is essential to incorporate comprehensive image information from various perspectives to achieve superior performance. Therefore, our proposed loss function (1) leverages multiple image properties to encompass a broader range of image information, resulting in improved performance.

SSIM is a widely adopted metric for evaluating image quality, where a higher SSIM score typically indicates superior image quality. In line with the experimental findings reported in [15], it was observed that the RGB, LAB, and LCH color spaces yield the highest SSIM scores when applied to underwater images. Therefore, we incorporate these three color spaces and the SSIM loss into our loss function, incorporating multiply image characteristics.

Furthermore, to achieve a more perceptually pleasing enhancement result, we employ the VGG loss. The application of VGG loss ensures more satisfactory visual perceptual effects of enhanced images.

IV EXPERIMENTS

In this section, we begin by providing an overview of the experimental settings about the processing hardware and the underwater robotic platform on which the real-world experiments are conducted. Following this, training specifics for our Lightweight Underwater UNet model are presented. Then, training experiments conducted on a dataset compare the enhancement performance and processing speed of our model with other state-of-the-art models. Moreover, our model is integrated into the underwater remotely operated vehicle (ROV) to test the visual effects of enhanced images and processing speed in real-world underwater tasks.

IV-A Experimental Settings

Our experiments are conducted on an RTX 3060 laptop. Besides, the extensive experiment of LU2Net is carried out on an i7-10750H processor to test the speed without a graphics processing unit.

Fig. 5 illustrates the structure of the underwater ROV for real-world experiments and the experimental environment. Our underwater ROV is equipped with a water-proof camera for visual perception, which can output videos of 640 $\times$ 480 resolution at the speed of 30 frames per second (fps). The artificial light source provides visible light in poor lightning conditions. Thrusters attached to the body empower our ROV with motion ability.

IV-B Training Details

We use a comprehensive underwater image enhancement dataset [15], LSUI, which consists of 5000 pairs of original input images and corresponding ground truth images. This dataset significantly surpasses previous datasets in terms of size, making it an ideal choice for our experiments.

For the implementation of our model, we utilize the Python programming language and the PyTorch framework. Throughout the entire training process, which spans 150 epochs, we employ the Adam optimization algorithm. The initial learning rate is set to 0.0005, and every 40 epochs, the learning rate is reduced by 20 percent.

IV-C Model Evaluation

As is customary, we partition the LSUI dataset into a training dataset and a testing dataset, following an 8:2 split. The images in the dataset undergo resizing to dimensions of $256\times 256\times 3$ , and normalization is performed to ensure pixel values fall within the range of $[-1,1]$ .

To evaluate the enhancement performance and processing efficiency of our proposed method, we compare it with recent state-of-the-art UIE methods. Each model undergoes training on the designated training dataset for a total of 150 epochs.

During the comparison, we assess various metrics, including image quality, processing time, and parameter count, to provide a comprehensive evaluation of different methods.

IV-C1 Enhancement Performance Evaluation

MLLE serves as an example of a non-deep-learning approach for underwater image enhancement, while U-Trans stands out as a recently proposed deep learning method that achieves superior results. Additionally, UWCNN and WaterNet are notable examples of previous state-of-the-art UIE models.

MLLE inherits the limitations associated with traditional UIE methods, including color distortion and an inability to handle diverse underwater color and lighting conditions. In contrast, both U-Trans and our proposed model generate visually appealing enhanced images. These enhanced images enable improved identification of underwater objects and the surrounding environment, thereby facilitating higher-level tasks such as object tracking, object detection, and obstacle avoidance.

From the comparative analysis presented in Table. I and the illustration of enhanced underwater images in Fig. 6, it is evident that our model attains a state-of-the-art level of enhancement performance when compared to the selected methods. The evaluation metrics used include PSNR, SSIM and subjective visual quality assessment. Our model outperforms the other methods in terms of these metrics, indicating its superiority in enhancing underwater images. This achievement can be attributed to the effective utilization of advanced deep learning techniques and the incorporation of domain-specific knowledge in our model’s architecture.

TABLE I: Comparisons of enhancement performance

Method	PSNR	SSIM
MLLE	18.133	0.730
WaterNet	19.517	0.813
UWCNN	22.621	0.825
U-Trans	25.032	0.843
Ours	25.549	0.868

IV-C2 Model Speed and Parameters Comparison

The experimental results demonstrate that our model achieves a remarkable processing speed. Utilizing a common RTX 3060 laptop, our model is capable of outputting videos at 100 fps, thereby meeting the real-time demand of underwater tasks. Additionally, without a GPU, our model can still generate enhanced images at the speed of 12 fps on an i7-10750H processor.

Moreover, the experiment results exhibits the lowest demand for computational resources of our model, as indicated by the lowest FLOPs (Floating Point Operations) requirement. Furthermore, with a relatively small number of parameters, our model does not necessitate a large amount of memory. Consequently, our model is lightweight enough to be deployed on underwater robotic platforms while achieving impressive enhancement results. As a low-level underwater image enhancement solution, our model preserves ample computational resources for other tasks and coexists harmoniously with high-level vision models.

TABLE II: Comparisons of processing speed and parameters

Method	FLOPs	Parameters	Time/Frame
MLLE	/	/	0.163s
WaterNet	143.3G	1.0M	0.475s
UWCNN	5.2G	40.0K	0.005s
U-Trans	60.2G	22.8M	0.08s
Ours	2.8G	176K	0.01s

IV-D Real-world Test

To test the performance of our model on real-world robots, we conduct a real-time underwater image enhancement experiment on our underwater vision-driven ROV. Due to the limitation of processing speed of different models, we only compare the performance of UWCNN and our model that can handle real-time enhancement tasks. In absence of ground truth, PSNR and SSIM are not suitable for the assessment of real-world underwater image enhancement. Instead, we choose UCIQE [26] as the alternative metric, which combines chroma, saturation, and contrast for image quantification. Commonly, higher UCIQE value indicates better image quality.

Real-world experiments shown in Fig. 7 prove the ability of LU2Net to provide high-quality enhanced underwater videos. The output video of 30 fps by the camera is immediately enhanced without noticeable delay, showcasing the potential of our model for real-time marine vision tasks.

From the illustrated experiment results, our model achieves a significant visual effect and outputs balanced and detailed images from a subjective aspect. Moreover, higher UCIQE values further prove the significant performance of LU2Net. With our model, human operators can clearly identify underwater objects and environments through well-enhanced videos and make proper decisions for better robotic safety and performance.

V CONCLUSIONS

In this paper, we proposed a lightweight underwater image enhancement model that demonstrated superior capabilities in effectively removing image distortion and providing real-time enhancement. Our model combined axial depthwise convolution, the channel attention module, and the U-Shape net structure, resulting in the lightweight net structure and significantly fast processing speed. The experiments conducted on the dataset and real-world robots confirmed the model’s low computational requirements and efficient resource utilization, which enabled the proposed model to handle real-time image and video enhancement. This advantage enhanced our model’s suitability for integration into underwater robots for real-time vision tasks.

References

[1] D. R. Yoerger, A. F. Govindarajan, J. C. Howland, J. K. Llopiz, P. H. Wiebe, M. Curran, J. Fujii, D. Gomez-Ibanez, K. Katija, B. H. Robison et al., “A hybrid underwater robot for multidisciplinary investigation of the ocean twilight zone,” Science Robotics, vol. 6, no. 55, p. eabe1901, 2021.
[2] B. Yu, J. Wu, and M. J. Islam, “Udepth: Fast monocular depth estimation for visually-guided underwater robots,” in Proceedings of IEEE International Conference on Robotics and Automation, 2023, pp. 3116–3123.
[3] A. Bucci, L. Zacchini, M. Franchi, A. Ridolfi, and B. Allotta, “Comparison of feature detection and outlier removal strategies in a mono visual odometry algorithm for underwater navigation,” Applied Ocean Research, vol. 118, p. 102961, 2022.
[4] J. S. Jaffe, “Computer modeling and the design of optimal underwater imaging systems,” IEEE Journal of Oceanic Engineering, vol. 15, no. 2, pp. 101–111, 1990.
[5] H. S. Lee, S. W. Moon, and I. K. Eom, “Underwater image enhancement using successive color correction and superpixel dark channel prior,” Symmetry, vol. 12, no. 8, 2020.
[6] S.-B. Gao, M. Zhang, Q. Zhao, X.-S. Zhang, and Y.-J. Li, “Underwater image enhancement using adaptive retinal mechanisms,” IEEE Transactions on Image Processing, vol. 28, no. 11, pp. 5580–5595, 2019.
[7] J. Yuan, W. Cao, Z. Cai, and B. Su, “An underwater image vision enhancement algorithm based on contour bougie morphology,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 10, pp. 8117–8128, 2020.
[8] S. Zhang, T. Wang, J. Dong, and H. Yu, “Underwater image enhancement via extended multi-scale retinex,” Neurocomputing, vol. 245, pp. 1–9, 2017.
[9] W. Zhang, P. Zhuang, H.-H. Sun, G. Li, S. Kwong, and C. Li, “Underwater image enhancement via minimal color loss and locally adaptive contrast enhancement,” IEEE Transactions on Image Processing, vol. 31, pp. 3997–4010, 2022.
[10] M. Yang, A. Sowmya, Z. Wei, and B. Zheng, “Offshore underwater image restoration using reflection-decomposition-based transmission map estimation,” IEEE Journal of Oceanic Engineering, vol. 45, no. 2, pp. 521–533, 2020.
[11] P. Drews, E. Nascimento, F. Moraes, S. Botelho, and M. Campos, “Transmission estimation in underwater single images,” in Proceedings of the IEEE International Conference on Computer Vision Workshops, 2013, pp. 825–830.
[12] A. Galdran, D. Pardo, A. Picón, and A. Alvarez-Gila, “Automatic red-channel underwater image restoration,” Journal of Visual Communication and Image Representation, vol. 26, pp. 132–145, 2015.
[13] Y. Wang, J. Guo, H. Gao, and H. Yue, “Uiec^ 2-net: Cnn-based underwater image enhancement using two color space,” Signal Processing: Image Communication, vol. 96, p. 116250, 2021.
[14] R. Cong, W. Yang, W. Zhang, C. Li, C.-L. Guo, Q. Huang, and S. Kwong, “Pugan: Physical model-guided underwater image enhancement using gan with dual-discriminators,” IEEE Transactions on Image Processing, vol. 32, pp. 4472–4485, 2023.
[15] L. Peng, C. Zhu, and L. Bian, “U-shape transformer for underwater image enhancement,” IEEE Transactions on Image Processing, 2023.
[16] M. S. Hitam, E. A. Awalludin, W. N. J. H. W. Yussof, and Z. Bachok, “Mixture contrast limited adaptive histogram equalization for underwater image enhancement,” in Proceedings of International Conference on Computer Applications Technology, 2013, pp. 1–5.
[17] C. Ancuti, C. O. Ancuti, T. Haber, and P. Bekaert, “Enhancing underwater images and videos by fusion,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 81–88.
[18] C. O. Ancuti, C. Ancuti, C. De Vleeschouwer, and P. Bekaert, “Color balance and fusion for underwater image enhancement,” IEEE Transactions on Image Processing, vol. 27, no. 1, pp. 379–393, 2017.
[19] C. Li, S. Anwar, and F. Porikli, “Underwater scene prior inspired deep underwater image and video enhancement,” Pattern Recognition, vol. 98, p. 107038, 2020.
[20] J. Li, K. A. Skinner, R. M. Eustice, and M. Johnson-Roberson, “Watergan: Unsupervised generative network to enable real-time color correction of monocular underwater images,” IEEE Robotics and Automation letters, vol. 3, no. 1, pp. 387–394, 2017.
[21] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Proceedings of Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
[22] J. M. J. Valanarasu and V. M. Patel, “Unext: Mlp-based rapid medical image segmentation network,” in Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2022, pp. 23–33.
[23] B.-D. Dinh, T.-T. Nguyen, T.-T. Tran, and V.-T. Pham, “1m parameters are enough? a lightweight cnn-based model for medical image segmentation,” in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023, pp. 1279–1284.
[24] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132–7141.
[25] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, July 2017.
[26] M. Yang and A. Sowmya, “An underwater color image quality evaluation metric,” IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 6062–6071, 2015.

LU2Net: A Lightweight Network for Real-time Underwater Image Enhancement

Abstract

I INTRODUCTION

II Related Works

II-A UIE Methods

II-B Light Weight U-Net

III METHODOLOGY

III-A U-Shape Network Architecture

III-B Axial Depthwise Convolution Module

III-C Channel Attention Module

III-D Loss Function Design

IV EXPERIMENTS

IV-A Experimental Settings

IV-B Training Details

IV-C Model Evaluation

IV-C1 Enhancement Performance Evaluation

IV-C2 Model Speed and Parameters Comparison

IV-D Real-world Test

V CONCLUSIONS

References

LU2Net: A Lightweight Network for Real-time
Underwater Image Enhancement