AquaDiff is a diffusion-based underwater image enhancement framework designed to correct wavelength-dependent color distortion while preserving structural and perceptual fidelity. By integrating chromatic prior-guided color compensation with a conditional diffusion process and cross-attention mechanisms, AquaDiff achieves superior color correction and detail recovery across diverse underwater conditions.
Underwater images suffer from severe degradation due to wavelength-dependent light absorption and scattering, resulting in color distortion, low contrast, and loss of fine details. These artifacts significantly impair vision-based underwater applications including object detection, navigation, and 3D reconstruction. While diffusion models have shown promise in low-level vision tasks, existing approaches lack explicit mechanisms to address underwater-specific degradations.
AquaDiff addresses these challenges through a novel framework that combines (i) chromatic prior-guided color compensation preprocessing, (ii) cross-attention conditioning for dynamic feature fusion, (iii) an enhanced denoising backbone with residual dense blocks and multi-resolution attention, and (iv) a cross-domain consistency loss that jointly enforces pixel-level accuracy, perceptual similarity, structural integrity, and frequency-domain fidelity.
Here is an overview of the AquaDiff framework:

The AquaDiff framework operates through a two-stage diffusion mechanism:

- Forward Diffusion Process: Progressively corrupts clean reference images with Gaussian noise through a Markov chain, ultimately transforming them into pure noise samples.
- Reverse Diffusion Process: Learns to iteratively denoise and reconstruct clean images from noisy states, conditioned on color-compensated degraded inputs via cross-attention fusion at each timestep (see the sketch after this list).
- Chromatic Prior-Guided Color Compensation: Preprocesses degraded underwater images using 3-channel compensation (3C) in Lab color space to mitigate color distortion before diffusion conditioning. Implemented in `model/ColorChannelCompensation.py`.
- Cross-Attention Conditioning: Dynamically fuses noisy intermediate states with color-compensated conditioning images at each denoising step, enabling adaptive feature weighting based on noise levels.
- Enhanced Denoising Backbone: Employs a U-Net architecture with residual dense blocks, dense skip connections, and multi-resolution attention modules to capture both global color context and local structural details. Architecture defined in `model/networks.py` and `model/model.py`.
- Cross-Domain Consistency Loss: A novel hybrid loss function that jointly enforces pixel-level accuracy, perceptual similarity, structural integrity, and frequency-domain fidelity. Implemented in `model/losses.py` and `model/perceptual_loss.py`.
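The two stages follow the standard DDPM formulation. The sketch below illustrates forward noising and a single conditioned reverse step; the noise schedule and the `model(x_t, cond, t)` interface are assumptions for illustration, not the exact AquaDiff API.

```python
import torch

T = 2000                                     # diffusion_steps from config.yaml
betas = torch.linspace(1e-6, 1e-2, T)        # assumed linear noise schedule
alphas_cum = torch.cumprod(1.0 - betas, dim=0)

def forward_diffuse(x0, t, noise=None):
    """Forward process q(x_t | x_0): corrupt a clean reference image at timestep t."""
    noise = torch.randn_like(x0) if noise is None else noise
    a = alphas_cum[t].view(-1, 1, 1, 1)
    return a.sqrt() * x0 + (1 - a).sqrt() * noise, noise

@torch.no_grad()
def reverse_step(model, x_t, cond, t):
    """One reverse step: denoise x_t conditioned on the color-compensated image."""
    eps = model(x_t, cond, t)                # noise prediction with cross-attention fusion
    a_bar, beta = alphas_cum[t], betas[t]
    mean = (x_t - beta / (1 - a_bar).sqrt() * eps) / (1 - beta).sqrt()
    if t == 0:
        return mean
    return mean + beta.sqrt() * torch.randn_like(x_t)   # sigma_t^2 = beta_t variance choice
```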
```bash
# Clone the repository
git clone https://github.com/BRAIN-Lab-AI/AquaDiff.git
cd AquaDiff

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirement.txt
```

Download the following datasets and organize them in the `data/` directory:
- LSUI Dataset: 5,004 paired underwater images
- UIEB Dataset: 890 paired underwater images (800 for training, 90 for testing)
- Test Datasets: U45, S16, C60, and SQUID for evaluation
The data loading and preprocessing are handled by:

- `data/dataset.py`: Custom dataset classes for loading paired underwater images with various preprocessing options
- `data/util.py`: Utility functions for data augmentation, image transformations, and dataset-specific operations
Edit the `config/config.yaml` file with your model settings:

```yaml
model:
  diffusion_steps: 2000
  base_channels: 64
  channel_multipliers: [1, 2, 4, 8, 16]
  cross_attention: true
  multi_res_attention: true

training:
  batch_size: 1
  learning_rate: 3e-6
  iterations: 1000000
  loss_weights:
    l1: 1.0
    perceptual: 0.1
    ssim: 0.1
    frequency: 0.05

data:
  image_size: 256
  datasets: ["LSUI", "UIEB"]
  augmentation: true
  normalization: "imagenet"
```

```bash
# Single-image inference
python test.py --input path/to/image.jpg --output results/

# Batch inference
python test.py --input_dir data/test_images/ --output_dir results/ --timesteps 2000

# Training
python train.py --config config/config.yaml --gpu 0

# Evaluation on benchmark datasets
python eval.py --test_datasets U45 S16 C60 --metrics UIQM UCIQE
```

```
├── config/
│ └── config.yaml
├── core/
├── data/
│ ├── __pycache__/
│ ├── __init__.py
│ ├── dataset.py # Dataset classes for loading underwater image pairs
│ └── util.py # Data utilities and augmentation functions
├── model/
│ ├── __pycache__/
│ ├── ddpm_modules/
│ ├── __init__.py
│ ├── base_model.py
│ ├── ColorChannelCompensation.py
│ ├── losses.py
│ ├── model.py
│ ├── networks.py
│ └── perceptual_loss.py
├── static/
│ └── images/
│ ├── AquaDiffArchitecture.png
│ ├── AquaDiffradarbars.png
│ ├── C60.png
│ ├── S16.png
│ ├── U45.png
│ ├── U90.png
│ └── results_table.png
├── LICENSE
├── README.md
├── requirement.txt
├── eval.py
├── test.py
└── train.py
```
This module provides custom PyTorch Dataset classes for underwater image enhancement:

- `PairedUnderwaterDataset`: Loads pairs of degraded and reference underwater images
  - Supports multiple dataset formats (LSUI, UIEB, custom)
  - Configurable image cropping and resizing
  - On-the-fly preprocessing options
  - Integration with color compensation preprocessing
- `TestDataset`: Specialized dataset for evaluation on benchmark datasets (U45, S16, C60)
  - Handles unpaired test scenarios
  - Supports no-reference quality assessment
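A hypothetical usage example is shown below; the actual constructor arguments and batch format in `data/dataset.py` may differ.

```python
from torch.utils.data import DataLoader
from data.dataset import PairedUnderwaterDataset

# Constructor arguments are assumptions for illustration only.
train_set = PairedUnderwaterDataset(root="data/LSUI", image_size=256)
loader = DataLoader(train_set, batch_size=1, shuffle=True, num_workers=4)

for degraded, reference in loader:   # assumed to yield (degraded, reference) pairs
    print(degraded.shape, reference.shape)
    break
```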
Utility functions for data processing:

- Image transformations and augmentations
  - Random horizontal/vertical flips
  - Color jittering
  - Rotation and cropping
- Normalization utilities (ImageNet stats, custom underwater stats)
- Batch collation functions
- Dataset-specific preprocessing pipelines
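As an illustration of these utilities, a torchvision-based pipeline covering the listed operations might look as follows; `data/util.py` may implement them differently.

```python
from torchvision import transforms

# Example training-time augmentation and normalization pipeline (assumed values).
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),
    transforms.RandomRotation(10),
    transforms.RandomCrop(256),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],    # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])
```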
- 3-Channel Compensation (3C): Preprocesses degraded images in Lab color space to correct color casts before diffusion conditioning (`ColorChannelCompensation.py`); a minimal sketch follows this list
- Spatially-Varying Mask: Prevents overcompensation in high-brightness regions
- Physics-Informed: Based on the underwater image formation model, accounting for wavelength-dependent attenuation
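The sketch below illustrates the idea of chroma compensation in Lab space with a brightness-based mask; it is a simplified approximation, not the exact algorithm in `ColorChannelCompensation.py`.

```python
import cv2
import numpy as np

def compensate_lab(bgr, strength=1.0):
    """Shift the a/b chroma channels toward neutral, attenuated in bright regions."""
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    L, a, b = cv2.split(lab)

    # Global chroma shifts toward the neutral value 128 reduce the color cast.
    a_shift = strength * (128.0 - a.mean())
    b_shift = strength * (128.0 - b.mean())

    # Spatially-varying mask: suppress the correction where luminance is high
    # to prevent overcompensation in bright regions.
    mask = 1.0 - np.clip((L / 255.0 - 0.8) / 0.2, 0.0, 1.0)

    a = np.clip(a + a_shift * mask, 0, 255)
    b = np.clip(b + b_shift * mask, 0, 255)
    out = cv2.merge([L, a, b]).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)
```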
- Dynamic Feature Fusion: Adaptively weights conditioning information based on noise levels at each timestep
- Timestep-Aware Guidance: Enables model to leverage structural and chromatic cues progressively during denoising
- Improved Color Correction: Cross-attention formulation enables effective fusion of degraded and noisy representations
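A minimal sketch of such a fusion block is given below, with the noisy-state features as queries and the compensated conditioning features as keys/values; dimensions and layer choices are assumptions, not the exact AquaDiff modules.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse noisy-state features (queries) with conditioning features (keys/values)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, noisy_feat, cond_feat):
        # (B, C, H, W) feature maps -> (B, H*W, C) token sequences
        B, C, H, W = noisy_feat.shape
        q = noisy_feat.flatten(2).transpose(1, 2)
        kv = cond_feat.flatten(2).transpose(1, 2)
        fused, _ = self.attn(self.norm(q), kv, kv)
        out = q + fused                                  # residual connection
        return out.transpose(1, 2).reshape(B, C, H, W)
```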
- Residual Dense Blocks: Incorporate dense connections within each block for richer feature extraction (`networks.py`); see the sketch after this list
- Multi-Resolution Attention: Establishes long-range dependencies at multiple resolutions for global color consistency
- Dense Skip Connections: Facilitate better feature fusion between encoder and decoder pathways
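Below is an illustrative residual dense block with dense intra-block connections; the growth rate and layer count are assumptions, not the exact blocks in `networks.py`.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Each conv sees the concatenation of all previous features (dense connections)."""
    def __init__(self, channels, growth=32, layers=4):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                nn.ReLU(inplace=True),
            )
            for i in range(layers)
        )
        self.fuse = nn.Conv2d(channels + layers * growth, channels, 1)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))  # dense intra-block connections
        return x + self.fuse(torch.cat(feats, dim=1))    # local residual learning
```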
- Pixel-Level Accuracy: ℓ1 reconstruction loss for fundamental fidelity (`losses.py`)
- Perceptual Similarity: VGG19 feature matching for semantic consistency (`perceptual_loss.py`)
- Structural Integrity: SSIM-based loss for luminance, contrast, and structure preservation
- Frequency-Domain Fidelity: FFT magnitude spectrum matching for high-frequency detail recovery
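A sketch of how these terms might be combined is shown below, using the weights from `config.yaml`; the `perceptual_loss` (VGG19 features) and `ssim` helpers are assumed to be provided by `model/perceptual_loss.py` and `model/losses.py`.

```python
import torch
import torch.nn.functional as F

def frequency_loss(pred, target):
    """Match FFT magnitude spectra to encourage high-frequency detail recovery."""
    return F.l1_loss(torch.fft.fft2(pred).abs(), torch.fft.fft2(target).abs())

def cross_domain_consistency_loss(pred, target, perceptual_loss, ssim,
                                  w_l1=1.0, w_perc=0.1, w_ssim=0.1, w_freq=0.05):
    """Weighted sum of pixel, perceptual, structural, and frequency-domain terms."""
    return (w_l1 * F.l1_loss(pred, target)
            + w_perc * perceptual_loss(pred, target)      # VGG19 feature matching
            + w_ssim * (1.0 - ssim(pred, target))         # SSIM in [0, 1]
            + w_freq * frequency_loss(pred, target))
```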
AquaDiff achieves state-of-the-art performance across multiple underwater image enhancement benchmarks:
Key Observations:
- Highest UCIQE scores across all datasets (0.539 on U45, 0.524 on S16, 0.518 on C60), demonstrating superior color correction and contrast restoration
- Competitive UIQM scores indicating excellent preservation of colorfulness, sharpness, and contrast
The radar chart illustrates AquaDiff's balanced performance across multiple quality metrics, demonstrating consistent superiority in color correction (UCIQE) while maintaining competitive performance in other perceptual measures.
AquaDiff demonstrates significant improvements in color restoration and visual clarity across diverse underwater scenes:
- Effective correction of severe color casts (blue-green, reddish-brown)
- Preservation of fine textures and intricate details
- Superior handling of scenes with artificial lighting
- Successful recovery of shadow details
Our method excels in:
- Transforming green-yellow dominant sandy seabeds and diver scenes
- Removing deep blue saturation from marine life imagery
- Correcting reddish-brown distortions in coral reefs
- Enhancing visibility in complex scenes with shipwrecks and intricate coral formations
For severely degraded images where reference images are unavailable:
- Accurate restoration of natural colors in blue-green dominant scenes
- Effective neutralization of green-yellow turbidity
- Consistent removal of reddish-brown casts without oversaturation
- Superior haze reduction with natural appearance
Particularly noteworthy is performance on scenes with color calibration charts:
- Accurate restoration of distinct color squares (reds, blues, greens, yellows)
- Substantial haze reduction across all scenes
- Recovery of fine details including seabed textures, diver equipment, and coral patterns
| Model Variant | UIQM ↑ | UCIQE ↑ |
|---|---|---|
| Baseline Diffusion Model | 4.12 | 0.486 |
| + CDCL Only | 4.38 | 0.521 |
| + Enhanced U-Net Only | 4.45 | 0.528 |
| AquaDiff (Full Model) | 4.61 | 0.539 |
Key Findings:
- Cross-Domain Consistency Loss (CDCL) improves both UIQM (+0.26) and UCIQE (+0.035), indicating better color correction and artifact suppression
- Enhanced U-Net backbone with residual dense blocks and multi-resolution attention yields significant gains (+0.33 UIQM, +0.042 UCIQE)
- Full model achieves best performance, demonstrating synergy between architectural enhancements and multi-domain loss constraints
- `model/base_model.py`: Base class defining common model interfaces and utilities
- `model/model.py`: Main AquaDiff model implementation with diffusion pipeline
- `model/networks.py`: Neural network architectures including U-Net with attention mechanisms
- `model/ddpm_modules/`: Denoising Diffusion Probabilistic Model components
- `model/ColorChannelCompensation.py`: 3C preprocessing for color correction
- `model/losses.py`: Cross-domain consistency loss implementation
- `model/perceptual_loss.py`: Perceptual loss using pre-trained VGG features
- `data/dataset.py`: Custom dataset classes for loading and preprocessing underwater image pairs
- `data/util.py`: Utility functions for data augmentation, normalization, and transformations
- `train.py`: Main training script with configurable hyperparameters
- `eval.py`: Evaluation script computing UIQM, UCIQE, and other metrics
- `test.py`: Inference script for single image or batch processing
We welcome contributions from the community! Here are some ways you can help:
- Report bugs: Open an issue if you encounter any problems
- Suggest improvements: Share ideas for enhancing the model or codebase
- Add features: Submit pull requests for new functionality
- Share results: Showcase AquaDiff applications in your research
We are particularly interested in:
- Extension to underwater video enhancement
- Integration with underwater robotics platforms (ROVs/AUVs)
- Adaptation for specific environments (coral reefs, deep sea, turbid waters)
- Lightweight versions for edge deployment
- Applications in marine biodiversity monitoring and survey systems
This project is licensed under the MIT License - see the LICENSE file for details.
If you find AquaDiff helpful for your research, please cite our paper:
```bibtex
@article{shaahid2025aquadiff,
  title={AquaDiff: Diffusion-Based Underwater Image Enhancement with Chromatic Prior Guidance and Cross-Domain Consistency},
  author={Shaahid, Afrah and Behzad, Muzammil},
  journal={arXiv preprint arXiv:2512.14760},
  year={2025}
}
```

This research received funding from King Fahd University of Petroleum and Minerals and the SDAIA-KFUPM Joint Research Center for Artificial Intelligence. We thank the developers of the diffusion models and the authors of the LSUI and UIEB datasets for making their work publicly available.
Visit our project website for more details, visual results, and updates.
For questions or collaborations, please contact:
- Afrah Shaahid: afrahshaahid@outlook.com
- Muzammil Behzad: muzammil.behzad@kfupm.edu.sa





