AquaDiff 🌊

Afrah Shaahid, Muzammil Behzad
King Fahd University of Petroleum and Minerals · SDAIA-KFUPM Joint Research Center for Artificial Intelligence


AquaDiff is a diffusion-based underwater image enhancement framework designed to correct wavelength-dependent color distortion while preserving structural and perceptual fidelity. By integrating chromatic prior-guided color compensation with a conditional diffusion process and cross-attention mechanisms, AquaDiff achieves superior color correction and detail recovery across diverse underwater conditions.

Underwater images suffer from severe degradation due to wavelength-dependent light absorption and scattering, resulting in color distortion, low contrast, and loss of fine details. These artifacts significantly impair vision-based underwater applications including object detection, navigation, and 3D reconstruction. While diffusion models have shown promise in low-level vision tasks, existing approaches lack explicit mechanisms to address underwater-specific degradations.

AquaDiff addresses these challenges through a novel framework that combines (i) chromatic prior-guided color compensation preprocessing, (ii) cross-attention conditioning for dynamic feature fusion, (iii) an enhanced denoising backbone with residual dense blocks and multi-resolution attention, and (iv) a cross-domain consistency loss that jointly enforces pixel-level accuracy, perceptual similarity, structural integrity, and frequency-domain fidelity.

Here is an overview of the AquaDiff framework:

![AquaDiff Architecture](static/images/AquaDiffArchitecture.png)

Overview of AquaDiff

The AquaDiff framework operates through a two-stage diffusion mechanism:

  1. Forward Diffusion Process: Progressively corrupts clean reference images with Gaussian noise through a Markov chain, ultimately transforming them into pure noise samples.

  2. Reverse Diffusion Process: Learns to iteratively denoise and reconstruct clean images from noisy states, conditioned on color-compensated degraded inputs via cross-attention fusion at each timestep.
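The forward process above can be sketched in a few lines of NumPy. The closed form `x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε` lets you jump to any timestep in one shot. Only `diffusion_steps: 2000` comes from the config; the linear β schedule and function names below are illustrative assumptions, not the repo's actual implementation.

```python
import numpy as np

def make_schedule(T=2000, beta_start=1e-6, beta_end=0.01):
    """Illustrative linear beta schedule; returns the cumulative alpha-bar."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    return np.cumprod(alphas)

def q_sample(x0, t, alpha_bars, rng):
    """Corrupt a clean image x0 to timestep t in a single closed-form step."""
    eps = rng.standard_normal(x0.shape)
    a = alpha_bars[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps, eps

rng = np.random.default_rng(0)
alpha_bars = make_schedule()
x0 = rng.uniform(-1, 1, size=(3, 8, 8))      # toy "clean image"
x_T, _ = q_sample(x0, 1999, alpha_bars, rng)  # at t = T-1 almost all signal is gone
```

The reverse process then trains a network to predict `eps` from `x_t`, here conditioned on the color-compensated input at every step.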

Key Components

  • Chromatic Prior-Guided Color Compensation: Preprocesses degraded underwater images using 3-channel compensation (3C) in Lab color space to mitigate color distortion before diffusion conditioning. Implemented in model/ColorChannelCompensation.py.

  • Cross-Attention Conditioning: Dynamically fuses noisy intermediate states with color-compensated conditioning images at each denoising step, enabling adaptive feature weighting based on noise levels.

  • Enhanced Denoising Backbone: Employs a U-Net architecture with residual dense blocks, dense skip connections, and multi-resolution attention modules to capture both global color context and local structural details. Architecture defined in model/networks.py and model/model.py.

  • Cross-Domain Consistency Loss: A novel hybrid loss function that jointly enforces pixel-level accuracy, perceptual similarity, structural integrity, and frequency-domain fidelity. Implemented in model/losses.py and model/perceptual_loss.py.

Quick Start

Step 1: Clone the Repository

git clone https://github.com/BRAIN-Lab-AI/AquaDiff.git
cd AquaDiff

Step 2: Set Up Environment

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirement.txt

Step 3: Prepare Datasets

Download the following datasets and organize them in the data/ directory:

  • LSUI Dataset: 5,004 paired underwater images
  • UIEB Dataset: 890 paired underwater images (800 for training, 90 for testing)
  • Test Datasets: U45, S16, C60, and SQUID for evaluation

The data loading and preprocessing are handled by:

  • data/dataset.py: Custom dataset classes for loading paired underwater images with various preprocessing options
  • data/util.py: Utility functions for data augmentation, image transformations, and dataset-specific operations
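A paired dataset ultimately reduces to matching degraded files against reference files by name. The sketch below shows that pairing logic only; the class name, directory layout, and extensions are assumptions for illustration, not taken from `data/dataset.py`.

```python
import tempfile
from pathlib import Path

class PairedIndex:
    """Pairs degraded images with their references by matching filenames."""
    def __init__(self, degraded_dir, reference_dir, exts=(".jpg", ".png")):
        self.pairs = []
        ref = Path(reference_dir)
        for p in sorted(Path(degraded_dir).iterdir()):
            if p.suffix.lower() in exts and (ref / p.name).exists():
                self.pairs.append((p, ref / p.name))

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, i):
        return self.pairs[i]

# Toy demonstration with a temporary directory layout.
root = Path(tempfile.mkdtemp())
(root / "input").mkdir()
(root / "gt").mkdir()
for name in ("a.jpg", "b.png", "c.jpg"):
    (root / "input" / name).touch()
for name in ("a.jpg", "b.png"):           # c.jpg has no reference image
    (root / "gt" / name).touch()
idx = PairedIndex(root / "input", root / "gt")
```

Unpaired test sets (U45, S16, C60) skip the reference lookup and feed only the degraded image to the no-reference metrics.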

Step 4: Configuration

Edit the config/config.yaml file with your model settings:

model:
  diffusion_steps: 2000
  base_channels: 64
  channel_multipliers: [1, 2, 4, 8, 16]
  cross_attention: true
  multi_res_attention: true
  
training:
  batch_size: 1
  learning_rate: 3e-6
  iterations: 1000000
  loss_weights:
    l1: 1.0
    perceptual: 0.1
    ssim: 0.1
    frequency: 0.05
  
data:
  image_size: 256
  datasets: ["LSUI", "UIEB"]
  augmentation: true
  normalization: "imagenet"

Launch AquaDiff

Inference on Single Image

python test.py --input path/to/image.jpg --output results/

Batch Processing

python test.py --input_dir data/test_images/ --output_dir results/ --timesteps 2000

Training from Scratch

python train.py --config config/config.yaml --gpu 0

Evaluation

python eval.py --test_datasets U45 S16 C60 --metrics UIQM UCIQE

Project Structure

├── config/
│   └── config.yaml
├── core/
├── data/
│   ├── __pycache__/
│   ├── __init__.py
│   ├── dataset.py          # Dataset classes for loading underwater image pairs
│   └── util.py             # Data utilities and augmentation functions
├── model/
│   ├── __pycache__/
│   ├── ddpm_modules/
│   ├── __init__.py
│   ├── base_model.py
│   ├── ColorChannelCompensation.py
│   ├── losses.py
│   ├── model.py
│   ├── networks.py
│   └── perceptual_loss.py
├── static/
│   └── images/
│       ├── AquaDiffArchitecture.png
│       ├── AquaDiffradarbars.png
│       ├── C60.png
│       ├── S16.png
│       ├── U45.png
│       ├── U90.png
│       └── results_table.png
├── LICENSE
├── README.md
├── requirement.txt
├── eval.py
├── test.py
└── train.py

Dataset Module Details

data/dataset.py

This module provides custom PyTorch Dataset classes for underwater image enhancement:

  • PairedUnderwaterDataset: Loads pairs of degraded and reference underwater images

    • Supports multiple dataset formats (LSUI, UIEB, custom)
    • Configurable image cropping and resizing
    • On-the-fly preprocessing options
    • Integration with color compensation preprocessing
  • TestDataset: Specialized dataset for evaluation on benchmark datasets (U45, S16, C60)

    • Handles unpaired test scenarios
    • Supports no-reference quality assessment

data/util.py

Utility functions for data processing:

  • Image transformations and augmentations
    • Random horizontal/vertical flips
    • Color jittering
    • Rotation and cropping
  • Normalization utilities (ImageNet stats, custom underwater stats)
  • Batch collation functions
  • Dataset-specific preprocessing pipelines

Key Features

Chromatic Prior-Guided Color Compensation

  • 3-Channel Compensation (3C): Preprocesses degraded images in Lab color space to correct color casts before diffusion conditioning (ColorChannelCompensation.py)
  • Spatially-Varying Mask: Prevents overcompensation in high-brightness regions
  • Physics-Informed: Based on underwater image formation model accounting for wavelength-dependent attenuation

Cross-Attention Conditioning

  • Dynamic Feature Fusion: Adaptively weights conditioning information based on noise levels at each timestep
  • Timestep-Aware Guidance: Enables model to leverage structural and chromatic cues progressively during denoising
  • Improved Color Correction: Cross-attention formulation enables effective fusion of degraded and noisy representations

Enhanced Denoising Architecture

  • Residual Dense Blocks: Incorporate dense connections within each block for richer feature extraction (networks.py)
  • Multi-Resolution Attention: Establishes long-range dependencies at multiple resolutions for global color consistency
  • Dense Skip Connections: Facilitate better feature fusion between encoder and decoder pathways

Cross-Domain Consistency Loss

  • Pixel-Level Accuracy: ℓ1 reconstruction loss for fundamental fidelity (losses.py)
  • Perceptual Similarity: VGG19 feature matching for semantic consistency (perceptual_loss.py)
  • Structural Integrity: SSIM-based loss for luminance, contrast, and structure preservation
  • Frequency-Domain Fidelity: FFT magnitude spectrum matching for high-frequency detail recovery

Results

Quantitative Comparison

AquaDiff achieves state-of-the-art performance across multiple underwater image enhancement benchmarks:

![Results Table](static/images/results_table.png)

Key Observations:

  • Highest UCIQE scores across all datasets (0.539 on U45, 0.524 on S16, 0.518 on C60) demonstrating superior color correction and contrast restoration
  • Competitive UIQM scores indicating excellent preservation of colorfulness, sharpness, and contrast

Radar Chart Performance

![AquaDiff Radar Bars](static/images/AquaDiffradarbars.png)

The radar chart illustrates AquaDiff's balanced performance across multiple quality metrics, demonstrating consistent superiority in color correction (UCIQE) while maintaining competitive performance in other perceptual measures.

Qualitative Results

Comparison on U90 Dataset

![U90 Comparison](static/images/U90.png)

AquaDiff demonstrates significant improvements in color restoration and visual clarity across diverse underwater scenes:

  • Effective correction of severe color casts (blue-green, reddish-brown)
  • Preservation of fine textures and intricate details
  • Superior handling of scenes with artificial lighting
  • Successful recovery of shadow details

Comparison on U45 Dataset

![U45 Comparison](static/images/U45.png)

Our method excels in:

  • Transforming green-yellow dominant sandy seabeds and diver scenes
  • Removing deep blue saturation from marine life imagery
  • Correcting reddish-brown distortions in coral reefs
  • Enhancing visibility in complex scenes with shipwrecks and intricate coral formations

Comparison on C60 Dataset

![C60 Comparison](static/images/C60.png)

For severely degraded images where reference images are unavailable:

  • Accurate restoration of natural colors in blue-green dominant scenes
  • Effective neutralization of green-yellow turbidity
  • Consistent removal of reddish-brown casts without oversaturation
  • Superior haze reduction with natural appearance

Comparison on S16 Dataset

![S16 Comparison](static/images/S16.png)

Particularly noteworthy is performance on scenes with color calibration charts:

  • Accurate restoration of distinct color squares (reds, blues, greens, yellows)
  • Substantial haze reduction across all scenes
  • Recovery of fine details including seabed textures, diver equipment, and coral patterns

Ablation Studies

| Model Variant | UIQM ↑ | UCIQE ↑ |
|---|---|---|
| Baseline Diffusion Model | 4.12 | 0.486 |
| + CDCL Only | 4.38 | 0.521 |
| + Enhanced U-Net Only | 4.45 | 0.528 |
| **AquaDiff (Full Model)** | **4.61** | **0.539** |

Key Findings:

  • Cross-Domain Consistency Loss (CDCL) improves both UIQM (+0.26) and UCIQE (+0.035), indicating better color correction and artifact suppression
  • Enhanced U-Net backbone with residual dense blocks and multi-resolution attention yields significant gains (+0.33 UIQM, +0.042 UCIQE)
  • Full model achieves best performance, demonstrating synergy between architectural enhancements and multi-domain loss constraints

Model Details

Core Components

  • model/base_model.py: Base class defining common model interfaces and utilities
  • model/model.py: Main AquaDiff model implementation with diffusion pipeline
  • model/networks.py: Neural network architectures including U-Net with attention mechanisms
  • model/ddpm_modules/: Denoising Diffusion Probabilistic Model components
  • model/ColorChannelCompensation.py: 3C preprocessing for color correction
  • model/losses.py: Cross-domain consistency loss implementation
  • model/perceptual_loss.py: Perceptual loss using pre-trained VGG features

Data Module

  • data/dataset.py: Custom dataset classes for loading and preprocessing underwater image pairs
  • data/util.py: Utility functions for data augmentation, normalization, and transformations

Training and Evaluation

  • train.py: Main training script with configurable hyperparameters
  • eval.py: Evaluation script computing UIQM, UCIQE, and other metrics
  • test.py: Inference script for single image or batch processing

Community Contributions

We welcome contributions from the community! Here are some ways you can help:

  • Report bugs: Open an issue if you encounter any problems
  • Suggest improvements: Share ideas for enhancing the model or codebase
  • Add features: Submit pull requests for new functionality
  • Share results: Showcase AquaDiff applications in your research

We are particularly interested in:

  • Extension to underwater video enhancement
  • Integration with underwater robotics platforms (ROVs/AUVs)
  • Adaptation for specific environments (coral reefs, deep sea, turbid waters)
  • Lightweight versions for edge deployment
  • Applications in marine biodiversity monitoring and survey systems

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you find AquaDiff helpful for your research, please cite our paper:

@article{shaahid2025aquadiff,
  title={AquaDiff: Diffusion-Based Underwater Image Enhancement with Chromatic Prior Guidance and Cross-Domain Consistency},
  author={Shaahid, Afrah and Behzad, Muzammil},
  journal={arXiv preprint arXiv:2512.14760},
  year={2025}
}

Acknowledgements

This research received funding from King Fahd University of Petroleum and Minerals and the SDAIA-KFUPM Joint Research Center for Artificial Intelligence. We thank the developers of the diffusion models and the authors of the LSUI and UIEB datasets for making their work publicly available.

Project Page

Visit our project website for more details, visual results, and updates.

Contact

For questions or collaborations, please contact:


⭐ If you find AquaDiff useful, please consider starring the repository! ⭐
