
Torchhydro


torchhydro provides datasets and models for applying deep learning to hydrological modeling.

Installation

For Users

You can install torchhydro using pip or uv (which is faster).

pip install torchhydro

or

uv pip install torchhydro

For Developers

If you want to contribute to the project, we recommend using uv for environment management.

# Clone the repository
git clone https://github.com/OuyangWenyu/torchhydro.git
cd torchhydro

# Create a virtual environment and install all dependencies
uv sync --all-extras

Usage

1. Configure Data Path

Before running any examples, you need to tell torchhydro where your data is located.

Create a file named hydro_setting.yml in your user home directory (C:\Users\YourUsername on Windows or ~/ on Linux/macOS). Then, add the following content, pointing to your data folders:

local_data_path:
  root: 'D:/data/waterism' # Update with your root data directory
  datasets-origin: 'D:/data/waterism/datasets-origin'
  datasets-interim: 'D:/data/waterism/datasets-interim'
  cache: 'D:/data/waterism/cache'
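To check that the file is readable and that each configured path exists, you can run a short sanity check. This is a minimal sketch, assuming PyYAML is installed (pip install pyyaml); the key names simply mirror the snippet above and are not a torchhydro API:

import yaml
from pathlib import Path

# Load hydro_setting.yml from the user home directory.
setting = yaml.safe_load((Path.home() / "hydro_setting.yml").read_text())

# Report whether each configured directory actually exists on disk.
for name, path in setting["local_data_path"].items():
    status = "ok" if Path(path).is_dir() else "MISSING"
    print(f"{name}: {path} [{status}]")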

The examples use the CAMELS dataset. If you don't have it, torchhydro will automatically call hydrodataset to download it.

2. Run Examples

We provide standalone scripts in the examples/ directory to help you get started.

  • examples/lstm_camels_example.py: A basic example of training a standard LSTM model on the CAMELS dataset.
  • examples/dpl_xaj_example.py: An advanced example of training a differentiable model based on the Xinanjiang (XAJ) hydrological model.

To run an example:

python examples/lstm_camels_example.py

Feel free to modify these scripts to experiment with different models, datasets, and parameters.
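For orientation, the sketch below shows the general shape of the model the first example trains, written in plain PyTorch rather than with torchhydro's own classes. The class name, layer sizes, and stand-in data here are illustrative assumptions, not the library API:

import torch
import torch.nn as nn

class RunoffLSTM(nn.Module):
    """Maps a sequence of meteorological forcings to streamflow."""
    def __init__(self, n_forcings=5, hidden_size=64):
        super().__init__()
        self.lstm = nn.LSTM(n_forcings, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):  # x: (batch, time, n_forcings)
        out, _ = self.lstm(x)
        return self.head(out)  # (batch, time, 1) predicted discharge

# One training step on random stand-in data; the real examples feed CAMELS batches.
model = RunoffLSTM()
x, y = torch.randn(8, 365, 5), torch.randn(8, 365, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

The actual scripts additionally handle configuration, normalization, and evaluation through torchhydro's trainer; see examples/lstm_camels_example.py for the real entry point.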

Benchmark Results

We provide benchmark results on the CAMELS-US dataset to demonstrate the performance of our models. The figure below compares the NSE (Nash-Sutcliffe efficiency) distributions obtained with two CAMELS data-processing approaches:


Figure: NSE comparison of an LSTM model trained on the CAMELS-US dataset. The left box plot (red) shows results using the hydrodataset.Camels class for data processing, while the right box plot (yellow-green) shows results using the hydrodataset.CamelsUS class. Both approaches achieve a median NSE of 0.72, demonstrating robust and consistent performance across 671 basins.
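For reference, the Nash-Sutcliffe efficiency measures model error relative to the variance of the observations: a value of 1 is a perfect fit, and 0 means the model is no better than predicting the observed mean. A small NumPy implementation (an illustrative sketch, not torchhydro's internal metric code):

import numpy as np

def nse(sim, obs):
    # NSE = 1 - sum((sim - obs)^2) / sum((obs - mean(obs))^2)
    sim, obs = np.asarray(sim), np.asarray(obs)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)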

Detailed training results, methodology, and a comprehensive performance analysis are documented separately in the repository.

Explore More Features

The examples above cover two primary use cases, but torchhydro is much more flexible. We support a variety of models, datasets, and data sources out of the box. Explore the full public API to see all available components:

  • Models API: Discover all available model architectures.
  • Datasets API: See all dataset classes, data sources, and samplers.
  • Trainers API: Understand the core training and evaluation pipeline.

We are continuously working to expand the documentation with more examples.

Main Modules

The project is organized into several key modules:

  • Trainers: Manages the end-to-end training and evaluation pipeline. The core DeepHydro class handles data loading, model initialization, training loops, and evaluation. It is designed to be extensible for various learning paradigms like transfer learning or multi-task learning.
  • Models: Contains all available model architectures, including standard neural networks (e.g., LSTM) and differentiable models. A central dictionary allows for easy configuration and selection of models and loss functions.
  • Datasets: Provides data handling capabilities. It interfaces with data source libraries like hydrodataset (for public datasets like CAMELS) and hydrodatasource (for custom data) to create torch.utils.data.Dataset objects suitable for training (a minimal sketch of such a dataset follows this list).
  • Configs: Manages all experiment configurations, including settings for the model, data (time periods, variables), training (epochs, batch size), and evaluation.
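As referenced in the Datasets item above, here is a minimal stand-in for the kind of torch.utils.data.Dataset the module produces. Everything here (class name, array shapes, windowing scheme) is a simplified assumption for illustration, not torchhydro's actual implementation:

import torch
from torch.utils.data import Dataset

class BasinSeqDataset(Dataset):
    # Slides a fixed-length window over one basin's forcing/streamflow arrays.
    def __init__(self, forcings, streamflow, seq_len=365):
        self.x = torch.as_tensor(forcings, dtype=torch.float32)    # (T, n_vars)
        self.y = torch.as_tensor(streamflow, dtype=torch.float32)  # (T, 1)
        self.seq_len = seq_len

    def __len__(self):
        return self.x.shape[0] - self.seq_len + 1

    def __getitem__(self, i):
        return self.x[i : i + self.seq_len], self.y[i : i + self.seq_len]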

Why Torchhydro?

While mature tools like NeuralHydrology exist, torchhydro was developed with a different architectural philosophy:

  1. Decoupled Data Sources: We believe data handling, especially for complex or private datasets, requires a separate abstraction layer. Our approach uses hydrodataset and hydrodatasource to manage data access first, which then feeds into a PyTorch Dataset. This modularity promotes code reuse and allows the data source tools to be used even without a deep learning model.
  2. Flexible Learning Paradigms: The framework is explicitly designed to support not just standard supervised learning, but also more complex modes like transfer learning, multi-task learning, and federated learning from the ground up.
  3. Deep Configuration: We provide fine-grained control over many aspects of the pipeline, including data traversal, normalization methods, batch sampling strategies, and advanced dropout techniques, allowing for greater flexibility in experimentation (a sampler sketch follows this list).
  4. Extensibility: The core design principle is to externalize as much configuration as possible, enabling flexible matching and calling of different data sources and models.
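To make the batch-sampling point above concrete, here is what a pluggable sampling strategy can look like in PyTorch. This is a hypothetical sketch, not torchhydro's built-in sampler:

import torch
from torch.utils.data import Sampler

class RandomWindowSampler(Sampler):
    # Draws a fixed number of random window indices per epoch
    # instead of iterating over every possible window.
    def __init__(self, dataset, n_samples_per_epoch):
        self.n_windows = len(dataset)
        self.n_samples = n_samples_per_epoch

    def __iter__(self):
        yield from torch.randint(self.n_windows, (self.n_samples,)).tolist()

    def __len__(self):
        return self.n_samples

Such a sampler plugs directly into torch.utils.data.DataLoader via its sampler argument, which is the kind of externalized choice torchhydro exposes through configuration rather than hard-coding.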

Additional Information

This package was inspired by several existing open-source projects. It was created with Cookiecutter and the giswqs/pypackage project template.
