@b8raoult (Collaborator) commented Jul 24, 2025

Description

A Zarr store that lazily copies chunks to another (SSD-based) filesystem as they are read, so that the second epoch of training is faster.

Example: scanning aifs-ea-an-oper-0001-mars-o96-1979-2022-6h-v6 twice:

CopyToSSDStore: using temporary directory /.../ssd1/tmpdirs/.../anemoi-datasets-ssd-prv26ojg
Pass 1: 100%|████████████████████████████████████████████████████████████| 64284/64284 [1:07:32<00:00, 15.86it/s]
Pass 1 took 4052.78 seconds
Pass 2: 100%|██████████████████████████████████████████████████████████████| 64284/64284 [15:16<00:00, 70.15it/s]
Pass 2 took 916.36 seconds
CopyToSSDStore: total size copied: 578.7 GiB
CopyToSSDStore: copied 64,287 objects, reused 64,284 objects

First pass (read from disk + copy to SSD): about 1 h 8 min (4052.78 s)
Second pass (read from SSD): about 15 min (916.36 s), roughly 4.4 times faster
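
For readers who want a picture of the lazy-copy idea described above, here is a minimal sketch. It is not the code in this PR: the class name `LazyCopyStore`, the `copied`/`reused` counters and the hashed file naming are all hypothetical, and the source is assumed to behave like a plain mapping from key to bytes (as a Zarr v2-style store does).

```python
import hashlib
import os
import tempfile
from collections.abc import Mapping


class LazyCopyStore(Mapping):
    """Hypothetical read-only store: copies each object to a fast directory on first read."""

    def __init__(self, source, cache_dir):
        self.source = source        # assumed: any mapping of str -> bytes
        self.cache_dir = cache_dir  # directory on the fast (e.g. SSD) filesystem
        self.copied = 0             # objects fetched from the source and written locally
        self.reused = 0             # objects served from the local copy

    def _path(self, key):
        # Hash the key so nested keys such as "data/0.0.0" map to flat file names.
        digest = hashlib.sha256(key.encode()).hexdigest()
        return os.path.join(self.cache_dir, digest)

    def __getitem__(self, key):
        path = self._path(key)
        try:
            with open(path, "rb") as f:
                data = f.read()
            self.reused += 1
            return data
        except FileNotFoundError:
            pass  # not copied yet: fall through to the slow source

        data = self.source[key]  # raises KeyError if the key does not exist
        # Write to a temporary file, then rename atomically, so that a
        # concurrent reader never sees a partially written object.
        fd, tmp = tempfile.mkstemp(dir=self.cache_dir)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)
        self.copied += 1
        return data

    def __contains__(self, key):
        # Membership tests should not trigger a copy.
        return key in self.source

    def __iter__(self):
        return iter(self.source)

    def __len__(self):
        return len(self.source)


if __name__ == "__main__":
    source = {"data/0": b"abc", "data/1": b"defg"}  # stand-in for the slow store
    with tempfile.TemporaryDirectory() as fast_dir:
        store = LazyCopyStore(source, fast_dir)
        for _ in range(2):          # two passes, like two epochs
            for key in store:
                store[key]
        print(f"copied {store.copied} objects, reused {store.reused} objects")
```

In this sketch the first pass pays for both the read from the source and the write to the fast directory, and the second pass only reads local files, which is the same pattern as in the timings above.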

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@github-project-automation github-project-automation bot moved this to Now In Progress in Anemoi-dev Jul 24, 2025
@github-actions github-actions bot added the enhancement New feature or request label Jul 24, 2025
@github-actions github-actions bot added the tests label Jul 26, 2025