-
Notifications
You must be signed in to change notification settings - Fork 45
feat: move code to prepare for observations #433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new structure is clearer, but there are a few imports that were missed
|
||
from anemoi.datasets.create.sources.mars import mars | ||
from anemoi.datasets.create.utils import to_datetime_list | ||
from anemoi.datasets.build.utils import to_datetime_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB: from anemoi.datasets.build.gridded.utils import to_datetime_list
in accumulations.py
, but this is also valid (we have duplicated code). We should remove the duplicated code, but let's do that in another PR.
|
||
# from .mars import execute as mars | ||
from anemoi.datasets.create.mars import execute as mars | ||
from anemoi.datasets.build.mars import execute as mars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this in the legacy sources refactor, but this should be
from .mars import mars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(But I think this source might actually be broken – I need to test this. If it is, we can fix in another PR to keep this one simple)
from anemoi.datasets.build.gridded.result import Result | ||
from anemoi.datasets.build.input.action import Action | ||
from anemoi.datasets.build.input.action import action_factory | ||
from anemoi.datasets.build.input.join import JoinResult |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might have been broken before (so fixing is outside of the scope of this PR), but JoinResult
doesn't exist, neither does anemoi.datasets.build.input.join
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #444 to cover this
dataset = dataset._store.path | ||
|
||
from ..stores import zarr_lookup | ||
from anemoi.datasets.use.gridded.stores import zarr_lookup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Longer term maybe we move this up a level? (i.e. anemoi.datasets.use.stores.zarr_lookup
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe as part of the main obs dataset development, rather than in this PR
last_window_end = int(end.strftime("%Y%m%d%H%M%S")) | ||
|
||
from .legacy_obs_dataset import ObsDataset | ||
from anemoi.datasets.use.gridded.observations.legacy_obs_dataset import ObsDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from anemoi.datasets.use.gridded.observations.legacy_obs_dataset import ObsDataset | |
from anemoi.datasets.use.tabular.observations.legacy_obs_dataset import ObsDataset |
from anemoi.datasets.data.misc import as_first_date | ||
from anemoi.datasets.data.misc import as_last_date | ||
from anemoi.datasets.use.gridded.misc import as_first_date | ||
from anemoi.datasets.use.gridded.misc import as_last_date |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, these can probably be moved to anemoi.datasets.use.misc
, or just anemoi.datasets.misc
, but can be done in another PR
|
||
def write_metadata(self, metadata): | ||
from anemoi.datasets.create import json_tidy | ||
from anemoi.datasets.build.gridded import json_tidy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this can be moved to misc in the future?
print(f"Converting dataset to {output} using new backend '{backend}'") | ||
|
||
from anemoi.datasets.data.records.backends import writer_backend_factory | ||
from anemoi.datasets.use.gridded.records.backends import writer_backend_factory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from anemoi.datasets.use.gridded.records.backends import writer_backend_factory | |
from anemoi.datasets.use.tabular.records.backends import writer_backend_factory |
from .source import Source | ||
from .sources import source_registry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from .source import Source | |
from .sources import source_registry | |
from ..source import Source | |
from ..sources import source_registry |
If we want to move to relative imports here
from anemoi.datasets.create.check import DatasetName | ||
from anemoi.datasets.build.gridded.check import DatasetName | ||
|
||
from . import Command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GitHub won't let me comment on the correct line, but line 93 should be
from anemoi.datasets.misc.check import check_zarr
Description
This is a PR to prepare for the observations support. This PR merely move files around, so a lot of files are affected, but there should not be any code change. This prepares for further PRs that may affect the code more. This should be kinder on reviewers.
Naming conventions proposed here:
Move from "create" -> "build"
Introduce two types of datasets to be built:
gridded & tabular
What problem does this change solve?
What issue or task does this change relate to?
Additional notes
As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/
By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.