@MeraX MeraX commented Nov 10, 2025

Description

This is necessary to group joined repeated-dates with grib when both are passed through a MatchingFieldsFilter.

What issue or task does this change relate to?

Example config that necessitates this fix:

        - join:
          - grib: # SMI
              path: *initialized_analysis_path
              param: ["W_SO"]
              topLevel:d: [0.0, 0.01, 0.03, 0.09, 0.27, 0.8099999999999999, 2.43, 7.290000000000001]
              grid_definition:
                <<: *grid_definition
              flavour:
                - - {step: "0s"}
                  - {step: 0} # Set step to 0 (instead of "0s") to align with SOILTYP.

          - repeated-dates:
              mode: constant
              source:
                grib:
                  path: *extpar_path
                  param: ["SOILTYP"]
                  level: 0
                  grid_definition:
                    <<: *grid_definition
        - rename:
            param: "{param}_{topLevel:d}"
        - W_SO_to_SMI:
            soiltyp: "SOILTYP_0.0"
            w_so1: "W_SO_0.0"
            w_so2: "W_SO_0.01"
            w_so3: "W_SO_0.03"
            w_so4: "W_SO_0.09"
            w_so5: "W_SO_0.27"
            w_so6: "W_SO_0.8099999999999999"
            w_so7: "W_SO_2.43"
            w_so8: "W_SO_7.290000000000001"
            layer_accumulation: [{"topLevel:d": 0.0, "bottomLevel:d": 0.03}, {"topLevel:d": 0.09, "bottomLevel:d": 0.8099999999999999}]

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

date = int(valid_datetime.date().strftime("%Y%m%d"))
assert valid_datetime.time().minute == 0, valid_datetime
time = valid_datetime.time().hour
date = int(valid_datetime.strftime("%Y%m%d"))
Contributor

Marek, what's the current error you get if these changes are not applied when using your recipe with repeated dates? I'm wondering whether this is an issue in transforms or whether repeated dates is not returning the right valid_datetime format.

Contributor Author

📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010118/fc_R03B07_rea_ml.2022010118')
2025-11-11 08:18:34 INFO Registering data at path: ('input', 'pipe', '0', 'join', '0', 'grib')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010103/fc_R03B07_rea_ml.2022010100')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010109/fc_R03B07_rea_ml.2022010106')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010115/fc_R03B07_rea_ml.2022010112')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010121/fc_R03B07_rea_ml.2022010118')
2025-11-11 08:19:47 INFO Registering data at path: ('input', 'pipe', '0', 'join', '1', 'grib')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010103/fc_R03B07_rea_ml.2022010100')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010109/fc_R03B07_rea_ml.2022010106')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010115/fc_R03B07_rea_ml.2022010112')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010121/fc_R03B07_rea_ml.2022010118')
2025-11-11 08:20:42 INFO Registering data at path: ('input', 'pipe', '0', 'join', '2', 'grib')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010103/fc_R03B07_rea_ml.2022010100')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010109/fc_R03B07_rea_ml.2022010106')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010115/fc_R03B07_rea_ml.2022010112')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010121/fc_R03B07_rea_ml.2022010118')
2025-11-11 08:21:41 INFO Registering data at path: ('input', 'pipe', '0', 'join', '3', 'pipe', '0', 'grib')
2025-11-11 08:21:41 INFO Registering data at path: ('input', 'pipe', '0', 'join', '3', 'pipe', '1', 'rename')
2025-11-11 08:21:41 INFO Registering data at path: ('input', 'pipe', '0', 'join', '3', 'pipe')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010103/fc_R03B07_rea_ml.2022010100')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010109/fc_R03B07_rea_ml.2022010106')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010115/fc_R03B07_rea_ml.2022010112')
📁: ('PATH', '/hpc/uwork/fe11rea/ICON-DREAM/2022010121/fc_R03B07_rea_ml.2022010118')
2025-11-11 08:22:51 INFO Registering data at path: ('input', 'pipe', '0', 'join', '4', 'pipe', '0', 'join', '0', 'grib')
one_date_group: GroupOfDates(dates=[]), many_dates_group: GroupOfDates(dates=['2022-01-01T00:00:00', '2022-01-01T06:00:00', '2022-01-01T12:00:00', '2022-01-01T18:00:00'])
📁: ('PATH', '/hpc/rhome/routfor/routfox/icon/grids/public/edzw/icon_extpar_0026_R03B07_G_20231113_tiles.g2')
2025-11-11 08:22:51 WARNING Expected 0 fields, got 1 (kwargs={'valid_datetime': [], 'level': 0, 'param': ['SOILTYP']}, paths=['/hpc/rhome/routfor/routfox/icon/grids/public/edzw/icon_extpar_0026_R03B07_G_20231113_tiles.g2']) Received empty dates - assuming this is static data.
2025-11-11 08:22:51 INFO Registering data at path: ('data_sources', '23200884640800', 'grib')
2025-11-11 08:22:51 INFO Registering data at path: ('input', 'pipe', '0', 'join', '4', 'pipe', '0', 'join', '1', 'repeated-dates')
2025-11-11 08:22:51 INFO Registering data at path: ('input', 'pipe', '0', 'join', '4', 'pipe', '0', 'join')
2025-11-11 08:22:51 INFO Registering data at path: ('input', 'pipe', '0', 'join', '4', 'pipe', '1', 'rename')
2025-11-11 08:22:51 INFO Params groups: {'W_SO_0.27', 'W_SO_0.03', 'W_SO_0.8099999999999999', 'SOILTYP_0.0', 'W_SO_0.0', 'W_SO_7.290000000000001', 'W_SO_2.43', 'W_SO_0.09', 'W_SO_0.01'}
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,0,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.0'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,0,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.01'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,0,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.03'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,0,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.09'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,0,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.27'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,1,20220101,0,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.8099999999999999'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,2,20220101,0,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_2.43'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,7,20220101,600,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_7.290000000000001'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.0'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.01'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.03'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.09'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.27'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,1,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.8099999999999999'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,2,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_2.43'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,7,20220101,1200,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_7.290000000000001'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.0'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.01'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.03'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.09'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,0,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.27'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,1,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_0.8099999999999999'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,2,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_2.43'}))
NewMetadataField(NewGridField(NewFlavouredField(GribField(W_SO,7,20220101,1800,0s,None), (No specific representation for NewFlavouredField)), <anemoi.transform.grids.icon.IconGrid object at 0x1519e06060f0>), (metadata={'param': 'W_SO_7.290000000000001'}))
NewMetadataField(NewValidDateTimeField(NewGridField(GribField(SOILTYP,None,10101,0,0h,None), <anemoi.transform.grids.icon.IconGrid object at 0x1519e035bf50>), (metadata={'date': 20220101, 'time': 0, 'step': 0, 'valid_datetime': '2022-01-01T00:00:00'})), (metadata={'param': 'SOILTYP_0.0'}))
NewMetadataField(NewValidDateTimeField(NewGridField(GribField(SOILTYP,None,10101,0,0h,None), <anemoi.transform.grids.icon.IconGrid object at 0x1519e035bf50>), (metadata={'date': 20220101, 'time': 6, 'step': 0, 'valid_datetime': '2022-01-01T06:00:00'})), (metadata={'param': 'SOILTYP_0.0'}))
NewMetadataField(NewValidDateTimeField(NewGridField(GribField(SOILTYP,None,10101,0,0h,None), <anemoi.transform.grids.icon.IconGrid object at 0x1519e035bf50>), (metadata={'date': 20220101, 'time': 12, 'step': 0, 'valid_datetime': '2022-01-01T12:00:00'})), (metadata={'param': 'SOILTYP_0.0'}))
NewMetadataField(NewValidDateTimeField(NewGridField(GribField(SOILTYP,None,10101,0,0h,None), <anemoi.transform.grids.icon.IconGrid object at 0x1519e035bf50>), (metadata={'date': 20220101, 'time': 18, 'step': 0, 'valid_datetime': '2022-01-01T18:00:00'})), (metadata={'param': 'SOILTYP_0.0'}))
Traceback (most recent call last):
  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/utils/cli.py", line 266, in cli_main
    cmd.run(args)
  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/commands/create.py", line 102, in run
    self.serial_create(args)
  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/commands/create.py", line 119, in serial_create
    task("load", options)
  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/commands/create.py", line 53, in task
    result = c.run()

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/__init__.py", line 854, in run
    self._run()
  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/__init__.py", line 868, in _run
    result = self.input.select(argument=group)

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/input/__init__.py", line 65, in select
    return context.create_result(self.action(context, argument))

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/input/action.py", line 268, in __call__
    return self.input(context, argument)

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/input/action.py", line 156, in __call__
    result = action(context, argument)

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/input/action.py", line 120, in __call__
    results += action(context, argument)

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/input/action.py", line 158, in __call__
    result = action(context, result)

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/input/action.py", line 177, in __call__
    return context.register(self.call_object(context, source, argument), self.path)

  File "/shared/data/majacob/datasets/aicon-catalog/venv/lib64/python3.12/site-packages/anemoi/datasets/create/input/action.py", line 210, in call_object
    return filter.forward(context.filter_argument(argument))

  File "/shared/data/majacob/datasets/anemoi-transform/src/anemoi/transform/filters/matching.py", line 291, in forward
    return self._transform(

  File "/shared/data/majacob/datasets/anemoi-transform/src/anemoi/transform/filters/matching.py", line 358, in _transform
    for matching in grouping.iterate(data, other=result.append):

  File "/shared/data/majacob/datasets/anemoi-transform/src/anemoi/transform/grouping/__init__.py", line 116, in iterate
    raise ValueError(f"Missing component. Want {sorted(self.params)}, got {sorted(group.keys())}")
ValueError: Missing component. Want ['SOILTYP_0.0', 'W_SO_0.0', 'W_SO_0.01', 'W_SO_0.03', 'W_SO_0.09', 'W_SO_0.27', 'W_SO_0.8099999999999999', 'W_SO_2.43', 'W_SO_7.290000000000001'], got ['W_SO_0.0', 'W_SO_0.01', 'W_SO_0.03', 'W_SO_0.09', 'W_SO_0.27', 'W_SO_0.8099999999999999', 'W_SO_2.43', 'W_SO_7.290000000000001']
2025-11-11 08:22:52 ERROR
💣 Missing component. Want ['SOILTYP_0.0', 'W_SO_0.0', 'W_SO_0.01', 'W_SO_0.03', 'W_SO_0.09', 'W_SO_0.27', 'W_SO_0.8099999999999999', 'W_SO_2.43', 'W_SO_7.290000000000001'], got ['W_SO_0.0', 'W_SO_0.01', 'W_SO_0.03', 'W_SO_0.09', 'W_SO_0.27', 'W_SO_0.8099999999999999', 'W_SO_2.43', 'W_SO_7.290000000000001']
2025-11-11 08:22:52 ERROR 💣 Exiting

I hope this message helps to understand the situation. A bit of debugging revealed that SOILTYP_0.0 had time=6 but W_SO_0.0 had time=600. Likewise SOILTYP_0.0.step=0, but W_SO_0.0.step="0s".

> /shared/data/majacob/datasets/anemoi-transform/src/anemoi/transform/grouping/__init__.py(116)iterate()
    115
--> 116
    117
ipdb>  
ipdb> data.metadata("time")
[0, 0, 0, 0, 0, 0, 0, 0, 0]
ipdb> data.metadata("step")
['0s', '0s', '0s', '0s', '0s', '0s', '0s', '0s', 0]

The issue then propagates to the variable grouping around anemoi-transform/src/anemoi/transform/filters/matching.py(358) that compares lists of metadata including time and step.
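The effect can be reproduced with a minimal sketch. This is not the anemoi-transform grouping code; `group_by_key` and the record layout are hypothetical, and the metadata values are the ones reported above (SOILTYP_0.0 with time=6, W_SO_0.0 with time=600). The step values are assumed to be already aligned by the flavour.

```python
from collections import defaultdict

# Hypothetical, simplified metadata records using the values reported above.
fields = [
    {"param": "W_SO_0.0", "date": 20220101, "time": 600, "step": 0},
    {"param": "SOILTYP_0.0", "date": 20220101, "time": 6, "step": 0},
]


def group_by_key(records):
    """Group records by a (date, time, step) key compared by plain equality."""
    groups = defaultdict(dict)
    for record in records:
        key = (record["date"], record["time"], record["step"])
        groups[key][record["param"]] = record
    return dict(groups)


groups = group_by_key(fields)

# Both fields describe 2022-01-01T06:00:00, but 600 != 6, so they end up in
# two different groups and neither group contains every wanted param.
print(len(groups))  # 2
```

Because the keys are compared by plain equality, the two encodings of the same valid time never meet in one group, which matches the "Missing component" error above.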

So far, I've already solved the step inconsistency with a grib flavour. I could indeed also work around the time inconsistency with further flavours.

flavour:
  - - {step: "0s", time: 0}
    - {step: 0, time: 0}
  - - {step: "0s", time: 600}
    - {step: 0, time: 6}
  - - {step: "0s", time: 1200}
    - {step: 0, time: 12}
  - - {step: "0s", time: 1800}
    - {step: 0, time: 18}

I was wondering whether the format and unit of time are documented or specified somewhere, so that I can verify that my fix goes in the right direction.

When it comes to the steps, I think that in general the grouping should not simply compare string or int values, i.e. it should treat 0, "0s", "0h" and "0d" the same, as well as "3600s" and "1h".
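A minimal sketch of such a normalisation, converting every step to integer seconds before comparison. The helper `step_to_seconds` is hypothetical and not part of anemoi-transform; treating a bare integer as hours is an assumption.

```python
import re

# Seconds per supported unit suffix.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def step_to_seconds(step):
    """Convert a step such as 0, "0s", "1h" or "3600s" to integer seconds.

    Assumption: a value without a unit suffix is interpreted as hours.
    """
    match = re.fullmatch(r"(\d+)([smhd]?)", str(step).strip())
    if match is None:
        raise ValueError(f"Unrecognised step: {step!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit or "h"]


# All spellings of a zero step compare equal once normalised.
assert step_to_seconds(0) == step_to_seconds("0s") == step_to_seconds("0h") == step_to_seconds("0d")
# One hour can be written in seconds or in hours.
assert step_to_seconds("3600s") == step_to_seconds("1h")
```

Grouping on the normalised value instead of the raw metadata would make the step part of the flavour workaround unnecessary.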


def __init__(self, field: Any, valid_datetime: Any) -> None:
date = int(valid_datetime.date().strftime("%Y%m%d"))
assert valid_datetime.time().minute == 0, valid_datetime
Contributor

Could we still have an assert to check that minutes are equal to 0, or why do we want to support minutes != 0?

Contributor Author

I believe there are use cases for sub-hourly data, and support was added to anemoi-datasets a while ago. However, I'm not sure whether they are supposed to be combined with repeated-dates. So I thought this assert was an oversight.
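For illustration, a sketch of deriving date and time from valid_datetime without the assert. It assumes time is encoded as an HHMM integer, as the values 600, 1200 and 1800 on the GribField entries in the log suggest. The function name `date_and_time_keys` is hypothetical, and this is not necessarily the exact code in this PR.

```python
import datetime


def date_and_time_keys(valid_datetime):
    """Return (date, time) as integers: YYYYMMDD and HHMM (assumed encoding)."""
    date = int(valid_datetime.strftime("%Y%m%d"))
    # HHMM keeps the minutes, so sub-hourly datetimes need no assert.
    time = valid_datetime.hour * 100 + valid_datetime.minute
    return date, time


print(date_and_time_keys(datetime.datetime(2022, 1, 1, 6, 0)))   # (20220101, 600)
print(date_and_time_keys(datetime.datetime(2022, 1, 1, 6, 30)))  # (20220101, 630)
```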

@HCookie HCookie moved this to Reviewers needed in Anemoi-dev Nov 17, 2025
@MeraX MeraX requested a review from b8raoult November 27, 2025 16:01
