Limit inode consumption of output data #394

@ankitpatnala

Description

Is your feature request related to a problem? Please describe.

For IMERG data, the grid used is a finer-resolution reduced Gaussian grid with many more grid points than ERA5. Right now, target_times, which repeats for all grid points, is concatenated across the grid. Operating on the resulting array becomes difficult, e.g. when we generate a zarr file containing inference samples for one month at 6-hour timestamps.
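To make the scale concrete, here is a back-of-the-envelope calculation. All numbers are assumptions for illustration (including the CHUNK_N_SAMPLES placeholder), not actual IMERG figures:

```python
# Illustrative only: every value here is an assumption, including
# CHUNK_N_SAMPLES, which stands in for the real utils.io default.
CHUNK_N_SAMPLES = 10_000

n_grid_points = 3_000_000  # assumed order of magnitude for a fine reduced Gaussian grid
n_timestamps = 31 * 4      # one month at 6-hour steps
n_samples = n_grid_points * n_timestamps  # target_times repeated for every grid point

# Each chunk is written as (at least) one file on disk, i.e. one inode.
n_chunks = -(-n_samples // CHUNK_N_SAMPLES)  # ceiling division
print(f"{n_samples:,} samples -> {n_chunks:,} chunk files")
# 372,000,000 samples -> 37,200 chunk files, per array and per month
```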

Describe the solution you'd like

Currently, the chunk size can be changed via the module-level variable utils.io.CHUNK_N_SAMPLES. To cope better with streams as described above, some logic should be implemented to keep the number of chunks reasonable. Two possible solutions (sketched in code after this list) could be:

A) Define the number of chunks as a constant instead of the size of a chunk => scale the chunk size per stream accordingly
B) Add an optional parameter chunk_size to stream config files: if present, it is passed to the relevant method (utils.io.ZarrIO._write_arrays); otherwise utils.io.CHUNK_N_SAMPLES is used
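A minimal sketch of how the two options could fit together. Apart from utils.io.CHUNK_N_SAMPLES and ZarrIO._write_arrays, every name and value below (N_CHUNKS, resolve_chunk_size, the defaults) is hypothetical:

```python
# Sketch only; N_CHUNKS, resolve_chunk_size, and all values are hypothetical.
import math

N_CHUNKS = 256            # option A: fixed chunk count per stream (illustrative value)
CHUNK_N_SAMPLES = 10_000  # placeholder for the existing utils.io default

def resolve_chunk_size(n_samples: int, stream_chunk_size: int | None = None) -> int:
    """Pick the chunk size that _write_arrays would use for one stream."""
    # Option B: an explicit chunk_size from the stream config takes precedence.
    if stream_chunk_size is not None:
        return stream_chunk_size
    # Option A: scale the chunk size so the stream is split into at most
    # N_CHUNKS chunks, never dropping below the current default.
    return max(CHUNK_N_SAMPLES, math.ceil(n_samples / N_CHUNKS))
```

With the numbers from the example above, a 372M-sample stream would get a chunk size of about 1.45M samples and produce 256 chunk files instead of 37,200.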

Describe alternatives you've considered

Using Zarr 3 is currently not possible due to an incompatibility with anemoi-dataset. However, once this is fixed, Zarr 3 should be used in addition to the ability to specify a per-stream chunk size.
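Zarr 3 is relevant here because its sharding support packs many chunks into a single file, which directly reduces inode consumption. A minimal sketch against the zarr-python >= 3 API; the store path, shapes, and values are illustrative:

```python
# Sketch assuming zarr-python >= 3; all shapes and values are illustrative.
import zarr

n_samples = 372_000_000  # see the calculation above

arr = zarr.create_array(
    store="inference_samples.zarr",
    shape=(n_samples,),
    chunks=(10_000,),      # read/write granularity stays small
    shards=(10_000_000,),  # 1,000 chunks per shard => ~1,000x fewer files/inodes
    dtype="float32",
)
```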

Additional context

Option A) would add less complexity to the code/configuration and is easier to implement. Option B) would provide more control to the user. As this is ultimately more of a temporary fix to limit inode consumption until the capabilities of Zarr 3 can be used, I would argue option A) should be the preferred solution.

Organisation

JSC

Metadata


Labels

evaluation (anything related to the model evaluation pipeline)
