Missing data (NaNs) in NetCDF after zarr_to_netcdf conversion #1113

@jehangirawan

Summary

When WG output is converted from Zarr to NetCDF, most grid points are written as missing values (shown as NaN or _). Variables such as t2m are affected, and the resulting files are mostly empty.

Observed Behavior

Most grid points are missing (NaN or _), with only a few valid values.
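
To put a number on "most grid points are missing", a small check like the one below can help. It is only a sketch: `missing_fraction` is a hypothetical helper, not part of the conversion script, and the array is a synthetic stand-in for a t2m field loaded from the exported file. The optional `fill_value` argument covers the case where missing points are stored as a fill value rather than NaN.

```python
import numpy as np


def missing_fraction(values, fill_value=None):
    """Return the fraction of entries that are NaN or equal to fill_value.

    Hypothetical helper for illustration; not part of export_inference.py.
    """
    values = np.asarray(values, dtype=float)
    missing = np.isnan(values)
    if fill_value is not None:
        # Also count entries stored as an explicit fill value.
        missing |= values == fill_value
    return float(missing.mean())


# Synthetic stand-in for a converted t2m field: 4 x 8 grid, 3 valid points.
t2m = np.full((4, 8), np.nan)
t2m[0, 0], t2m[1, 3], t2m[3, 7] = 280.0, 285.5, 290.1

frac = missing_fraction(t2m)
print(f"{frac:.1%} of grid points are missing")
```

Running the same check on each exported variable would show whether the problem is limited to t2m or affects the whole file.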

Steps to Reproduce

Run the conversion script with a run ID, selecting sample 1 of the ERA5 stream:

uv run ./packages/evaluate/src/weathergen/evaluate/export_inference.py --run-id <RUN_ID> --stream ERA5 --output-dir ./netcdf_out --format netcdf --type prediction --sample 1

Expected Behavior

After conversion, every grid point should hold a valid value, either carried over directly from the source or properly interpolated onto the output grid.

Possible Cause / Notes

The source data is on a reduced Gaussian grid, and the conversion may not handle this grid properly. Regridding or interpolation may be required.
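
The suspected mechanism can be illustrated with a toy example. This is a sketch under stated assumptions, not the logic of export_inference.py: the grid below is a tiny made-up "reduced" grid (fewer longitude points in the rows nearer the poles), the field is synthetic, and nearest-neighbour lookup stands in for whatever regridding method is eventually chosen. Placing points on a regular lat/lon grid by exact coordinate match leaves gaps, while a nearest-neighbour lookup fills every cell.

```python
import numpy as np

# Toy reduced grid (assumed for illustration): fewer longitudes near the poles.
row_lats = np.array([60.0, 20.0, -20.0, -60.0])
n_lon_per_row = [4, 8, 8, 4]

src_lat = np.concatenate([np.full(n, lat) for lat, n in zip(row_lats, n_lon_per_row)])
src_lon = np.concatenate([np.arange(n) * 360.0 / n for n in n_lon_per_row])
src_val = 250.0 + 40.0 * np.cos(np.deg2rad(src_lat))  # synthetic t2m-like field

# Target: regular grid with 8 longitudes in every row.
tgt_lats = row_lats
tgt_lons = np.arange(8) * 45.0

# Naive conversion: copy a value only where coordinates match exactly.
naive = np.full((tgt_lats.size, tgt_lons.size), np.nan)
for lat, lon, val in zip(src_lat, src_lon, src_val):
    i = np.flatnonzero(tgt_lats == lat)
    j = np.flatnonzero(tgt_lons == lon)
    if i.size and j.size:
        naive[i[0], j[0]] = val


def to_unit_vectors(lat_deg, lon_deg):
    """Convert lat/lon in degrees to 3D unit vectors on the sphere."""
    lat, lon = np.deg2rad(lat_deg), np.deg2rad(lon_deg)
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )


# Nearest-neighbour regridding: the largest dot product between unit vectors
# corresponds to the smallest great-circle distance.
grid_lat, grid_lon = np.meshgrid(tgt_lats, tgt_lons, indexing="ij")
tgt_xyz = to_unit_vectors(grid_lat.ravel(), grid_lon.ravel())
src_xyz = to_unit_vectors(src_lat, src_lon)
nearest = np.argmax(tgt_xyz @ src_xyz.T, axis=1)
regridded = src_val[nearest].reshape(grid_lat.shape)

print("NaNs with naive placement:", int(np.isnan(naive).sum()))
print("NaNs after nearest-neighbour regridding:", int(np.isnan(regridded).sum()))
```

On a real reduced Gaussian grid the mismatch with a regular grid is much larger than in this toy case, which would be consistent with files that are mostly empty. The brute-force distance matrix is only practical for small grids; a spatial index or a dedicated regridding tool would be the more realistic choice at ERA5 resolution.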

Metadata

Labels: eval (anything related to the model evaluation pipeline)

Status: In Progress