@jehangirawan

Summary

Fixes #1113

This PR fixes the NaN values that appear when Zarr data on a reduced Gaussian grid is converted to NetCDF.

Problem

The original code assumed a regular lat/lon grid and called unstack() to build rectangular dimensions. On a reduced Gaussian grid, where the number of longitude points varies with latitude, this produced a rectangular grid padded with NaNs at grid points that do not exist.
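The effect can be reproduced on a toy grid. The sketch below is not the converter's code: it emulates the unstack step with plain NumPy, and the row sizes and latitudes are made-up values chosen only to keep the numbers small.

```python
import numpy as np

# Toy reduced Gaussian grid (hypothetical values): latitude -> number of
# longitude points on that row. Rows near the poles carry fewer points.
points_per_row = {60.0: 4, 20.0: 8, -20.0: 8, -60.0: 4}

# Flat list of points, one entry per real grid cell.
lat = np.concatenate([np.full(n, la) for la, n in points_per_row.items()])
lon = np.concatenate([np.arange(n) * (360.0 / n) for n in points_per_row.values()])
values = np.arange(lat.size, dtype=float)

# Emulate unstacking onto a rectangular (lat, lon) grid: every unique
# latitude is paired with every unique longitude, real or not.
ulat, lat_idx = np.unique(lat, return_inverse=True)
ulon, lon_idx = np.unique(lon, return_inverse=True)
rect = np.full((ulat.size, ulon.size), np.nan)
rect[lat_idx, lon_idx] = values

n_nan = int(np.isnan(rect).sum())
print(f"{lat.size} real points, {rect.size} rectangular cells, {n_nan} NaNs")
# prints: 24 real points, 32 rectangular cells, 8 NaNs
```

On a real reduced Gaussian grid the padded fraction is much larger, because the longitude values of different rows rarely coincide.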

Solution

  • Add automatic grid type detection (regular vs. Gaussian)
  • Preserve Gaussian grids as unstructured grids with CF auxiliary coordinates
  • Use an ncells dimension with auxiliary lat/lon variables (CF-1.12 compliant)
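A minimal sketch of the approach listed above, assuming flat 1-D lat/lon arrays as input. The function names `detect_grid_type` and `to_unstructured`, the variable name `t2m`, and the detection heuristic itself are illustrative assumptions and may differ from what export_inference.py actually does.

```python
import numpy as np
import xarray as xr


def detect_grid_type(lat, lon):
    """Heuristic (an assumption, not necessarily the PR's logic):
    a grid is regular if its points fill the full lat x lon rectangle."""
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    n_rect = np.unique(lat).size * np.unique(lon).size
    return "regular" if lat.size == n_rect else "gaussian"


def to_unstructured(values, lat, lon, name="t2m"):
    """Keep the points on a 1-D ncells dimension and attach lat/lon as
    auxiliary (non-index) coordinates with CF-style attributes."""
    da = xr.DataArray(
        np.asarray(values),
        dims=("ncells",),
        coords={
            "lat": ("ncells", np.asarray(lat),
                    {"standard_name": "latitude", "units": "degrees_north"}),
            "lon": ("ncells", np.asarray(lon),
                    {"standard_name": "longitude", "units": "degrees_east"}),
        },
        name=name,  # hypothetical variable name
    )
    ds = da.to_dataset()
    ds.attrs["Conventions"] = "CF-1.12"
    return ds


# Toy reduced grid: 4, 8, 8 and 4 longitude points on four latitude rows.
lat = np.repeat([60.0, 20.0, -20.0, -60.0], [4, 8, 8, 4])
lon = np.concatenate([np.arange(n) * (360.0 / n) for n in (4, 8, 8, 4)])

grid_type = detect_grid_type(lat, lon)
ds = to_unstructured(np.arange(lat.size, dtype=float), lat, lon)
print(grid_type, dict(ds.sizes))
```

Because no rectangle is ever built, the dataset holds exactly one value per real grid point and there is nothing to pad with NaNs.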

Changes

  • packages/evaluate/src/weathergen/evaluate/export_inference.py: Core conversion logic

Testing

uv run ./packages/evaluate/src/weathergen/evaluate/export_inference.py \
  --run-id <RUN_ID> --stream ERA5 --output-dir ./netcdf_out \
  --format netcdf --type prediction --samples 1

Results:

  • No NaNs in output files
  • All grid points preserved correctly
  • CF-1.12 compliant output
  • Compatible with xarray, CDO, Panoply

Before: Most grid points showed as NaN or _
After: All valid data points preserved
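A spot check along the following lines can confirm the before/after difference. The helper name `nan_counts` is hypothetical, and the two datasets are tiny in-memory stand-ins rather than the exported files.

```python
import numpy as np
import xarray as xr


def nan_counts(ds):
    """Return {variable name: number of NaN values} for every data variable."""
    return {name: int(da.isnull().sum()) for name, da in ds.data_vars.items()}


# Rectangular layout with a hole, as produced before the fix (toy values).
before = xr.Dataset(
    {"t2m": (("lat", "lon"), np.array([[1.0, np.nan], [2.0, 3.0]]))}
)
# Unstructured layout holding only the real points.
after = xr.Dataset({"t2m": (("ncells",), np.array([1.0, 2.0, 3.0]))})

print(nan_counts(before))  # {'t2m': 1}
print(nan_counts(after))   # {'t2m': 0}
```

The same helper can be applied to a dataset opened from one of the exported NetCDF files, provided a NetCDF backend for xarray is installed.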

Note

This PR implements native Gaussian grid preservation as discussed in the issue.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a HedgeDoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

Dr. Jehangir Awan added 2 commits October 23, 2025 13:46
- Add automatic grid type detection (regular vs Gaussian)
- Preserve Gaussian grids using CF auxiliary coordinates (ncells dimension)

The converter now properly handles reduced Gaussian grids by keeping them as unstructured grids with CF auxiliary lat/lon coordinates, following CF-1.12 conventions.
Development

Successfully merging this pull request may close these issues.

Missing data (NaNs) in NetCDF after zarr_to_netcdf conversion
