Skip to content

Conversation

@d-v-b
Copy link
Contributor

@d-v-b d-v-b commented Nov 24, 2025

This is an extensive, but still first-order, refactorization of the optimized s2 multiscales conversion to use the new multiscales metadata.

A few key changes:

  • use stateless functions instead of OOP. The S2OptimizedConverter, S2MultiscalePyramid, and S2OptimizationValidator classes have been broken out into functions.
  • use xarray's built-in downsampling routines instead of hand-rolled downsampling functions. The old downsampling code is still there but we don't need it anymore.
  • Use local copies of sentinel1 and sentinel2 Zarr groups (with no chunks) for local testing. This makes the local tests much faster, and removes the need for accessing remote data.

@d-v-b d-v-b marked this pull request as draft November 24, 2025 11:06
@d-v-b d-v-b marked this pull request as ready for review November 24, 2025 11:43
@codecov-commenter
Copy link

codecov-commenter commented Nov 24, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 81.02410% with 126 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/eopf_geozarr/s2_optimization/s2_multiscale.py 78.16% 76 Missing ⚠️
src/eopf_geozarr/s2_optimization/s2_converter.py 76.82% 35 Missing ⚠️
src/eopf_geozarr/s2_optimization/s2_resampling.py 89.70% 7 Missing ⚠️
src/eopf_geozarr/conversion/geozarr.py 75.00% 4 Missing ⚠️
src/eopf_geozarr/data_api/geozarr/common.py 80.00% 4 Missing ⚠️

📢 Thoughts on this report? Let us know!

@d-v-b d-v-b requested a review from emmanuelmathot November 24, 2025 12:59
"scale_factor": 2**level,
"translation_relative": 0.0,
"scale_absolute": 1.0,
"scale_relative": 2**level,
Copy link

@vincentsarago vincentsarago Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we imaging other base scale factors than 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Added `initialize_crs_from_dataset` function to extract CRS from dataset metadata.
- Updated S2 optimization commands to include new CRS handling.
- Removed unused arguments related to geometry and meteorology groups.
- Added comprehensive tests for CRS initialization from various sources.
Copy link
Contributor

@emmanuelmathot emmanuelmathot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

besides my addings for a better support of CRS after CPM 2.6.0, this is great!
I validated with a test in titiler-eopf

Comment on lines +122 to +137
# verify exact output group structure
# this is a sensitive, brittle check
expected_structure_json = tuplify_json(
json.loads(
(
Path("tests/test_data_api/geozarr_examples/")
/ (s2_group_example.stem + ".json")
).read_text()
)

# Save as zarr
ds.to_zarr(test_input, zarr_format=3)
ds.close()

# Test CLI with --crs-groups but no groups specified
cmd = [
)
observed_structure_json = tuplify_json(
GroupSpec.from_zarr(zarr.open_group(output_path)).model_dump()
)
assert expected_structure_json == observed_structure_json, view_json_diff(
expected_structure_json, observed_structure_json
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emmanuelmathot this is an important check -- we are very strictly checking the structure of the zarr group against a reference JSON document. Any time we change the output format, this will fail, which is a sign that we need to update the JSON files that represent the expected Zarr group structure

return
multiscales: TMSMultiscalesAttrsJSON | MultiscalesAttrsJSON

if multiscales_flavor == "ogc_tms":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont you think we could simply have both multiscales flavor combined until we are aligned and #77 is implemented and integrated in titiler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I can make this kwarg a set of strings instead of as single string

@@ -0,0 +1,16098 @@
{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emmanuelmathot sorry for the number of lines here but this JSON document is used to check the output of the s2 optimized conversion. that means whenever we change the output of the conversion, we need to update this document. I think we can rely on better checks in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants