You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This document lists enhancements observed while integrating `eopf-geozarr` into an Argo-based batch conversion pipeline (data-model-pipeline). They are candidates for upstream inclusion or API refinement.
4
+
5
+
## 1. Output Prefix Expansion
6
+
**Current (pipeline)**: Wrapper detects `output_zarr` ending with `/` and appends `<item_id>_geozarr.zarr` derived from the input STAC/Zarr URL.
7
+
**Recommendation**: Support `--output-prefix` OR accept a trailing slash on positional `output_path` and perform the expansion internally (emitting the resolved final path). Add a log line: `Resolved output store: s3://.../S2A_..._geozarr.zarr`.
**Recommendation**: Native CLI flag `--validate-groups` to prune or fail fast when groups don’t exist. Modes:
12
+
-`--validate-groups=warn` (default): drop missing, report.
13
+
-`--validate-groups=error`: abort if any missing.
14
+
Emit JSON or structured summary when `--verbose`.
15
+
16
+
## 3. Profiles (WOZ Profiles)
17
+
**Current (pipeline)**: External JSON profile expansion before calling CLI.
18
+
**Recommendation**: Provide `--profile <name>` in CLI mapping to preset groups + chunk params. Add `eopf-geozarr profile list` / `profile show <name>` subcommands. Keep external mechanism as fallback.
19
+
20
+
## 4. Compressor Handling
21
+
**Current**: Template attempted to pass `--compressor`; CLI does not expose codec choice.
22
+
**Recommendation**: If codec selection is desired, add `--compressor <name>` now (zstd, lz4, blosc) with validation; else document fixed default explicitly to avoid confusion.
23
+
24
+
## 5. CRS Groups Convenience
25
+
**Current**: `--crs-groups` optional list.
26
+
**Recommendation**: Discover candidate groups automatically (search for geometry-like datasets) unless `--crs-groups` provided (override). Provide `eopf-geozarr info --crs-scan` to preview.
27
+
28
+
## 6. Dask Cluster Ergonomics
29
+
**Current**: `--dask-cluster` toggles local cluster with no feedback.
30
+
**Recommendation**: Print cluster dashboard URL (if available) and add `--dask-workers N` for quick scaling.
31
+
32
+
## 7. Structured Logging / Run Metadata
33
+
**Current**: Plain prints; pipeline scrapes logs.
34
+
**Recommendation**: Optional `--run-metadata <path.json>` to write machine-readable summary: inputs, resolved groups, timings, warnings. Eases automation and reproducibility.
35
+
36
+
## 8. Validation Command Enhancements
37
+
**Current**: `validate` skeleton present but incomplete.
0 commit comments