wietzesuijker
Contributor

Add basic observability to compare runs and debug slowdowns.

What changed

  • Feat (CLI): Local Dask controls
    • --dask-mode {threads,processes,single-threaded}
    • --dask-workers, --dask-threads-per-worker
    • Optional --dask-perf-html <path> for Dask HTML performance report
  • Metrics: Always write to <output>/debug/ (on success and failure)
    • dask_run_summary.json (params + wall time)
    • dask_metrics.json and timestamped dask_metrics_<run_id>_attemptN.json
    • Fields: wall_clock_s, workers, threads_total, tasks_observed, tasks_per_sec, compute_time_s_sum, transfer_time_s_sum, memory_used_bytes, memory_limit_bytes, optional spilled_nbytes, dashboard_link.
    • Task timing parsed defensively across Dask versions.
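For reviewers, here is a minimal sketch of what "parsed defensively" could mean for the two timing sums. It assumes each task carries a `startstops` list whose entries are either dicts with `action`/`start`/`stop` keys or `(action, start, stop)` tuples, depending on the Dask version. The helper name `sum_task_times` is hypothetical and is not taken from this PR's code.

```python
from typing import Any, Iterable, Optional


def sum_task_times(startstops: Optional[Iterable[Any]]) -> dict[str, float]:
    """Sum compute and transfer durations, tolerating both record shapes.

    Assumption: entries are dicts ({"action", "start", "stop"}) or
    (action, start, stop) tuples. Anything else is skipped, not raised.
    """
    totals = {"compute_time_s_sum": 0.0, "transfer_time_s_sum": 0.0}
    for entry in startstops or ():
        try:
            if isinstance(entry, dict):
                action, start, stop = entry["action"], entry["start"], entry["stop"]
            else:
                action, start, stop = entry[:3]
            duration = float(stop) - float(start)
        except (KeyError, IndexError, TypeError, ValueError):
            # Malformed or unknown record shape: ignore it.
            continue
        if duration < 0:
            # A negative duration means inconsistent timestamps; drop it.
            continue
        if action == "compute":
            totals["compute_time_s_sum"] += duration
        elif action == "transfer":
            totals["transfer_time_s_sum"] += duration
    return totals
```

The point of the design is that a malformed record only costs one data point in the metrics file and never fails the conversion itself.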

Dependencies

  • Add: ipykernel, jupyter, bokeh (for testing, diagnostics, and UI). I should probably drop .ipynb support; it was useful for testing.

Compat

  • Defaults mirror prior behavior (threads, 4×1)
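To make the defaults concrete, here is a small argparse sketch of the flags listed above. It is an illustration only and not the parser in this PR. Reading "4×1" as four workers with one thread each is my assumption, and `build_parser` and the `convert-sketch` program name are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser mirroring the flags named in this PR description."""
    parser = argparse.ArgumentParser(prog="convert-sketch")
    parser.add_argument(
        "--dask-mode",
        choices=["threads", "processes", "single-threaded"],
        default="threads",  # stated default in the Compat note
    )
    # Assumption: "4×1" means 4 workers × 1 thread per worker.
    parser.add_argument("--dask-workers", type=int, default=4)
    parser.add_argument("--dask-threads-per-worker", type=int, default=1)
    # Optional path for the Dask HTML performance report; unset means no report.
    parser.add_argument("--dask-perf-html", metavar="PATH", default=None)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Running it with no arguments should show the threads / 4 / 1 defaults and no performance report path.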

Example

uv run eopf-geozarr convert --dask-cluster --dask-perf-html out/debug/dask-report-threads.html \
 "https://objectstore.eodc.eu:2222/e05ab01a9d56408d82ac32d69a5aae2a:202505-s02msil2a/18/products/cpm_v256/S2B_MSIL2A_20250518T112119_N0511_R037_T29RLL_20250518T140519.zarr" \
  ./S2B_MSIL2A_20250518_T29RLL_geozarr.zarr --verbose \
  --groups measurements/reflectance/r10m measurements/reflectance/r20m measurements/reflectance/r60m
# Inspect OUT_DIR/debug/{dask_run_summary.json,dask_metrics.json}

Somewhat worryingly, this example took four attempts the last time I tried (each attempt skips data already processed). The last attempt produced this dask_metrics.json:

{
  "status": "ok",
  "run_id": "20250820-145816",
  "attempt": 4,
  "dask_enabled": true,
  "mode": "threads",
  "wall_clock_s": 416.0143037919988,
  "workers": 0,
  "threads_total": 0,
  "tasks_observed": 260,
  "tasks_per_sec": 0.6249785106667781,
  "compute_time_s_sum": 1444.2037508487701,
  "transfer_time_s_sum": 0.023849010467529297,
  "memory_used_bytes": 0,
  "memory_limit_bytes": 0,
  "dashboard_link": "http://192.168.0.12:8787/status"
}
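A quick way to sanity-check a metrics file like this one is to recompute the derived numbers. The sketch below uses only the fields shown above. The `summarize` helper and its output keys are hypothetical and not part of this PR. Treating zero-valued fields as "possibly not captured" is my reading, not something the metrics format guarantees.

```python
import json

# Subset of the dask_metrics.json shown above.
METRICS = json.loads("""
{
  "wall_clock_s": 416.0143037919988,
  "workers": 0,
  "threads_total": 0,
  "tasks_observed": 260,
  "compute_time_s_sum": 1444.2037508487701,
  "transfer_time_s_sum": 0.023849010467529297,
  "memory_used_bytes": 0,
  "memory_limit_bytes": 0
}
""")


def summarize(metrics: dict) -> dict:
    """Recompute throughput and average compute concurrency from raw fields."""
    wall = float(metrics.get("wall_clock_s") or 0.0)
    compute = float(metrics.get("compute_time_s_sum") or 0.0)
    tasks = int(metrics.get("tasks_observed") or 0)
    checked = ("workers", "threads_total", "memory_used_bytes", "memory_limit_bytes")
    return {
        "tasks_per_sec": tasks / wall if wall > 0 else 0.0,
        # Summed task compute time divided by wall time: roughly how many
        # tasks were computing at once, averaged over the whole run.
        "avg_concurrent_compute": compute / wall if wall > 0 else 0.0,
        # Fields that are zero even though tasks ran may not have been captured.
        "zero_fields": [k for k in checked if metrics.get(k) == 0],
    }


if __name__ == "__main__":
    print(json.dumps(summarize(METRICS), indent=2))
```

For this run the recomputed tasks_per_sec matches the reported value, and compute time over wall time comes out a little under 3.5. The zero values for workers, threads_total and both memory fields stand out, given that 260 tasks were observed.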

The uv.lock diff looks kinda gross; I haven't used uv much, sorry about that.

wietzesuijker and others added 6 commits August 27, 2025 07:23
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>