feat: log convert metrics to benchmark local runs #26

wietzesuijker · 2025-08-21T18:17:50Z

Add basic observability to compare runs and debug slowdowns.

What changed

Feat (CLI): Local Dask controls
- --dask-mode {threads,processes,single-threaded}
- --dask-workers, --dask-threads-per-worker
- Optional --dask-perf-html <path> for Dask HTML performance report
Metrics: Always write to <output>/debug/ (on success and failure)
- dask_run_summary.json (params + wall time)
- dask_metrics.json and timestamped dask_metrics_<run_id>_attemptN.json
- Fields: wall_clock_s, workers, threads_total, tasks_observed, tasks_per_sec, compute_time_s_sum, transfer_time_s_sum, memory_used_bytes, memory_limit_bytes, optional spilled_nbytes, dashboard_link.
- Task timing parsed defensively across Dask versions.
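
A minimal sketch of the "always write" behaviour and the defensive timing parse, not the code in this PR. It assumes task timing records may arrive either as dicts or as `(action, start, stop)` tuples depending on the Dask version. The names `sum_task_times`, `run_with_metrics`, and `work` are hypothetical, as is the `"error"` status value (the PR only shows `"ok"`).

```python
import json
import time
from pathlib import Path


def sum_task_times(startstops):
    """Sum compute/transfer seconds from records that may be dicts or tuples."""
    totals = {"compute": 0.0, "transfer": 0.0}
    for rec in startstops or []:
        if isinstance(rec, dict):
            action, start, stop = rec.get("action"), rec.get("start"), rec.get("stop")
        elif isinstance(rec, (tuple, list)) and len(rec) >= 3:
            action, start, stop = rec[0], rec[1], rec[2]
        else:
            continue  # unknown record shape: skip it rather than fail the run
        numeric = isinstance(start, (int, float)) and isinstance(stop, (int, float))
        if action in totals and numeric:
            totals[action] += max(0.0, stop - start)
    return totals


def run_with_metrics(output_dir, run_id, attempt, work):
    """Run work() and write metrics to <output_dir>/debug/ whether or not it raises."""
    debug_dir = Path(output_dir) / "debug"
    debug_dir.mkdir(parents=True, exist_ok=True)
    # Start pessimistic so a crash inside work() is still recorded as a failure.
    metrics = {"status": "error", "run_id": run_id, "attempt": attempt}
    t0 = time.perf_counter()
    try:
        times = sum_task_times(work())  # work() is assumed to return timing records
        metrics.update(
            status="ok",
            compute_time_s_sum=times["compute"],
            transfer_time_s_sum=times["transfer"],
        )
    finally:
        metrics["wall_clock_s"] = time.perf_counter() - t0
        payload = json.dumps(metrics, indent=2)
        # Latest result plus a per-attempt copy, mirroring the file names above.
        (debug_dir / "dask_metrics.json").write_text(payload)
        (debug_dir / f"dask_metrics_{run_id}_attempt{attempt}.json").write_text(payload)
    return metrics
```

Writing inside `finally` means a failed attempt still leaves a metrics file behind, which is what makes retries comparable.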

Dependencies

Add: ipykernel, jupyter, bokeh (for testing, diagnostics, and UI). I should probably ditch .ipynb support, although it was useful for testing.

Compat

Defaults mirror prior behavior (threads, 4×1)
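
For illustration, the flags and defaults could be wired roughly as below. This is a sketch, not the CLI code in this PR. It assumes "4×1" means 4 workers with 1 thread each, and that `single-threaded` maps to one worker with one thread. Both are assumptions, as are the names `build_parser` and `local_cluster_kwargs`. The returned keys match keyword arguments accepted by `distributed.LocalCluster`.

```python
import argparse


def build_parser():
    """Parser exposing the Dask flags listed above, with the assumed defaults."""
    parser = argparse.ArgumentParser(prog="convert-sketch")
    parser.add_argument(
        "--dask-mode",
        choices=["threads", "processes", "single-threaded"],
        default="threads",
    )
    parser.add_argument("--dask-workers", type=int, default=4)
    parser.add_argument("--dask-threads-per-worker", type=int, default=1)
    parser.add_argument("--dask-perf-html", default=None, metavar="PATH")
    return parser


def local_cluster_kwargs(args):
    """Translate parsed flags into keyword arguments for distributed.LocalCluster."""
    if args.dask_mode == "single-threaded":
        # Assumption: single-threaded means one in-process worker with one thread.
        return {"n_workers": 1, "threads_per_worker": 1, "processes": False}
    return {
        "n_workers": args.dask_workers,
        "threads_per_worker": args.dask_threads_per_worker,
        "processes": args.dask_mode == "processes",
    }


if __name__ == "__main__":
    print(local_cluster_kwargs(build_parser().parse_args()))
```

With no flags given, the result is the threads-based 4×1 layout, so existing invocations would behave as before.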

Example

uv run eopf-geozarr convert --dask-cluster --dask-perf-html out/debug/dask-report-threads.html \
  "https://objectstore.eodc.eu:2222/e05ab01a9d56408d82ac32d69a5aae2a:202505-s02msil2a/18/products/cpm_v256/S2B_MSIL2A_20250518T112119_N0511_R037_T29RLL_20250518T140519.zarr" \
  ./S2B_MSIL2A_20250518_T29RLL_geozarr.zarr --verbose \
  --groups measurements/reflectance/r10m measurements/reflectance/r20m measurements/reflectance/r60m
# Inspect OUT_DIR/debug/{dask_run_summary.json,dask_metrics.json}

Somewhat worryingly, this example took four attempts the last time I tried (each retry skips data already processed). The last attempt produced this dask_metrics.json:

{
  "status": "ok",
  "run_id": "20250820-145816",
  "attempt": 4,
  "dask_enabled": true,
  "mode": "threads",
  "wall_clock_s": 416.0143037919988,
  "workers": 0,
  "threads_total": 0,
  "tasks_observed": 260,
  "tasks_per_sec": 0.6249785106667781,
  "compute_time_s_sum": 1444.2037508487701,
  "transfer_time_s_sum": 0.023849010467529297,
  "memory_used_bytes": 0,
  "memory_limit_bytes": 0,
  "dashboard_link": "http://192.168.0.12:8787/status"
}
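
A quick sanity check on these numbers, as a sketch: `tasks_per_sec` reproduces as `tasks_observed / wall_clock_s`, and dividing summed compute time by wall time gives a rough parallelism figure of about 3.5. The `derive` function and the `effective_parallelism` name are hypothetical and not part of the PR. The check also lists the worker and memory fields that came back as 0, which may mean worker info was not captured for this run.

```python
import json

# Subset of the dask_metrics.json shown above.
METRICS_JSON = """
{
  "wall_clock_s": 416.0143037919988,
  "workers": 0,
  "threads_total": 0,
  "tasks_observed": 260,
  "tasks_per_sec": 0.6249785106667781,
  "compute_time_s_sum": 1444.2037508487701,
  "memory_used_bytes": 0,
  "memory_limit_bytes": 0
}
"""


def derive(metrics):
    """Recompute rates from raw fields and list resource fields reported as 0."""
    wall = metrics["wall_clock_s"]
    resource_fields = ("workers", "threads_total", "memory_used_bytes", "memory_limit_bytes")
    return {
        "tasks_per_sec": metrics["tasks_observed"] / wall if wall > 0 else 0.0,
        # Summed task compute time over wall time: a rough measure of concurrency.
        "effective_parallelism": metrics["compute_time_s_sum"] / wall if wall > 0 else 0.0,
        "zero_fields": sorted(k for k in resource_fields if metrics.get(k) == 0),
    }


metrics = json.loads(METRICS_JSON)
derived = derive(metrics)
print(json.dumps(derived, indent=2))
```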

The uv.lock diff looks kinda gross; I haven't used uv much, sorry for that.

Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

feat: log convert metrics to benchmark local runs

401c032

wietzesuijker mentioned this pull request Aug 25, 2025

Feat/initial workflow EOPF-Explorer/data-model-pipeline#1

Merged

wietzesuijker and others added 6 commits August 27, 2025 07:23

feat: log convert metrics to benchmark local runs

2a4c308

feat: intermediate state of workflow

35e46e7

feat: intermediate state of workflow 2

9df2b6d

wip

71f4e39

This was referenced Sep 15, 2025

metrics(v1): structured conversion metrics + CLI flags wietzesuijker/data-model#1

Open

conversion refactor with resumable multiscales + JSON metrics #29

Open
