[codex] Avoid retaining Prefect tasks in Dask scheduler #21703

Draft
zzstoatzz wants to merge 1 commit into main from codex/fix-dask-scheduler-retention

Conversation

@zzstoatzz
Collaborator

closes #21697

this PR avoids retaining heavyweight Prefect task objects in Dask scheduler task specs when prefect-dask submits Prefect tasks to an existing Dask scheduler.

Details

The repro showed Dask scheduler RSS growing roughly linearly across repeated flow runs. Scheduler introspection showed no active Dask tasks or task groups after completion, but historical Dask task specs/messages still referenced Prefect Task objects through the submitted task payloads.

This changes Prefect Dask submission so that:

  • Prefect tasks execute through a module-level _run_prefect_task function instead of a per-submit wrapper closure.
  • Prefect Task objects are scattered once per client and passed to Dask tasks by future instead of embedded directly in each task spec.
  • Serialized per-run context is scattered and released when the submitted task future completes.
  • Generated Dask keys still preserve the Prefect task name prefix.
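
For readers unfamiliar with the retention pattern, here is a toy, stdlib-only sketch of the idea behind the list above. It is not the code in this PR and does not use Dask: `FakeTask`, `ToyScheduler`, `run_task`, and `demo` are hypothetical names, and the "scheduler" is just a list that never forgets submitted specs. It only illustrates why a spec that embeds the task object keeps it alive, while a spec that carries a key to a scattered (and later released) object does not.

```python
import uuid
import weakref


class FakeTask:
    """Hypothetical stand-in for a heavyweight task object."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn


def run_task(task, arg):
    """Module-level runner: it captures nothing, unlike a per-submit closure."""
    return task.fn(arg)


class ToyScheduler:
    """Toy model: remembers every submitted spec and never prunes them."""

    def __init__(self):
        self.scattered = {}  # objects stored once, looked up by key
        self.history = []    # retained task specs

    def scatter(self, obj):
        data_key = f"scattered-{uuid.uuid4().hex}"
        self.scattered[data_key] = obj
        return data_key

    def release(self, data_key):
        self.scattered.pop(data_key, None)

    def submit_embedded(self, task, arg):
        # The spec holds the task object itself, so history keeps it alive.
        key = f"{task.name}-{uuid.uuid4().hex}"
        self.history.append((key, run_task, task, arg))
        return run_task(task, arg)

    def submit_by_reference(self, name, data_key, arg):
        # The spec holds only a string key; the key keeps the task-name prefix.
        key = f"{name}-{uuid.uuid4().hex}"
        self.history.append((key, run_task, data_key, arg))
        return run_task(self.scattered[data_key], arg)


def double(x):
    return 2 * x


def demo():
    """Return (embedded task still alive?, referenced task still alive?)."""
    embedded = ToyScheduler()
    task = FakeTask("double", double)
    embedded_ref = weakref.ref(task)
    embedded.submit_embedded(task, 21)
    del task

    by_ref = ToyScheduler()
    task = FakeTask("double", double)
    by_ref_ref = weakref.ref(task)
    data_key = by_ref.scatter(task)
    by_ref.submit_by_reference("double", data_key, 21)
    by_ref.release(data_key)  # drop the scattered object once the work is done
    del task

    # Relies on CPython reference counting to free objects promptly.
    return embedded_ref() is not None, by_ref_ref() is not None
```

In this toy, both schedulers still hold one lightweight spec per submission, which mirrors the observation that some small per-task state remains while the heavy object is no longer reachable from it.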

In the issue repro, the patched wheel changed scheduler RSS from repeatedly growing by about 80MiB per run to mostly flattening after the first run. Scheduler introspection after the patched runs still showed retained lightweight Dask task specs, but no retained Prefect Task objects reachable from those specs.

Added a regression test that runs a mapped Prefect task on an existing Dask cluster, then inspects retained dask._task_spec.Task objects on the scheduler to assert they do not reference Prefect Task objects.
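
The kind of check that test performs can be sketched as a reachability walk over retained objects. This is an assumption-laden illustration, not the test added in this PR: `reaches_instance`, `FakePrefectTask`, `FakeSpec`, and `make_wrapper` are hypothetical names, and the walk uses only `gc.get_referents` from the standard library. Function globals, classes, and modules are deliberately not traversed so the walk stays inside the spec's own payload, while closure cells are followed so a per-submit wrapper closure would still be caught.

```python
import gc
import types


def _children(obj):
    """Objects directly referenced by obj, without wandering into globals."""
    if isinstance(obj, (type, types.ModuleType)):
        return []
    if isinstance(obj, types.FunctionType):
        # Follow closures and defaults, but not the function's module globals.
        parts = (obj.__closure__, obj.__defaults__, obj.__kwdefaults__)
        return [part for part in parts if part is not None]
    return gc.get_referents(obj)


def reaches_instance(roots, target_type):
    """Return True if any instance of target_type is reachable from roots."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if isinstance(obj, target_type):
            return True
        stack.extend(_children(obj))
    return False


class FakePrefectTask:
    """Hypothetical stand-in for a heavyweight task object."""


class FakeSpec:
    """Hypothetical stand-in for a retained scheduler task spec."""

    def __init__(self, *args):
        self.args = args


def make_wrapper(task):
    """Per-submit closure that captures the task, the pattern to avoid."""
    def wrapper():
        return task
    return wrapper


leaky_specs = [FakeSpec("key-1", {"nested": [FakePrefectTask()]})]
closure_specs = [FakeSpec(make_wrapper(FakePrefectTask()))]
clean_specs = [FakeSpec("key-2", {"nested": ["scattered-abc"]})]
```

A regression assertion in this style would then require `reaches_instance(clean_specs, FakePrefectTask)` to be false for the specs collected from the scheduler.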

Validation:

uv run --project ./src/integrations/prefect-dask pytest src/integrations/prefect-dask/tests/test_client.py src/integrations/prefect-dask/tests/test_task_runners.py::TestDaskTaskRunner::test_dask_task_key_has_prefect_task_name src/integrations/prefect-dask/tests/test_task_runners.py::TestDaskTaskRunner::test_scheduler_does_not_retain_prefect_tasks
uv run --project ./src/integrations/prefect-dask ruff check src/integrations/prefect-dask/prefect_dask/client.py src/integrations/prefect-dask/tests/test_task_runners.py

@github-actions github-actions Bot added the bug Something isn't working label Apr 24, 2026

Development

Successfully merging this pull request may close these issues.

Prefect causes memory leak in Dask scheduler when connecting to an existing scheduler
