[codex] fix(simple): cooperatively cancel specprefill workers#280

Draft
Thump604 wants to merge 2 commits into waybarrios:main from
Thump604:codex/simpleengine-specprefill-cancel-check

@Thump604
Collaborator

Summary

  • add cooperative cancellation to the SimpleEngine SpecPrefill worker path
  • thread a single cancel_check hook through _prefill_draft, score_tokens, and sparse_prefill
  • add regression coverage for scoring-time and sparse-prefill cancellation without changing the current main decode shape
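
The idea of threading one hook through several phases can be sketched roughly as follows. This is an illustration only, not the code in this PR: `cancel_check` is the hook name used above, but `PrefillCancelled`, `_raise_if_cancelled`, the `*_sketch` functions, their signatures, and the toy scoring rule are all hypothetical stand-ins.

```python
from typing import Callable, List, Optional

# Hypothetical type alias: a zero-argument callable that returns True
# once the request has been cancelled.
CancelCheck = Optional[Callable[[], bool]]


class PrefillCancelled(Exception):
    """Hypothetical exception raised when a worker notices cancellation."""


def _raise_if_cancelled(cancel_check: CancelCheck) -> None:
    # A missing hook means "never cancel", so existing callers keep working.
    if cancel_check is not None and cancel_check():
        raise PrefillCancelled()


def score_tokens_sketch(
    tokens: List[int], chunk_size: int = 4, cancel_check: CancelCheck = None
) -> List[float]:
    scores: List[float] = []
    for start in range(0, len(tokens), chunk_size):
        # Poll once per chunk so a long scoring loop can exit early.
        _raise_if_cancelled(cancel_check)
        chunk = tokens[start : start + chunk_size]
        scores.extend(float(t % 7) for t in chunk)  # stand-in for real scoring
    return scores


def sparse_prefill_sketch(
    tokens: List[int],
    scores: List[float],
    keep: int,
    cancel_check: CancelCheck = None,
) -> List[int]:
    # Keep the `keep` highest-scoring positions, in their original order.
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:keep]
    kept: List[int] = []
    for i in sorted(top):
        # The same hook is polled here, so both phases share one exit path.
        _raise_if_cancelled(cancel_check)
        kept.append(tokens[i])
    return kept
```

Making the hook optional and defaulting it to `None` is one way to leave the non-cancelling call path unchanged, which matches the stated goal of not altering the current decode shape.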

Why

_run_blocking_serialized() now keeps the async generation lock held until the worker thread actually finishes, but long-running SpecPrefill phases still need a cooperative exit path after request cancellation. Without that, cancelled SpecPrefill requests can keep burning CPU inside blocking scoring/prefill loops even though the request is already gone.
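
A minimal sketch of the interaction described above, assuming a `threading.Event` as the cancellation flag and the default executor for the worker thread. The names `blocking_worker` and `run_serialized_sketch` are hypothetical; the real `_run_blocking_serialized()` may be structured differently.

```python
import asyncio
import threading
import time
from typing import Callable


def blocking_worker(cancel_check: Callable[[], bool], steps: int = 1000) -> int:
    """Stand-in for a long scoring/prefill loop running in a worker thread."""
    done = 0
    for _ in range(steps):
        if cancel_check():
            break  # cooperative exit: stop burning CPU once cancelled
        time.sleep(0.005)  # stand-in for one unit of blocking work
        done += 1
    return done


async def run_serialized_sketch(
    lock: asyncio.Lock, cancelled: threading.Event, steps: int = 1000
) -> int:
    async with lock:
        loop = asyncio.get_running_loop()
        future = loop.run_in_executor(
            None, blocking_worker, cancelled.is_set, steps
        )
        try:
            # shield() keeps the executor future alive if this task is cancelled.
            return await asyncio.shield(future)
        except asyncio.CancelledError:
            # Tell the worker to stop, then wait for it to really finish so
            # the lock is not released while the thread is still running.
            cancelled.set()
            await future
            raise
```

Without the `cancelled.set()` call, the `await future` in the handler would hold the lock for the full duration of the loop, which is the wasted work the cooperative hook is meant to avoid.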

Validation

  • black --fast --check vllm_mlx/engine/simple.py vllm_mlx/specprefill.py tests/test_simple_engine.py
  • pytest -q tests/test_simple_engine.py
