[FEAT] make ray optional and enable local backend execution (without ray)#127

Open

VincentG1234 wants to merge 2 commits into openshift-psap:main from VincentG1234:FEAT/ray-optional

Conversation


@VincentG1234 VincentG1234 commented Mar 13, 2026

Description

This PR makes Ray an optional dependency and adds support for a local execution backend, allowing users to run auto-tune-vllm on a single machine without a Ray cluster.

Changes

  • Move ray[default] from required to optional (pip install auto-tune-vllm[ray])

  • CLI: Add --backend local option alongside existing ray backend

  • Python environment options (--python-executable, --venv-path, --conda-env) now only required for Ray backend

  • max_concurrent_trials defaults to 1 for local backend (no longer required)

  • All Ray imports are now lazy/conditional - the tool works without Ray installed
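The lazy/conditional import pattern behind the last bullet can be sketched roughly like this (a minimal illustration; the names RAY_AVAILABLE and require_ray are assumptions for this sketch, not the PR's exact symbols):

```python
# Minimal sketch of an optional-dependency guard. RAY_AVAILABLE and
# require_ray are illustrative names, not necessarily those used in the PR.
try:
    import ray  # heavy optional dependency; may be absent
    RAY_AVAILABLE = True
except ImportError:
    ray = None
    RAY_AVAILABLE = False


def require_ray() -> None:
    """Raise a clear error when a Ray-only code path is reached without Ray."""
    if not RAY_AVAILABLE:
        raise ImportError(
            "The 'ray' backend requires Ray. "
            "Install it with: pip install auto-tune-vllm[ray]"
        )
```

Call sites for the Ray backend would invoke the guard before touching any Ray API, while the local backend never imports Ray at all.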

Usage

```shell
# Local execution (no Ray needed)
auto-tune-vllm optimize --backend local -c config.yaml

# Ray execution (unchanged)
auto-tune-vllm optimize --backend ray --venv-path ./venv -c config.yaml
```

Backward Compatibility

The default backend remains ray; existing commands continue to work, and users with Ray installed are unaffected. A clear error message is shown if the Ray backend is requested but Ray is not installed.
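The backend-specific validation described above might look roughly like the following (select_backend, its parameters, and the returned dict shape are assumptions for this sketch; only one of the three environment flags is modeled):

```python
from typing import Optional

# Illustrative backend-selection/validation logic. The function name and
# return shape are assumptions; the flag semantics follow the PR description.
def select_backend(backend: str,
                   venv_path: Optional[str] = None,
                   max_concurrent_trials: Optional[int] = None) -> dict:
    if backend == "local":
        # Local backend: Python environment options are not required and
        # max_concurrent_trials defaults to 1.
        return {"backend": "local",
                "max_concurrent_trials": max_concurrent_trials or 1}
    if backend == "ray":
        # Ray backend: an environment option is mandatory (here we check
        # only venv_path for brevity; the real CLI accepts three options).
        if venv_path is None:
            raise ValueError(
                "Ray backend requires a Python environment option "
                "(--python-executable, --venv-path, or --conda-env)"
            )
        return {"backend": "ray",
                "venv_path": venv_path,
                "max_concurrent_trials": max_concurrent_trials}
    raise ValueError(f"Unknown backend: {backend!r}")
```

This keeps existing `--backend ray` invocations valid while letting `--backend local` run with no extra flags.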

Summary by CodeRabbit

  • New Features

    • Added local execution backend as an alternative to Ray, enabling the tool to run without Ray installed
  • Improvements

    • Made Ray an optional dependency; install with the ray extra for distributed execution
    • Backend-specific validation now tailors configuration requirements based on selected execution backend


coderabbitai bot commented Mar 13, 2026

📝 Walkthrough


The PR makes Ray an optional dependency and introduces LocalExecutionBackend as an alternative execution backend. Changes include: refactoring Ray dependencies to be lazy-loaded with fallbacks, conditionally guarding Ray API calls throughout the codebase, restructuring CancellationFlag as a private class, adding local backend selection logic to the CLI, and migrating Ray from required to optional dependencies in the project manifest.

Changes

Cohort / File(s): Summary

  • CLI Backend Support (auto_tune_vllm/cli/main.py): Added LocalExecutionBackend alongside RayExecutionBackend with backend selection logic, expanded help text, implemented backend-specific validation (Ray requires Python environment options; the local backend allows omitting them), and adjusted max_concurrent_trials defaults for local execution.

  • Execution Backend Infrastructure (auto_tune_vllm/execution/backends.py): Replaced the public CancellationFlag with a private _CancellationFlag to decouple the Ray decorator dependency, introduced a lazy Ray import in RayExecutionBackend.__init__ with ImportError handling, and updated all Ray API calls to use the stored self._ray reference for consistent dependency management.

  • Trial Controller Ray Optionality (auto_tune_vllm/execution/trial_controller.py): Made Ray an optional import with a try/except fallback, conditionally defined RayTrialActor (set to None if Ray is unavailable), added guards before all Ray API calls, implemented fallback worker ID retrieval ("ray_worker_unknown"), and adjusted environment validation to register Ray as required only when available.

  • Dependency Configuration (pyproject.toml): Removed Ray from the main dependencies array and added a new [project.optional-dependencies] section exposing "ray[default]>=2.0.0" as an optional dependency.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Suggested reviewers

  • thameem-abbas
  • aas008
  • ephoris

Poem

🐰 A local path emerges, Ray takes a rest,
Optional now, no longer a guest—
Cancellation flags hide behind the scenes,
While backends dance between two extremes,
Hop forward with choice, the codebase renewed! 🌿✨

🚥 Pre-merge checks | ✅ 3 passed

  • Description check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title '[FEAT] make ray optional and enable local backend execution (without ray)' accurately describes the main changes: making Ray optional and adding local backend support. It is specific, clear, and directly reflects the primary objective of the changeset.
  • Docstring coverage: ✅ Passed. Docstring coverage is 91.67%, which meets the required threshold of 80.00%.



@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
auto_tune_vllm/execution/backends.py (1)

591-599: ⚠️ Potential issue | 🟡 Minor

cleanup_all_trials doesn't cancel running futures.

Unlike RayExecutionBackend, which has sophisticated cancellation logic, LocalExecutionBackend.cleanup_all_trials is a stub that does nothing. If a user needs to abort, long-running local trials won't be interrupted. Additionally, shutdown(wait=True) will block until all submitted work completes.

🔧 Suggested improvement
```diff
     def cleanup_all_trials(self):
-        """Cleanup all active trials (stub implementation for local backend)."""
-        logger.info("Local backend does not require explicit trial cleanup")
-        # Local backend doesn't need to do anything special here
-        # Individual trial controllers handle their own cleanup when they complete
+        """Cancel all active trials and clean up resources."""
+        if not self.active_futures:
+            logger.debug("No active trials to cleanup")
+            return
+
+        logger.info(f"Cleaning up {len(self.active_futures)} active local trial(s)")
+        for job_id, future in list(self.active_futures.items()):
+            if not future.done():
+                future.cancel()
+                logger.debug(f"Cancelled future {job_id}")
+        self.active_futures.clear()
+        logger.info("✓ Completed cleanup of all active local trials")

     def shutdown(self):
         """Shutdown thread pool executor."""
-        self.executor.shutdown(wait=True)
+        self.executor.shutdown(wait=False, cancel_futures=True)
+        logger.info("Shutdown local execution backend")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@auto_tune_vllm/execution/backends.py` around lines 591 - 599,
cleanup_all_trials currently is a no-op so long-running local trials aren't
interrupted; update LocalExecutionBackend.cleanup_all_trials to iterate over the
backend's active futures (track them if not already, e.g., the collection used
when submitting tasks) and call cancel() on each running Future and clear the
tracking collection, and then have shutdown call
self.executor.shutdown(wait=False) (or call shutdown after cancelling with an
optional timeout and then join) so the process does not block waiting for
cancelled/long-running tasks to finish; reference the methods
cleanup_all_trials, shutdown, self.executor and the internal active-futures
collection when making the change.
🧹 Nitpick comments (1)
auto_tune_vllm/execution/backends.py (1)

248-258: Consider removing redundant Ray re-import.

Lines 250-252 re-import Ray and reassign self._ray, but this is unnecessary since __init__ already imports and stores it. If __init__ succeeded, self._ray is guaranteed to be set.

♻️ Suggested simplification
```diff
     def submit_trial(self, trial_config: TrialConfig) -> JobHandle:
         """Submit trial to Ray cluster."""
-        import ray
-
-        self._ray = ray
+        ray = self._ray
         from .trial_controller import RayTrialActor
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@auto_tune_vllm/execution/backends.py` around lines 248 - 258, The
submit_trial method redundantly re-imports Ray and reassigns self._ray; remove
the local import and assignment (the lines importing ray and setting self._ray)
and instead use the ray reference stored on the instance from __init__; update
the creation of CancellationFlagActor to use self._ray.remote(_CancellationFlag)
(or simply use self._ray when calling remote) so symbols to change are
submit_trial, self._ray, CancellationFlagActor, and _CancellationFlag.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 38cc8570-5783-45a1-83d1-64c9116a5a5d

📥 Commits

Reviewing files that changed from the base of the PR and between acff360 and 7ca82f2.

📒 Files selected for processing (4)
  • auto_tune_vllm/cli/main.py
  • auto_tune_vllm/execution/backends.py
  • auto_tune_vllm/execution/trial_controller.py
  • pyproject.toml

Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>
Signed-off-by: Vincent Gimenes <vincent.gimenes@gmail.com>