Add pre build wheel install for better development experience#188
Add pre build wheel install for better development experience#188jikunshang wants to merge 3 commits intovllm-project:mainfrom
Conversation
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
There was a problem hiding this comment.
Pull request overview
This PR adds support for using pre-built wheels during development installation of vllm-xpu-kernels, allowing developers to skip the compilation step by either specifying a local wheel path or having the setup script download one automatically.
Changes:
- Adds
VLLM_USE_PRECOMPILEDandVLLM_PRECOMPILED_WHEEL_LOCATIONenvironment variable support intools/envs.py - Adds
precompiled_wheel_utilsclass andprecompiled_build_extclass tosetup.pyto extract.sofiles from an existing wheel and bypass CMake builds - Moves
ext_modulesandcmdclassinitialization after the precompiled logic block
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tools/envs.py | Defines VLLM_USE_PRECOMPILED env var, auto-enabling it when VLLM_PRECOMPILED_WHEEL_LOCATION is set |
| setup.py | Adds utilities and logic for extracting precompiled binaries from a wheel and skipping the CMake build |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| wheel_filename = wheel_url_or_path.split("/")[-1] | ||
| temp_dir = tempfile.mkdtemp(prefix="vllm-wheels") | ||
| wheel_path = os.path.join(temp_dir, wheel_filename) | ||
| print(f"Downloading wheel from {wheel_url_or_path}" |
There was a problem hiding this comment.
Missing space between the two f-strings — the output will read "...from to " without a space separating the two parts.
| print(f"Downloading wheel from {wheel_url_or_path}" | |
| print(f"Downloading wheel from {wheel_url_or_path} " |
| print(f"version: get_vllm_version()={get_vllm_version()}") | ||
| wheel_location = os.getenv( | ||
| "VLLM_PRECOMPILED_WHEEL_LOCATION", | ||
| f"./vllm_xpu_kernels-{get_vllm_version()}-cp312-cp312-linux_x86_64.whl" |
There was a problem hiding this comment.
This debug print statement calls get_vllm_version() a second time (it's called again on line 430), which may trigger side effects (e.g., writing _version.py). Consider capturing the result in a variable and printing it instead.
| print(f"version: get_vllm_version()={get_vllm_version()}") | |
| wheel_location = os.getenv( | |
| "VLLM_PRECOMPILED_WHEEL_LOCATION", | |
| f"./vllm_xpu_kernels-{get_vllm_version()}-cp312-cp312-linux_x86_64.whl" | |
| vllm_version = get_vllm_version() | |
| print(f"version: get_vllm_version()={vllm_version}") | |
| wheel_location = os.getenv( | |
| "VLLM_PRECOMPILED_WHEEL_LOCATION", | |
| f"./vllm_xpu_kernels-{vllm_version}-cp312-cp312-linux_x86_64.whl" |
| wheel_location = os.getenv( | ||
| "VLLM_PRECOMPILED_WHEEL_LOCATION", | ||
| f"./vllm_xpu_kernels-{get_vllm_version()}-cp312-cp312-linux_x86_64.whl" | ||
| ) | ||
| if wheel_location is not None: |
There was a problem hiding this comment.
os.getenv() with a default value never returns None — when VLLM_PRECOMPILED_WHEEL_LOCATION is not set, it returns the hardcoded default path instead of None. This means the else branch (lines 435–456) that handles downloading from a remote URL is dead code and will never execute.
| wheel_location = os.getenv( | |
| "VLLM_PRECOMPILED_WHEEL_LOCATION", | |
| f"./vllm_xpu_kernels-{get_vllm_version()}-cp312-cp312-linux_x86_64.whl" | |
| ) | |
| if wheel_location is not None: | |
| wheel_location = os.getenv("VLLM_PRECOMPILED_WHEEL_LOCATION") | |
| if wheel_location: |
| print(f"version: get_vllm_version()={get_vllm_version()}") | ||
| wheel_location = os.getenv( | ||
| "VLLM_PRECOMPILED_WHEEL_LOCATION", | ||
| f"./vllm_xpu_kernels-{get_vllm_version()}-cp312-cp312-linux_x86_64.whl" |
There was a problem hiding this comment.
The Python version tag cp312-cp312 is hardcoded in the default wheel filename. This will silently use the wrong file on Python 3.11 or other versions. Consider deriving the tag dynamically using sysconfig or sys.version_info.
| print(f"Removing temporary directory {temp_dir}") | ||
| shutil.rmtree(temp_dir) | ||
|
|
||
| # TODO: not used currently. |
There was a problem hiding this comment.
The method is annotated with # TODO: not used currently but is still called on line 444 inside the dead else branch (see Comment 3). This is confusing — the TODO comment and the actual usage are inconsistent. The comment should be removed or updated to reflect the actual intent.
| # TODO: not used currently. |
|
|
||
|
|
||
| ext_modules = [] | ||
| class precompiled_wheel_utils: |
There was a problem hiding this comment.
Class names should use CapWords (PascalCase) per PEP 8. This class should be named PrecompiledWheelUtils.
| class precompiled_wheel_utils: | |
| class PrecompiledWheelUtils: |
| ]).decode("utf-8").strip()) | ||
| return base_commit | ||
| except ValueError as err: | ||
| raise ValueError(err) from None |
There was a problem hiding this comment.
Re-raising a ValueError while suppressing the original exception chain (from None) discards the original traceback, making debugging harder. Consider using raise to re-raise or raise ValueError(...) from err to preserve the cause.
| raise ValueError(err) from None | |
| raise ValueError(err) from err |
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS ABOVE HAVE BEEN CONSIDERED.
Purpose
reopen #67
usage:
VLLM_PRECOMPILED_WHEEL_LOCATIONTest Plan
Test Result
(Optional) Documentation Update
BEFORE SUBMITTING, PLEASE READ https://docs.vllm.ai/en/latest/contributing (anything written below this line will be removed by GitHub Actions)