Add `flashinfer_python` to CUDA wheel requirements #21389

mgoin · 2025-07-22T16:27:25Z

Purpose

We have installed FlashInfer by default in the Docker image for a long time. Now that we use FlashInfer for many critical kernels on NVIDIA Blackwell, we should consider adding it to the default CUDA dependencies.

The flashinfer-python wheel does not include pre-compiled kernels by default, so users will JIT-compile kernels at runtime.
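
As a rough illustration of what this change means for a user's environment, the sketch below checks whether the `flashinfer-python` distribution is installed and reports its version. It relies only on the standard-library `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of vLLM or FlashInfer.

```python
from importlib import metadata


def installed_version(dist_name: str = "flashinfer-python"):
    """Return the installed version of `dist_name`, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only;
    the distribution name is the one published on PyPI.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("flashinfer-python is not installed")
    else:
        print(f"flashinfer-python {version} is installed")
```

This only reports the installed wheel version. It does not tell you whether kernels were built ahead of time or will be JIT-compiled.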

Test Plan

See if there are any conflicts in CI with the Dockerfile's manual AOT build.

Test Result

Signed-off-by: mgoin <[email protected]>

github-actions · 2025-07-22T16:27:33Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

gemini-code-assist

Code Review

This pull request adds flashinfer_python as a dependency for CUDA environments. While this is useful for local development, it introduces a conflict with the existing Docker build process, which builds FlashInfer from source. This leads to a redundant installation and a version mismatch, making the Docker build fragile and potentially incorrect. I've recommended a solution to conditionally exclude this dependency during the Docker build to resolve the conflict.

requirements/cuda.txt

Swipe4057 · 2025-07-25T00:42:19Z

up https://pypi.org/project/flashinfer-python/0.2.9rc1/


docker/Dockerfile


houseroad

Looks good.


Add flashinfer_python by default for CUDA requirements

75909eb

Signed-off-by: mgoin <[email protected]>

mergify bot added the ci/build label Jul 22, 2025

gemini-code-assist bot reviewed Jul 22, 2025


requirements/cuda.txt (outdated, resolved)

mgoin changed the title ~~Add flashinfer_python by default for CUDA requirements~~ Add flashinfer_python==0.2.8 by default for CUDA requirements Jul 22, 2025

mgoin added the ready label (ONLY add when PR is ready to merge/full CI is needed) Jul 22, 2025

mgoin changed the title ~~Add flashinfer_python==0.2.8 by default for CUDA requirements~~ Add flashinfer_python to CUDA wheel requirements Jul 25, 2025

mgoin added 2 commits July 28, 2025 15:42

Merge branch 'main' into add-flashinfer-python-by-default

43afa3b

Update to 0.2.9rc2

d7915f2

Signed-off-by: mgoin <[email protected]>

tlrmchlsmth reviewed Jul 29, 2025


docker/Dockerfile (outdated, resolved)

mgoin added 2 commits July 29, 2025 12:01

Use --force-reinstall --no-deps

11a2cea

Signed-off-by: mgoin <[email protected]>

Merge branch 'main' into add-flashinfer-python-by-default

a03094a

tlrmchlsmth approved these changes Jul 29, 2025


houseroad approved these changes Jul 29, 2025


houseroad merged commit a33ea28 into vllm-project:main Jul 29, 2025
99 checks passed

liuyumoye pushed a commit to liuyumoye/vllm that referenced this pull request Jul 31, 2025

Add flashinfer_python to CUDA wheel requirements (vllm-project#21389)

3295783

Signed-off-by: mgoin <[email protected]>
