Pull requests: vllm-project/vllm

Pull requests list

[Bugfix] Fix eviction cached blocked logic (labels: performance, v1)
#21357 opened Jul 22, 2025 by simon-mo
[Misc] unify variable for LLM instance v2
#21356 opened Jul 22, 2025 by andyxning
4 tasks
[CI/Build][Doc] Move existing benchmark scripts in CI/document/example to vllm bench CLI (labels: ci/build, documentation, performance, tpu)
#21355 opened Jul 22, 2025 by yeqcharlotte
4 tasks done
[xpu] disable cudagraph for xpu platform
#21354 opened Jul 22, 2025 by chaojun-zhang Draft
4 tasks
[Core] Minor comments and asserts changes in block pool (labels: v1)
#21351 opened Jul 22, 2025 by Jialin
3 of 4 tasks
[AMD][BugFix] Fix omission of wvSplitK kernel due to torch.compile (labels: rocm)
#21350 opened Jul 22, 2025 by rasmith
[Misc] Remove deprecated args in v0.10 (labels: documentation, ready, speculative-decoding)
#21349 opened Jul 22, 2025 by kebe7jun v0.10.0
[CI] Unifying Dockerfiles for ARM and X86 Builds (labels: ci/build, documentation)
#21343 opened Jul 22, 2025 by kebe7jun
3 of 4 tasks
Add anthropic endpoint (labels: documentation, frontend, tool-calling, v1)
#21341 opened Jul 22, 2025 by SriRangaTarun Draft
[TPU][Bugfix] fix moe layer (labels: tpu, v1)
#21340 opened Jul 22, 2025 by yaochengji
Support DeepSeekV3-style block FP8 quantization with CT (labels: deepseek)
#21337 opened Jul 21, 2025 by mgoin
[Refactor] Remove moe_align_block_size_triton (labels: performance)
#21335 opened Jul 21, 2025 by yewentao256
Add think chunk (labels: deepseek, frontend)
#21333 opened Jul 21, 2025 by juliendenize
Support Tensorrt-LLM MoE fp4 for low-latency
#21331 opened Jul 21, 2025 by wenscarl Draft
4 tasks
Adds parallel model weight loading for runai_streamer (labels: ci/build)
#21330 opened Jul 21, 2025 by bbartels
4 tasks
[P/D] Move FakeNixlWrapper to test dir (labels: ready, v1)
#21328 opened Jul 21, 2025 by ruisearch42
3 of 4 tasks
[Misc] Add dummy maverick test to CI (labels: ci/build, multi-modality)
#21324 opened Jul 21, 2025 by minosfuture