Pull requests: vllm-project/vllm
[Bugfix] Fix eviction cached blocked logic
performance
Performance-related issues
v1
#21357
opened Jul 22, 2025 by
simon-mo
[CI/Build][Doc] Move existing benchmark scripts in CI/document/example to vllm bench CLI
ci/build
documentation
Improvements or additions to documentation
performance
Performance-related issues
tpu
Related to Google TPUs
#21355
opened Jul 22, 2025 by
yeqcharlotte
4 tasks done
[xpu] disable cudagraph for xpu platform
#21354
opened Jul 22, 2025 by
chaojun-zhang
•
Draft
4 tasks
Decode Tokenized IDs to Strings for hf_processor in llm.chat() with model_impl=transformers
#21353
opened Jul 22, 2025 by
ariG23498
[Core][Feat] Add max-waiting-queue-length parameter to reject requests when waiting queue is full
frontend
v1
#21352
opened Jul 22, 2025 by
chaunceyjiang
3 of 4 tasks
[Core] Minor comments and asserts changes in block pool
v1
#21351
opened Jul 22, 2025 by
Jialin
3 of 4 tasks
[AMD][BugFix] Fix omission of wvSplitK kernel due to torch.compile
rocm
Related to AMD ROCm
#21350
opened Jul 22, 2025 by
rasmith
[Misc] Remove deprecated args in v0.10
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
[Core] Guided decoding v0 deprecation
ci/build
frontend
structured-output
v1
#21347
opened Jul 22, 2025 by
rzabarazesh
•
Draft
1 of 4 tasks
[CI] Unifying Dockerfiles for ARM and X86 Builds
ci/build
documentation
Improvements or additions to documentation
#21343
opened Jul 22, 2025 by
kebe7jun
3 of 4 tasks
Add anthropic endpoint
documentation
Improvements or additions to documentation
frontend
tool-calling
v1
#21341
opened Jul 22, 2025 by
SriRangaTarun
•
Draft
[TPU][Bugfix] fix moe layer
tpu
Related to Google TPUs
v1
#21340
opened Jul 22, 2025 by
yaochengji
Support DeepSeekV3-style block FP8 quantization with CT
deepseek
Related to DeepSeek models
#21337
opened Jul 21, 2025 by
mgoin
[Refactor] Remove moe_align_block_size_triton
performance
Performance-related issues
#21335
opened Jul 21, 2025 by
yewentao256
Add think chunk
deepseek
Related to DeepSeek models
frontend
#21333
opened Jul 21, 2025 by
juliendenize
Adds parallel model weight loading for runai_streamer
ci/build
#21330
opened Jul 21, 2025 by
bbartels
4 tasks
[Core] Convert EngineCoreRequest to Request before reaching the engine core …
v1
#21329
opened Jul 21, 2025 by
Jialin
3 of 4 tasks
[P/D] Move FakeNixlWrapper to test dir
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#21328
opened Jul 21, 2025 by
ruisearch42
3 of 4 tasks
Fix Flashinfer Allreduce+Norm enable/disable calculation based on fi_allreduce_fusion_max_token_num
#21325
opened Jul 21, 2025 by
xinli-git
3 tasks done
[Misc] Add dummy maverick test to CI
ci/build
multi-modality
Related to multi-modality (#4194)
#21324
opened Jul 21, 2025 by
minosfuture