Pull requests: vllm-project/vllm-ascend
replace npu_incre_flash_attention with npu_fused_infer_attention_score
#2792
opened Sep 6, 2025 by
panchao-hub
refactor fused_moe.py
module:core
module:ops
module:quantization
module:tests
#2791
opened Sep 6, 2025 by
Pr0Wh1teGivee
Deepseek Mtp model uses the lm_head and embedding from the main model
module:tests
#2790
opened Sep 5, 2025 by
zzhx1
[2/N][Refactor][Quantization] clean quantization patch
module:ops
module:quantization
module:tests
#2785
opened Sep 5, 2025 by
22dimensions
[Perf][V1] Fully overlap model execution
merge-conflicts
#2783
opened Sep 5, 2025 by
jiangpeng36
[Benchmark] Correctly kill vllm process in performance benchmark
performance-test
enable performance test for PR
ready-for-test
start test by label for PR
#2782
opened Sep 5, 2025 by
Potabk
Remove chunked_prefill_for_mla and fix ring_mla bug
documentation
Improvements or additions to documentation
module:core
#2781
opened Sep 5, 2025 by
SunnyLee151064
support qwen25 vl w8a8 quantization
module:quantization
module:tests
#2778
opened Sep 5, 2025 by
wenba0
support qwen25 vl w8a8 quantization
module:quantization
module:tests
#2777
opened Sep 5, 2025 by
wenba0
Refactor the Spec decode module to merge MTP non-torchair and eagle modes into one file, separating torchair and non-torchair modes #2773
#2776
opened Sep 5, 2025 by
weisirui-eng
[0.9.1][Doc] Update supported models
documentation
Improvements or additions to documentation
#2774
opened Sep 5, 2025 by
zhangxinyuehfad
[main] addrmsnorm + quant fusion optim in Qwen Models
module:core
module:ops
#2772
opened Sep 5, 2025 by
rjg-lyh
Install vllm from source to make doctest passed
documentation
Improvements or additions to documentation
module:tests
[Fix] Ensure metadata sync across DP ranks in eager mode
#2766
opened Sep 5, 2025 by
yiz-liu
Add note for Ascend HDK version
documentation
Improvements or additions to documentation
#2765
opened Sep 5, 2025 by
Yikun
[main] mlp weight prefetch in Qwen Dense Models
module:core
module:ops
module:tests
#2762
opened Sep 4, 2025 by
rjg-lyh
[fix] prefill unsupport sliding window attention
module:tests
#2758
opened Sep 4, 2025 by
NSDie
[main] add pd transfer for ascend scheduler
module:ops
module:tests
#2753
opened Sep 4, 2025 by
Liccol
[Feat] communication optimization for mc2 ops on A2
module:ops
module:tests
#2752
opened Sep 4, 2025 by
realliujiaxu
[DOC] Qwen3 PD disaggregation user guide
documentation
Improvements or additions to documentation
ready
ready for review
#2751
opened Sep 4, 2025 by
paulyu12