Pull requests: HabanaAI/vllm-hpu-extension

enable calibration for qwen3-vl and glm-4.6
#388 opened Oct 31, 2025 by yangulei
enable dynamic quant of the weights on CPU
#386 opened Oct 29, 2025 by yangulei
add padding ratio limit to the context bucketing
#385 opened Oct 28, 2025 by yangulei
Enable chunked prefill on aice 1.22
#381 opened Oct 23, 2025 by YuJiankang
[WA] bypass the GLM OOM issue
#380 opened Oct 15, 2025 by czhu15
Enable chunked prefill
#362 opened Sep 14, 2025 by jzhoulon
[HS-6944] Fix for deepseek distill models
#359 opened Sep 10, 2025 by nazneenn
[aice/v.1.22] refactor chunk size code
#354 opened Sep 1, 2025 by ranzhejiang
Fix for Llama4 models (targets main)
#341 opened Aug 19, 2025 by vidyasiv
Add flag pin_memory to call from hpu.py in vllm
#325 opened Aug 5, 2025 by xuechendi
Add Calibration Script for SGLang FP8
#318 opened Jul 29, 2025 by SKRohit
Add block_softmax_adjustment and block_softmax kernels
#289 opened Jul 16, 2025 by czhu15
Add pre-commit static checks
#247 opened Jun 30, 2025 by kzawora-intel
Exponential bucketing tweaks
#224 opened Jun 13, 2025 by madamczyk-intel
Add useful internal vllm test
#200 opened May 27, 2025 by nirda7 (Draft)