Pull requests: sgl-project/sglang
[Bugfix][AWQ] skip quantization for FusedMoE layers in modules_to_not_convert
#20128
opened Mar 8, 2026 by
Livinfly
5 tasks
[Qwen] Handle tie_word_embeddings for Qwen MoE and Qwen3Next
#20127
opened Mar 8, 2026 by
xingsy97
5 tasks
[Feature] Add UVM-based MoE expert offloading with all-GPU compute
documentation
Improvements or additions to documentation
#20126
opened Mar 8, 2026 by
lichang98
5 tasks
feat(mem_cache): Add CLOCK second-chance eviction policy for radix KV cache
#20125
opened Mar 8, 2026 by
BitStrider
2 of 5 tasks
[Bug] Fix missing TTFT histogram for single-batch requests
high priority
run-ci
#20122
opened Mar 8, 2026 by
hnyls2002
[Diffusion] Add fp8 dtype support for encoder loaders
diffusion
SGLang Diffusion
#20121
opened Mar 8, 2026 by
xingsy97
5 tasks
[PCG]add piecewise cuda graph support for marlin linear
#20119
opened Mar 8, 2026 by
xieminghe1
5 tasks
[diffusion] model: fuse RMSNorm + interleaved RoPE for MOVA SelfAtten…
diffusion
SGLang Diffusion
npu
#20117
opened Mar 8, 2026 by
liubiyongge
4 of 5 tasks
[AMD] Add Claude skills for AMD CI workflows
documentation
Improvements or additions to documentation
#20116
opened Mar 8, 2026 by
michaelzhang-ai
6 tasks done
[Perf] Enable nextn=2 in deep_gemm.fp8_paged_mqa_logits for target_verify
#20115
opened Mar 8, 2026 by
hammersam
5 tasks
fix: support HybridLinearAttnBackend in TboAttnBackend
#20114
opened Mar 8, 2026 by
lawrence-harmonic
feat(grpc): add SubscribeKvEvents RPC for KV cache event streaming
#20112
opened Mar 8, 2026 by
slin1237
3 of 4 tasks
[AMD] feat: true on policy for triton backend
#20111
opened Mar 8, 2026 by
XinyuJiangCMU
6 tasks
[Qwen3.5] Fix MoE double allreduce with --enable-flashinfer-allreduce-fusion
run-ci
#20110
opened Mar 8, 2026 by
mmangkad
[AMD] feat: true on policy for triton backend
#20108
opened Mar 8, 2026 by
XinyuJiangCMU
6 tasks
[HiCache][HA 2/N] feat: Support HiCache warmup for cold start acceleration
documentation
Improvements or additions to documentation
hicache
Hierarchical Caching for SGLang
#20105
opened Mar 7, 2026 by
alphabetc1
5 tasks
[Model] Add Eagle3 speculative decoding support for Qwen3.5
#20104
opened Mar 7, 2026 by
NikitosKh
[JIT] Inject target architecture flag into JIT compilation
#20103
opened Mar 7, 2026 by
xingsy97
2 of 5 tasks
[AMD] Fix aiter backend missing ENCODER_ONLY attention support
#20102
opened Mar 7, 2026 by
nathanrchn
2 tasks done
[HiCache][HA 3/N] support force attach/detach historage
documentation
Improvements or additions to documentation
hicache
Hierarchical Caching for SGLang
#20101
opened Mar 7, 2026 by
alphabetc1
5 tasks
[Qwen-VL] Respect HF processor's default max_pixels for Qwen-VL
#20099
opened Mar 7, 2026 by
xingsy97
3 of 5 tasks
Fix spec NaN/OOB detection: skip during CUDA graph capture, sync check otherwise
#20092
opened Mar 7, 2026 by
alisonshao
1 of 2 tasks