
Commit 7ff8b94

xinli-sw authored and nvpohanh committed
Add missing args for flashinfer 0.2.10
Signed-off-by: Xin Li. <[email protected]>
Signed-off-by: XIn Li <[email protected]>
Signed-off-by: Po-Han Huang <[email protected]>
1 parent ba6c28c commit 7ff8b94

1 file changed, +5 -0 lines changed


vllm/model_executor/layers/quantization/modelopt.py

Lines changed: 5 additions & 0 deletions
@@ -1286,9 +1286,14 @@ def apply(
                 gemm1_weights=layer.gemm1_weights_fp4_shuffled.data,
                 gemm1_weights_scale=layer.gemm1_scales_fp4_shuffled.data.view(
                     torch.float8_e4m3fn),
+                gemm1_bias=None,
+                gemm1_alpha=None,
+                gemm1_beta=None,
+                gemm1_clamp_limit=None,
                 gemm2_weights=layer.gemm2_weights_fp4_shuffled.data,
                 gemm2_weights_scale=layer.gemm2_scales_fp4_shuffled.data.view(
                     torch.float8_e4m3fn),
+                gemm2_bias=None,
                 output1_scale_scalar=layer.g1_scale_c.data,
                 output1_scale_gate_scalar=layer.g1_alphas.data,
                 output2_scale_scalar=layer.g2_alphas.data,
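
The change above passes five extra keyword arguments explicitly as None, apparently because the kernel entry point in flashinfer 0.2.10 expects them. As a rough illustration of one way a caller could tolerate such signature changes across versions, the sketch below filters a keyword dictionary against the callee's signature before calling it. The helper `filter_supported_kwargs` and the two stand-in functions `moe_kernel_old` and `moe_kernel_new` are hypothetical names invented for this example; they are not part of vLLM or flashinfer, and this is not what the commit does.

```python
import inspect


def filter_supported_kwargs(func, kwargs):
    """Return only the entries of `kwargs` that `func` accepts by name."""
    params = inspect.signature(func).parameters
    # A callee that declares **kwargs accepts every keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


# Hypothetical stand-ins for an older and a newer kernel signature.
def moe_kernel_old(gemm1_weights, gemm2_weights):
    return ("old", gemm1_weights, gemm2_weights)


def moe_kernel_new(gemm1_weights, gemm1_bias, gemm2_weights, gemm2_bias):
    return ("new", gemm1_weights, gemm1_bias, gemm2_weights, gemm2_bias)


# One keyword dictionary written against the newer signature.
call_kwargs = dict(
    gemm1_weights="w1",
    gemm1_bias=None,
    gemm2_weights="w2",
    gemm2_bias=None,
)

# The filter drops the bias arguments for the older signature
# and keeps them for the newer one.
old_result = moe_kernel_old(**filter_supported_kwargs(moe_kernel_old, call_kwargs))
new_result = moe_kernel_new(**filter_supported_kwargs(moe_kernel_new, call_kwargs))
```

Note that `inspect.signature` can raise `ValueError` for some compiled callables that expose no signature metadata, so a real caller would need a fallback for that case; pinning the dependency version and passing the arguments explicitly, as the commit does, avoids the issue entirely.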
