Commit 4b7ec5f

LucasWilkinson authored and amd-xiaoyu12 committed
[Perf] Dont create unnecessary pooling params (vllm-project#22876)
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Xiao Yu <[email protected]>
1 parent 8c271d3 commit 4b7ec5f

1 file changed, +3 -3 lines changed

vllm/v1/worker/gpu_model_runner.py

Lines changed: 3 additions & 3 deletions
@@ -341,13 +341,13 @@ def _init_model_kwargs(self, num_tokens: int):
         model_kwargs = dict[str, Any]()
         num_reqs = self.input_batch.num_reqs

-        pooling_params = self.input_batch.pooling_metadata.pooling_params
-
-        num_pooling_reqs = len(pooling_params)
+        num_pooling_reqs = len(self.input_batch.pooling_params)

         if num_pooling_reqs == 0:
             return model_kwargs

+        pooling_params = self.input_batch.pooling_metadata.pooling_params
+
         assert num_pooling_reqs == num_reqs

         token_type_id_requests = dict[int, Any]()
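
The change above moves the `pooling_metadata` lookup below the early return, so batches with no pooling requests never touch it. The sketch below illustrates why that ordering can matter. It is not vLLM code: `FakeInputBatch`, `FakePoolingMetadata`, `metadata_builds`, and `init_model_kwargs` are hypothetical stand-ins, and it assumes that `pooling_metadata` is a property that builds a new object on each access and that `pooling_params` is a dict keyed by request ID.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakePoolingMetadata:
    # Hypothetical stand-in for the real metadata object.
    pooling_params: list[Any]


@dataclass
class FakeInputBatch:
    # Hypothetical stand-in; assumed to map request ID -> pooling params.
    pooling_params: dict[str, Any] = field(default_factory=dict)
    metadata_builds: int = 0  # counts how often metadata is constructed

    @property
    def pooling_metadata(self) -> FakePoolingMetadata:
        # Assumption: the metadata is rebuilt on every access, which is
        # what would make an eager lookup wasteful for non-pooling batches.
        self.metadata_builds += 1
        return FakePoolingMetadata(list(self.pooling_params.values()))


def init_model_kwargs(batch: FakeInputBatch) -> dict[str, Any]:
    model_kwargs: dict[str, Any] = {}

    # Cheap check first: len() on the container builds no metadata.
    num_pooling_reqs = len(batch.pooling_params)
    if num_pooling_reqs == 0:
        return model_kwargs

    # Only batches that actually pool pay for the metadata construction.
    pooling_params = batch.pooling_metadata.pooling_params
    model_kwargs["num_pooling_params"] = len(pooling_params)
    return model_kwargs
```

Under these assumptions, calling `init_model_kwargs` on an empty `FakeInputBatch` leaves `metadata_builds` at zero, while a batch holding one pooling request builds the metadata exactly once.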
