Skip to content

Commit c27b6e9

Browse files
committed
[Bugfix][PD] set max_completion_tokens=1 if req has this value
Signed-off-by: Abirdcfly <[email protected]>
1 parent 30ef30e commit c27b6e9

File tree

2 files changed

+4
-0
lines changed

2 files changed

+4
-0
lines changed

examples/online_serving/disaggregated_serving/disagg_proxy_demo.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -293,6 +293,8 @@ async def create_chat_completion(self, raw_request: Request):
293293
# add params to request
294294
kv_prepare_request = request.copy()
295295
kv_prepare_request["max_tokens"] = 1
296+
if "max_completion_tokens" in kv_prepare_request:
297+
kv_prepare_request["max_completion_tokens"] = 1
296298

297299
# prefill stage
298300
prefill_instance = self.schedule(self.prefill_cycler)

examples/online_serving/disaggregated_serving_p2p_nccl_xpyd/disagg_proxy_p2p_nccl_xpyd.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -128,6 +128,8 @@ async def handle_request():
128128
prefill_request = original_request_data.copy()
129129
# change max_tokens = 1 to let it only do prefill
130130
prefill_request["max_tokens"] = 1
131+
if "max_completion_tokens" in prefill_request:
132+
prefill_request["max_completion_tokens"] = 1
131133

132134
global count
133135
global prefill_instances

0 commit comments

Comments
 (0)