docs/source/user_guide/configuration/additional_config.md: 2 additions & 0 deletions
@@ -58,6 +58,8 @@ The details of each config option are as follows:
| Name | Type | Default | Description |
| ---- | ---- | ------- | ----------- |
|`enabled`| bool |`False`| Whether to enable ascend scheduler for V1 engine|
+|`enable_pd_transfer`| bool |`False`| Whether to enable pd transfer. When enabled, decode starts only after prefill has finished for all requests. This option only takes effect in offline inference. |
+|`decode_max_num_seqs`| int |`0`| The `max_num_seqs` value to use in the decode phase when pd transfer is enabled. This option only takes effect when `enable_pd_transfer` is `True`. |
`ascend_scheduler_config` also supports the options from [vllm scheduler config](https://docs.vllm.ai/en/stable/api/vllm/config.html#vllm.config.SchedulerConfig). For example, you can add `enable_chunked_prefill: True` to `ascend_scheduler_config` as well.
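
To make the two new options concrete, here is a minimal sketch of how an `ascend_scheduler_config` dictionary using them might be assembled. The helper `build_ascend_scheduler_config` is hypothetical and not part of vllm-ascend. The value `8` for `decode_max_num_seqs`, the choice to set `enabled` to `True`, and the guard that rejects a non-zero `decode_max_num_seqs` without `enable_pd_transfer` are all assumptions made for illustration. The table only says the option has no effect in that case, not that it is an error.

```python
import json


def build_ascend_scheduler_config(enable_pd_transfer=False,
                                  decode_max_num_seqs=0,
                                  **scheduler_options):
    """Hypothetical helper: assemble an additional_config dictionary.

    Extra keyword arguments stand in for options taken from the vllm
    scheduler config, such as enable_chunked_prefill.
    """
    # Illustrative guard (an assumption): per the table, decode_max_num_seqs
    # only takes effect when enable_pd_transfer is True, so flag the mismatch
    # early instead of letting the value be silently ignored.
    if decode_max_num_seqs and not enable_pd_transfer:
        raise ValueError(
            "decode_max_num_seqs only takes effect when "
            "enable_pd_transfer is True")

    config = {
        "enabled": True,  # assumed: the ascend scheduler itself is turned on
        "enable_pd_transfer": enable_pd_transfer,
        "decode_max_num_seqs": decode_max_num_seqs,
    }
    config.update(scheduler_options)
    return {"ascend_scheduler_config": config}


additional_config = build_ascend_scheduler_config(
    enable_pd_transfer=True,
    decode_max_num_seqs=8,        # arbitrary example value
    enable_chunked_prefill=True,  # option named in the paragraph above
)
print(json.dumps(additional_config, indent=2))
```

The resulting dictionary would then be passed as the engine's additional config for an offline inference run, since `enable_pd_transfer` only takes effect there according to the table.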