Labels
internal: filed by core contributor or associate
Description
This is related to the ongoing work on multi-turn chat. The suggestion is to support a dataset made up of multiple system prompts, each paired with a set of questions that share that system prompt. This has already been implemented in sglang's bench_serving.py, which is a modified version of the vLLM script (https://github.com/sgl-project/sglang/blob/main/python/sglang/bench_serving.py). The configuration knobs are listed below.
group.add_argument(
    "--gsp-num-groups",
    type=int,
    default=64,
    help="Number of system prompt groups for generated-shared-prefix dataset",
)
group.add_argument(
    "--gsp-prompts-per-group",
    type=int,
    default=16,
    help="Number of prompts per system prompt group for generated-shared-prefix dataset",
)
group.add_argument(
    "--gsp-system-prompt-len",
    type=int,
    default=2048,
    help="Target length in tokens for system prompts in generated-shared-prefix dataset",
)
group.add_argument(
    "--gsp-question-len",
    type=int,
    default=128,
    help="Target length in tokens for questions in generated-shared-prefix dataset",
)
group.add_argument(
    "--gsp-output-len",
    type=int,
    default=256,
    help="Target length in tokens for outputs in generated-shared-prefix dataset",
)
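
To make the knobs concrete, here is a minimal sketch of how a generated-shared-prefix dataset could be assembled from them. It is not the sglang implementation: `GspConfig`, `random_tokens`, and `build_shared_prefix_dataset` are hypothetical names, random token ids stand in for real tokenizer output, and the final shuffle is an assumed design choice rather than something stated in this issue.

```python
import random
from dataclasses import dataclass


@dataclass
class GspConfig:
    # Field names mirror the --gsp-* flags above; the class itself is hypothetical.
    num_groups: int = 64
    prompts_per_group: int = 16
    system_prompt_len: int = 2048
    question_len: int = 128
    output_len: int = 256


def random_tokens(rng, length, vocab_size=32000):
    # Stand-in for tokenizer output: a list of random token ids (vocab size is assumed).
    return [rng.randrange(vocab_size) for _ in range(length)]


def build_shared_prefix_dataset(cfg, seed=0):
    """Return (prompt_token_ids, output_len) pairs; prompts in a group share a prefix."""
    rng = random.Random(seed)
    requests = []
    for _ in range(cfg.num_groups):
        # One system prompt per group, reused by every question in that group.
        system_prompt = random_tokens(rng, cfg.system_prompt_len)
        for _ in range(cfg.prompts_per_group):
            question = random_tokens(rng, cfg.question_len)
            requests.append((system_prompt + question, cfg.output_len))
    # Assumed choice: shuffle so requests from one group are not sent back to back.
    rng.shuffle(requests)
    return requests


cfg = GspConfig()
requests = build_shared_prefix_dataset(cfg)
print(len(requests))        # 64 * 16 = 1024 requests
print(len(requests[0][0]))  # 2048 + 128 = 2176 input tokens per request
```

With the defaults, that is 1024 requests of 2176 input tokens each, of which the first 2048 tokens are shared by the 16 requests in the same group. That shared prefix is the part a prefix cache would be expected to reuse.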