Closed
Description
I tried to use guidellm to benchmark a vllm server through a litellm proxy, but the run failed.
1 set up the vllm service
vllm serve /root/data/Qwen/Qwen2-0.5B-Instruct --port 8001 --served-model-name Qwen2-0.5B-Instruct
2 set up the litellm proxy
litellm_config.yaml
model_list:
  - model_name: "vllm-model-group-1"
    litellm_params:
      model: "hosted_vllm/Qwen2-0.5B-Instruct"
      api_base: "http://vllmserverIP:8001/v1"
      request_timeout: 120
litellm --config config.yaml --port 4000
3 use curl to access the vllm service and the litellm proxy directly; both work fine.
4 run guidellm; it fails
guidellm benchmark --target "http://localhost:4000" --rate-type sweep --max-seconds 30 --data "prompt_tokens=5,output_tokens=2"
guidellm benchmark --target "http://localhost:4000" --rate-type sweep --data "prompt_tokens=5,output_tokens=2"
Creating backend...
25-09-09 01:09:14|ERROR |guidellm.backend.openai:text_completions:306 - OpenAIHTTPBackend request with headers: {'Content-Type': 'application/json'} and params: {} and payload: {'prompt': 'Test connection', 'model': 'vllm-model-group-1', 'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 1, 'max_completion_tokens': 1, 'stop': None, 'ignore_eos': True} failed: Server error '500 Internal Server Error' for url 'http://localhost:4000/v1/completions'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
Traceback (most recent call last):
File "/usr/local/bin/guidellm", line 7, in <module>
sys.exit(cli())
^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1161, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1082, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1697, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1697, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1443, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 788, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/guidellm/__main__.py", line 314, in run
asyncio.run(
File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/guidellm/benchmark/entrypoints.py", line 29, in benchmark_with_scenario
return await benchmark_generative_text(**vars(scenario), **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/guidellm/benchmark/entrypoints.py", line 71, in benchmark_generative_text
await backend.validate()
File "/usr/local/lib/python3.12/dist-packages/guidellm/backend/backend.py", line 143, in validate
async for _ in self.text_completions( # type: ignore[attr-defined]
File "/usr/local/lib/python3.12/dist-packages/guidellm/backend/openai.py", line 314, in text_completions
raise ex
File "/usr/local/lib/python3.12/dist-packages/guidellm/backend/openai.py", line 295, in text_completions
async for resp in self._iterative_completions_request(
File "/usr/local/lib/python3.12/dist-packages/guidellm/backend/openai.py", line 605, in _iterative_completions_request
stream.raise_for_status()
File "/usr/local/lib/python3.12/dist-packages/httpx/_models.py", line 829, in raise_for_status
raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:4000/v1/completions'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
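Since curl works but guidellm's validation request gets a 500, one way to narrow this down is to replay the exact payload from the ERROR line above against the proxy and then drop one optional field at a time. The following is only a sketch of that idea, not guidellm's or litellm's own code: the helper names (`payload_variants`, `post_completion`, `probe`) and the choice of which keys to treat as optional are assumptions made for illustration. The payload and the `/v1/completions` path are copied from the log.

```python
import json
import urllib.error
import urllib.request

# Payload copied from the ERROR log line (the request sent during backend.validate()).
BASE_PAYLOAD = {
    "prompt": "Test connection",
    "model": "vllm-model-group-1",
    "stream": True,
    "stream_options": {"include_usage": True},
    "max_tokens": 1,
    "max_completion_tokens": 1,
    "stop": None,
    "ignore_eos": True,
}

# Assumption: these are the fields a simple curl test probably did not send.
# Whether any of them is related to the 500 is exactly what the probe checks.
OPTIONAL_KEYS = ["stream_options", "max_completion_tokens", "stop", "ignore_eos"]


def payload_variants(base=BASE_PAYLOAD, optional_keys=OPTIONAL_KEYS):
    """Yield (label, payload): the full payload first, then one copy per removed key."""
    yield "full", dict(base)
    for key in optional_keys:
        yield f"without {key}", {k: v for k, v in base.items() if k != key}


def post_completion(base_url, payload, timeout=30):
    """POST to /v1/completions and return (status, first 500 bytes of the body as text)."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read(500).decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses land here; the body usually carries the proxy's error message.
        return err.code, err.read(500).decode("utf-8", "replace")


def probe(base_url="http://localhost:4000"):
    """Send every variant and print the status and the start of the response body."""
    for label, payload in payload_variants():
        status, body = post_completion(base_url, payload)
        print(f"{label}: HTTP {status} {body!r}")
```

Calling `probe()` on the machine running the proxy prints one line per variant. If a single variant returns 200 while the full payload returns 500, the removed field is a likely suspect, and the printed error body (together with the litellm proxy log) should say more. Connection failures are not caught and will raise `urllib.error.URLError`.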