Open
Labels: bug (Something isn't working)
Description
Your current environment
vLLM 0.15.0 and 0.15.1 both fail on an H200 instance with a DeepGEMM assertion error.
Hi Team,
We're using v0.15.0, and it fails to start on an H200 instance with a DeepGEMM assertion error: RuntimeError: Assertion error (csrc/apis/../jit_kernels/impls/../../jit/kernel_runtime.hpp:45): exit_code == 0
DeepGEMM's JIT kernel is crashing when vLLM tries to run FP8 GEMM operations (fp8_gemm_nt) during startup profiling.
Could you help take a look?
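One way to check whether the DeepGEMM path is really the trigger is to launch the server with DeepGEMM switched off and see if startup profiling gets past the FP8 GEMM step. The sketch below only builds the environment for such a launch. It assumes that this vLLM release reads the `VLLM_USE_DEEP_GEMM` environment variable and that `0` disables the DeepGEMM kernels; the helper name `env_without_deep_gemm` is hypothetical, and this is a diagnostic step, not a confirmed fix.

```python
import os


def env_without_deep_gemm(base=None):
    """Return a copy of the environment with DeepGEMM disabled for vLLM.

    Assumption: the installed vLLM release honors VLLM_USE_DEEP_GEMM and
    treats "0" as "do not use DeepGEMM kernels".
    """
    env = dict(os.environ if base is None else base)
    env["VLLM_USE_DEEP_GEMM"] = "0"
    return env


if __name__ == "__main__":
    env = env_without_deep_gemm()
    # Pass `env` to subprocess.Popen(..., env=env) when starting the server.
    print("VLLM_USE_DEEP_GEMM =", env["VLLM_USE_DEEP_GEMM"])
```

If the server starts with this setting, that would point at the DeepGEMM JIT step rather than at the rest of the FP8 path.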
🐛 Describe the bug
Log output from 0.15.1:
4:57:02 PM [algo-1-1772841836] [vllm.server] Traceback (most recent call last):
4:57:02 PM [algo-1-1772841836] [vllm.server] RuntimeError: Worker failed with error 'Assertion error (csrc/apis/../jit_kernels/impls/../../jit/kernel_runtime.hpp:45): exit_code == 0', please check the stack trace above for the root cause
4:57:03 PM [algo-1-1772841836] [vllm.server] Traceback (most recent call last):
4:57:03 PM [algo-1-1772841836] [vllm.server] RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
4:57:10 PM [algo-1-1772841836] [VLLM_STATUS] ✗ Configuration failed: VLLM: DP=1, TP=8, KV=0.7, eager
4:57:10 PM [algo-1-1772841836] [VLLM_STATUS] ✗ All vLLM configurations failed
4:57:10 PM [algo-1-1772841836] /opt/wrapper/libfarm/lib/python3.11/site-packages/watchtower/__init__.py:464: WatchtowerWarning: Received message after logging system shutdown
  warnings.warn("Received message after logging system shutdown", WatchtowerWarning)
4:57:10 PM [algo-1-1772841836] [VLLM_STATUS] Stopping server server (PID: 68574)
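For triage, the log above can be reduced to the error messages and the failing configuration. The following is a minimal sketch that assumes every relevant line has the shape `<time> [<host>] [<source>] <message>` seen in this excerpt; the regular expression and the `summarize` helper are illustrative and not part of vLLM.

```python
import re

# Three lines copied from the log excerpt above.
LOG = """\
4:57:02 PM [algo-1-1772841836] [vllm.server] RuntimeError: Worker failed with error 'Assertion error (csrc/apis/../jit_kernels/impls/../../jit/kernel_runtime.hpp:45): exit_code == 0', please check the stack trace above for the root cause
4:57:03 PM [algo-1-1772841836] [vllm.server] RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
4:57:10 PM [algo-1-1772841836] [VLLM_STATUS] ✗ Configuration failed: VLLM: DP=1, TP=8, KV=0.7, eager
"""

# Assumed line layout: time, bracketed host, bracketed source, free-form message.
LINE_RE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2} [AP]M) "
    r"\[(?P<host>[^\]]+)\] \[(?P<source>[^\]]+)\] (?P<msg>.*)$"
)


def summarize(log_text):
    """Collect RuntimeError messages and failed configurations in log order."""
    errors, failed_configs = [], []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if msg.startswith("RuntimeError:"):
            errors.append(msg[len("RuntimeError:"):].strip())
        elif "Configuration failed:" in msg:
            failed_configs.append(msg.split("Configuration failed:", 1)[1].strip())
    return errors, failed_configs


if __name__ == "__main__":
    errors, failed_configs = summarize(LOG)
    for err in errors:
        print("error:", err)
    for cfg in failed_configs:
        print("failed config:", cfg)
```

The first collected error is the worker failure that carries the DeepGEMM assertion; the later "Engine core initialization failed" message is a consequence of it.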