Commit 6b3feac
Parent: 8a83e02

Fix test case

Signed-off-by: Hui Gao <huig@nvidia.com>

File tree

2 files changed: +193 −450 lines


tensorrt_llm/_torch/pyexecutor/_util.py

Lines changed: 1 addition, 1 deletion
@@ -119,7 +119,7 @@ def _cal_max_memory(self, peak_memory, total_gpu_memory, fraction) -> int:
         """
         kv_size_per_token = self._get_kv_size_per_token()

-        available_kv_mem = total_gpu_memory * fraction - peak_memory
+        available_kv_mem = (total_gpu_memory - peak_memory) * fraction
         logger.info(
             f"Peak memory during memory usage profiling (torch + non-torch): {peak_memory / (GB):.2f} GiB, "
             f"available KV cache memory when calculating max tokens: {available_kv_mem / (GB):.2f} GiB, "
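The one-line change alters where the memory fraction is applied: before the fix, the fraction was taken of total GPU memory and the profiled peak was subtracted afterward; after the fix, the peak is subtracted first and the fraction is applied only to what remains. A minimal standalone sketch of both formulas (the helper names and the 80 GiB / 20 GiB / 0.9 numbers here are illustrative stand-ins, not values from the TensorRT-LLM code):

```python
# Hedged sketch of the formula change in this commit; not the actual
# TensorRT-LLM implementation. All concrete numbers are made up.
GB = 1 << 30  # bytes per GiB

def available_kv_mem_old(total_gpu_memory, peak_memory, fraction):
    # Before the fix: fraction of *total* memory, then subtract peak usage.
    return total_gpu_memory * fraction - peak_memory

def available_kv_mem_new(total_gpu_memory, peak_memory, fraction):
    # After the fix: subtract peak usage first, then take the fraction of
    # what is actually left for the KV cache.
    return (total_gpu_memory - peak_memory) * fraction

total = 80 * GB    # hypothetical 80 GiB GPU
peak = 20 * GB     # hypothetical peak memory from profiling
fraction = 0.9     # hypothetical KV-cache memory fraction

old = available_kv_mem_old(total, peak, fraction)  # 0.9*80 - 20 = 52 GiB
new = available_kv_mem_new(total, peak, fraction)  # 0.9*(80 - 20) = 54 GiB
print(f"old: {old / GB:.2f} GiB, new: {new / GB:.2f} GiB")
```

With these numbers the new formula yields slightly more KV-cache memory, but when the profiled peak exceeds `fraction * total` the old formula could even go negative while the new one scales down gracefully, which is consistent with the commit's intent of budgeting the fraction against genuinely free memory.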
