What happened: the GPU memory usage reported by the NVIDIA driver (`nvidia-smi`) and by HAMi differ.
What you expected to happen: HAMi and the driver should report the same memory usage.
How to reproduce it (as minimally and precisely as possible):
Deploy GPT-2 with vLLM; each container consumes 15k MiB of GPU DRAM.
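For reference, a minimal sketch of the kind of pod spec used to reproduce this (the `nvidia.com/gpumem` resource key follows HAMi's documented convention for a per-container memory limit in MiB; the image tag and server arguments are illustrative, not the exact ones used):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vllm-gpt2
spec:
  containers:
  - name: vllm
    image: vllm/vllm-openai:latest          # illustrative image tag
    command: ["python3", "-m", "vllm.entrypoints.openai.api_server"]
    args: ["--model", "gpt2"]
    resources:
      limits:
        nvidia.com/gpu: 1                   # one vGPU slice via HAMi
        nvidia.com/gpumem: 15000            # assumed HAMi memory limit key, in MiB
```

After the pod is running, compare `nvidia-smi -a` inside the container against the same command on the host to observe the discrepancy.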
Anything else we need to know?:
- The output of `nvidia-smi -a` in the vLLM container:
- The output of `nvidia-smi -a` on the host:
Environment:
- HAMi version: 2.6.0
- NVIDIA driver or other AI device driver version:
- Docker version from `docker version`:
- Docker command, image, and tag used:
- Kernel version from `uname -a`:
- Others: