Know your VRAM before you run.
A dead-simple API to estimate GPU memory requirements for any HuggingFace model.
You found a cool model on HuggingFace. Now what?
- "Will it fit on my 24GB GPU?"
- "What quantization do I need?"
- "How much VRAM for inference?"
The answers are buried — scattered across model cards, config files, or simply missing. You either dig through safetensors metadata yourself, or download the model and pray.
One API call. Instant answer.
curl "https://vramio.ksingh.in/model?hf_id=meta-llama/Llama-2-7b"{
"model": "meta-llama/Llama-2-7b",
"total_parameters": "6.74B",
"memory_required": "12.55 GB",
"current_dtype": "F16",
"recommended_vram": "15.06 GB",
"other_precisions": {
"fp32": "25.10 GB",
"fp16": "12.55 GB",
"int8": "6.27 GB",
"int4": "3.14 GB"
},
"overhead_note": "Includes 20% for activations/KV cache (2K context)"
}recommended_vram = what you actually need (includes 20% overhead for inference).
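Quick sanity check on those numbers: 6.74B parameters at 2 bytes per F16 weight works out to 12.55 GB in binary GiB, and 12.55 × 1.2 ≈ 15.06 GB, the recommended figure above. Under the hood, each request: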
- Fetches safetensors metadata from HuggingFace (just headers, not weights)
- Parses tensor shapes and dtypes
- Calculates memory for each precision
- Adds 20% overhead for activations + KV cache
No model downloads. No GPU required. Just math.
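Here is a stripped-down sketch of that idea, not the project's actual code: it assumes a single, unsharded model.safetensors file, and the precision table and GiB conversion are simplifications.

```python
import json
import struct

import httpx

# Bytes per parameter at each precision reported by the API.
PRECISION_BYTES = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def safetensors_param_count(url: str) -> int:
    """Count parameters by reading only the safetensors header, never the weights."""
    with httpx.Client(follow_redirects=True) as client:
        # Bytes 0-7 of a safetensors file: little-endian u64 length of the JSON header.
        raw = client.get(url, headers={"Range": "bytes=0-7"}).content
        (header_len,) = struct.unpack("<Q", raw)
        # The JSON header maps tensor names to {"dtype", "shape", "data_offsets"}.
        header = json.loads(
            client.get(url, headers={"Range": f"bytes=8-{7 + header_len}"}).content
        )
    total = 0
    for name, tensor in header.items():
        if name == "__metadata__":  # optional metadata entry, not a tensor
            continue
        count = 1
        for dim in tensor["shape"]:
            count *= dim
        total += count
    return total

# Example repo with a single model.safetensors; sharded checkpoints need one read per shard.
params = safetensors_param_count("https://huggingface.co/gpt2/resolve/main/model.safetensors")
for precision, nbytes in PRECISION_BYTES.items():
    weights_gib = params * nbytes / 1024**3            # weights alone, in GiB
    print(f"{precision}: {weights_gib:.2f} GB, ~{weights_gib * 1.2:.2f} GB with 20% overhead")
```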
Read more about the implementation in this blog post.
```bash
# Clone and run
git clone https://github.com/ksingh-scogo/vramio.git
cd vramio
pip install httpx[http2]
python server_embedded.py
```

Or deploy free on Render using the included `render.yaml`.
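Once the server is up, query it the same way as the hosted instance (the port below is a guess; check server_embedded.py for the one it actually binds):

```bash
curl "http://localhost:8000/model?hf_id=meta-llama/Llama-2-7b"
```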
- 160 lines of Python
- Zero frameworks — just stdlib `http.server` + `httpx`
- 1 dependency — `httpx[http2]`
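For a sense of what "zero frameworks" means in practice, here is a minimal sketch of a JSON endpoint built on nothing but stdlib `http.server`. This is not the actual server_embedded.py; the port, handler name, and stubbed response are placeholders.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urlparse(self.path)
        if parsed.path != "/model":
            self.send_error(404)
            return
        hf_id = parse_qs(parsed.query).get("hf_id", [""])[0]
        # A real handler would run the estimation here; this one just echoes the model id.
        body = json.dumps({"model": hf_id}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()
```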
Built on memory estimation logic from hf-mem by @alvarobartt.
MIT