Add LLMKube to inference section #277

Open
Defilan wants to merge 1 commit into tensorchord:main from Defilan:add-llmkube

Defilan commented Mar 5, 2026

Adding LLMKube to the inference section.

LLMKube is a Kubernetes operator for llama.cpp-native LLM inference with:

  • CRD-based model and inference service management
  • NVIDIA CUDA and Apple Silicon Metal GPU support
  • Multi-GPU layer sharding
  • Pre-flight memory validation
  • Helm chart, Prometheus metrics, OpenAI-compatible API
  • Apache 2.0 license

I'm the creator and maintainer.

Signed-off-by: Christopher Maher <chris@mahercode.io>