Kithara v0.0.10 Release

@wenxindongwork wenxindongwork released this 25 Mar 15:59

Highlights

  • Llama 3.1 is now supported -- try it out with the "hf://meta-llama/Llama-3.1-8B" model handle :)
  • MaxText model inference with KV cache -- MaxTextModel.generate("hi") is now much faster!
  • Serve with vLLM on TPUs or GPUs -- check our docs to see how to serve Kithara-tuned models with vLLM
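The KV-cache speedup mentioned above comes from avoiding repeated work during autoregressive decoding. As a rough illustration (this is a toy sketch, not Kithara or MaxText internals; `project` is a hypothetical stand-in for the per-token key/value projection), compare the amount of projection work with and without a cache:

```python
# Toy illustration of why a KV cache speeds up autoregressive decoding:
# with a cache, each decoding step computes only the newest token's
# key/value and appends it, instead of re-encoding the whole prefix.

def project(token):
    # Hypothetical stand-in for one token's key/value projection.
    return (token * 2, token * 3)

def decode_without_cache(tokens):
    # Recomputes keys/values for the full prefix at every step: O(n^2) work.
    work = 0
    for step in range(1, len(tokens) + 1):
        kv = [project(t) for t in tokens[:step]]
        work += len(kv)
    return work

def decode_with_cache(tokens):
    # Computes each token's key/value once and reuses it: O(n) work.
    cache, work = [], 0
    for t in tokens:
        cache.append(project(t))  # only the new token is projected
        work += 1
    return work

tokens = list(range(8))
print(decode_without_cache(tokens))  # 36 projection calls (1+2+...+8)
print(decode_with_cache(tokens))     # 8 projection calls
```

For an 8-token sequence the uncached loop performs 36 projections versus 8 with the cache, and the gap grows quadratically with sequence length.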

What's Changed