You can use the 70b parameter model now as well; here is how I accomplished it:
- Downloaded the 70b parameter model I wanted from https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/tree/main. In my case, I chose 'llama-2-70b-chat.ggmlv3.q5_K_M.bin'. None of my runs so far have used much more than 6-8GB of RAM. You need to modify 'config/config.yml' to point to your newly downloaded model.
- Updated the CTransformers package to the latest version, which adds support for 70b (ctransformers-0.2.15 or higher):
  ```shell
  poetry run pip install ctransformers --upgrade
  ```
- Also updated langchain (I had done this first, but I'm not sure it's required):
  ```shell
  poetry run pip install langchain --upgrade
  ```
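
For the first step, the config change might look something like this. The key names here are placeholders guessed from typical setups, not taken from this repo, so check your own 'config/config.yml' for the actual schema:

```yaml
# Sketch only: point the model path at the downloaded 70b file.
# Key names are assumptions; match them to your repo's config schema.
model_path: models/llama-2-70b-chat.ggmlv3.q5_K_M.bin
model_type: llama
```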
Now it runs! Much slower, though (what took under a minute now takes almost 10 minutes).
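
Since 70b support only landed in ctransformers 0.2.15, it can be worth checking the installed version before kicking off a long model load. This is just an illustrative stdlib helper, not part of the repo, and it assumes plain numeric version strings:

```python
# Sanity-check that the installed ctransformers is new enough for 70b models.
# 0.2.15 is the minimum mentioned above; the helper itself is hypothetical.
from importlib.metadata import version, PackageNotFoundError

MIN_CTRANSFORMERS = (0, 2, 15)

def parse_version(v: str) -> tuple:
    """Turn '0.2.15' into (0, 2, 15) for tuple comparison.

    Assumes plain numeric components (no 'rc1'/'post1' suffixes).
    """
    return tuple(int(part) for part in v.split(".")[:3])

def supports_70b(installed: str) -> bool:
    """True if this ctransformers version should handle 70b GGML models."""
    return parse_version(installed) >= MIN_CTRANSFORMERS

try:
    print(supports_70b(version("ctransformers")))
except PackageNotFoundError:
    print("ctransformers is not installed")
```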