Description
Required Pre-requisites
- I have read the Documentation
- I have searched the Issue Tracker and Discussions to confirm this hasn't been reported yet.
- Consider asking in Discussions first
Motivation
Be able to use different OpenAI-compatible providers or services.
For example, using Azure OpenAI or a self-hosted OpenAI-compatible inference service like LM Studio or Ollama. Also be able to change the embedding model if needed.
Proposed Solution
The suggested solution is to abstract the creation of the OpenAI client into a new singleton factory method in the common folder, e.g. "openai_client.py".
The OpenAI client supports passing additional arguments to control a non-default base_url, Azure endpoint, API version, project, and organization.
- The singleton factory can use the additional OpenAI variables while creating the client, keeping them abstracted from the other Python files that use the client.
- Abstract caching of multiple clients based on their API keys, to support separate CLI vs server clients (assuming we connect to the same inference server, but using different application keys).
- Add an extra environment variable to specify which client_type we are creating (openai vs azure), so the client is created with either OpenAI or AzureOpenAI.
Advantages:
- Adds an easily configurable way to switch between OpenAI providers.
- Developers with no OpenAI access can use locally hosted services like Ollama / LM Studio / other providers for local development.
Notes for locally hosted OpenAI-compatible provider services and docker compose:
- If you are considering non-OpenAI models, make sure you select an embedding model that supports 1024 dimensions to match the postgres/pgvector needs.
- If you are using Ollama/LM Studio/another self-hosted provider, you need to make it accessible to your backend server within the docker network.
- For Ollama, if you use docker compose for development, you may need to run ollama after setting this environment variable on your machine (export OLLAMA_HOST="0.0.0.0:11434") so it is reachable from a non-localhost IP.
- For LM Studio: you will need to enable "Serve on Local Network" so it listens on a non-localhost IP -> 0.0.0.0:1234
I have already made changes locally to test running the backend server with Azure OpenAI, and connected it to Ollama to use a different model ("mxbai-embed-large").
To use embeddings from non-standard OpenAI services with pgvector, you need to choose a compatible embedding model that supports dimension = 1024.
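Since a pgvector column is declared with a fixed size (e.g. vector(1024)), a mismatched embedding model fails only at insert time. A small guard like the following, a sketch with a hypothetical `validate_embedding` helper, could fail fast with a clearer message:

```python
# Hypothetical guard before writing embeddings to a pgvector column
# declared as vector(1024): reject vectors whose dimension does not match.
PGVECTOR_DIM = 1024  # must match the column definition in Postgres


def validate_embedding(vector: list[float], expected_dim: int = PGVECTOR_DIM) -> list[float]:
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {expected_dim}; "
            "pick a compatible embedding model (e.g. mxbai-embed-large)"
        )
    return vector
```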
```shell
# If using OpenAI, you can set these variables in `.env.local`
OPENAI_CLIENT_TYPE=openai
# OpenAI organization id
# OPENAI_ORG_ID=
# OpenAI project id
# OPENAI_PROJECT_ID=
## Keys
SERVER_OPENAI_API_KEY=<your openai key>
CLI_OPENAI_API_KEY=<your openai key>
```

```shell
# If using Azure, you can set these variables in `.env.local`
OPENAI_CLIENT_TYPE=azure
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
OPENAI_API_VERSION=2024-02-01
## Keys
SERVER_OPENAI_API_KEY=<your openai key>
CLI_OPENAI_API_KEY=<your openai key>
```

```shell
# If using an OpenAI-compatible API that needs a base URL (e.g. Ollama), you can set these variables in `.env.local`
OPENAI_CLIENT_TYPE=openai
OPENAI_BASE_URL=http://localhost:11434/v1
## Keys
SERVER_OPENAI_API_KEY=<your openai key>
CLI_OPENAI_API_KEY=<your openai key>
```