You will need the following installed on your system:
This application is designed to run in a Docker container while Ollama runs on the host machine. The application will automatically retry connecting to Ollama up to 10 times during startup.
1. Install Ollama on the host machine (if not already installed):

   - Download from https://ollama.ai/download
   - Follow the installation instructions for your platform

2. Start Ollama on the host machine:

   ```bash
   ollama serve
   ```

3. Verify Ollama is accessible (optional but recommended):

   ```bash
   python startup_check.py
   ```

   This script will check if Ollama is accessible from within a container context. A minimal sketch of such a check appears after this list.

4. Pull required models (if needed):

   ```bash
   ollama pull llama3:8b
   ```
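For reference, a connectivity check along these lines takes only a few lines of Python. This is a minimal sketch using the standard library and Ollama's `/api/tags` endpoint; the actual `startup_check.py` may differ:

```python
import sys
import urllib.request
import urllib.error

# Candidate base URLs: host.docker.internal works from inside a
# container on Docker Desktop; localhost works on the host itself.
CANDIDATES = [
    "http://host.docker.internal:11434",
    "http://localhost:11434",
]

def check_ollama() -> bool:
    """Return True if any candidate Ollama endpoint responds."""
    for base in CANDIDATES:
        try:
            # /api/tags lists installed models and is a cheap liveness probe
            with urllib.request.urlopen(f"{base}/api/tags", timeout=5) as resp:
                if resp.status == 200:
                    print(f"Ollama reachable at {base}")
                    return True
        except OSError as exc:
            print(f"Could not reach {base}: {exc}")
    return False

if __name__ == "__main__":
    sys.exit(0 if check_ollama() else 1)
```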
This application uses token-based authentication to secure access to the LLM endpoints. API tokens are stored in an environment variable as a JSON array and must be included in request headers using the standard Bearer token format.
1. Generate API tokens using the provided utility:

   ```bash
   python generate_api_key.py
   ```

   This will generate secure API tokens and provide setup instructions.

2. Configure API tokens using one of these methods:

   Option A: Using a .env file

   ```bash
   # Create a .env file in the project root
   echo 'API_KEYS=[{"appname":"APP1","key":"your-token-here"},{"appname":"WEBAPP","key":"another-token-here"}]' > .env
   ```

   Option B: Using system environment variables

   ```bash
   export API_KEYS='[{"appname":"APP1","key":"your-token-here"},{"appname":"WEBAPP","key":"another-token-here"}]'
   ```

   The JSON format is designed to work with external environment variable management tools. Each object in the array contains an `appname` and a `key` property.

3. Using API tokens in requests:

   Include the token in the `Authorization` header using the Bearer format:

   ```bash
   curl -X POST "http://host.docker.internal:5000/query" \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer your-token-here" \
     -d '{"query": "Hello, how are you?"}'
   ```
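The same request from Python, using the third-party `requests` library (URL and token are placeholders to adjust for your setup):

```python
import requests

API_URL = "http://host.docker.internal:5000/query"  # adjust host/port as needed
TOKEN = "your-token-here"                           # one of the configured keys

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"query": "Hello, how are you?"},  # json= sets Content-Type automatically
    timeout=60,
)
response.raise_for_status()
print(response.json())
```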
- Keep your API tokens secure and don't share them
- Use different tokens for different applications
- Rotate tokens periodically for better security
- Never commit API tokens to version control
- The health check endpoints (`/echo`, `/up`, `/health/ollama`) do not require authentication
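For the curious, token validation along these lines can be implemented as a small helper. This is a sketch, not the application's actual code: it assumes the `API_KEYS` JSON format shown above and uses `secrets.compare_digest` for a timing-safe comparison.

```python
import json
import os
import secrets

def load_api_keys() -> dict[str, str]:
    """Parse the API_KEYS env var into {appname: key}."""
    raw = os.environ.get("API_KEYS", "[]")
    return {entry["appname"]: entry["key"] for entry in json.loads(raw)}

def authenticate(auth_header: str | None) -> str | None:
    """Return the app name for a valid 'Bearer <token>' header, else None."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return None
    token = auth_header[len("Bearer "):]
    for appname, key in load_api_keys().items():
        if secrets.compare_digest(token, key):  # timing-safe comparison
            return appname
    return None
```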
If you would like to update the API, please follow the instructions below.
1. Create a local virtual environment and activate it:

   ```bash
   python -m venv .venv
   source .venv/bin/activate  # on Windows use .venv\Scripts\activate
   ```

   If you are using Anaconda, you can create a virtual environment with:

   ```bash
   conda create -n prompt-gateway-dev-env python=3.13
   conda activate prompt-gateway-dev-env
   ```

2. Install the dependencies for this package:

   ```bash
   pip install -r requirements.txt
   ```
Important: Ensure Ollama is running on the host machine before starting the application. The app will automatically retry connecting to Ollama up to 10 times during startup.
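The retry behaviour mentioned above can be pictured as a simple bounded loop. A minimal sketch (the delay and probe endpoint are assumptions; the app's actual implementation may differ):

```python
import time
import urllib.request

OLLAMA_URL = "http://host.docker.internal:11434/api/tags"  # assumed probe endpoint
MAX_RETRIES = 10   # matches the documented retry count
DELAY_SECONDS = 3  # assumed delay between attempts

def wait_for_ollama() -> bool:
    """Poll Ollama until it responds or retries are exhausted."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            with urllib.request.urlopen(OLLAMA_URL, timeout=5) as resp:
                if resp.status == 200:
                    print(f"Ollama is up (attempt {attempt})")
                    return True
        except OSError:
            print(f"Attempt {attempt}/{MAX_RETRIES} failed; retrying...")
            time.sleep(DELAY_SECONDS)
    return False
```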
For developer mode:

```bash
python app.py --host $HOST --port $PORT
```

or

```bash
flask run --debug
```

For production mode:

```bash
python3 app.py --host $HOST --port $PORT
```

When running in a Docker container, ensure that:
- Ollama is running on the host machine (not in the container)
- The container can access host.docker.internal:11434 (Ollama's default port)
- Docker network configuration allows host access
- API tokens are properly configured as environment variables
Example Docker run command:

```bash
docker run -p 5000:5000 \
  -e API_KEYS='[{"appname":"APP1","key":"your-token-here"},{"appname":"WEBAPP","key":"another-token-here"}]' \
  --network host \
  your-app-image
```
Depending on your OS, you may need to configure Ollama to listen on an address the container can reach, not just the loopback address. One place to configure this is /etc/systemd/system/ollama.service. A sample config is shown below:

```ini
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
Environment="OLLAMA_HOST=0.0.0.0"

[Install]
WantedBy=default.target
```

The application provides several health check endpoints:
- `/up` - Basic application health check
- `/health/ollama` - Check if Ollama is running and accessible
- `/query` - Test the LLM functionality (requires Ollama to be running and API token authentication)
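A quick way to exercise the unauthenticated health checks from Python (host and port are placeholders):

```python
import requests

BASE = "http://localhost:5000"  # adjust to where the app is running

# Neither health endpoint requires an Authorization header.
for path in ("/up", "/health/ollama"):
    resp = requests.get(f"{BASE}{path}", timeout=10)
    print(path, resp.status_code, resp.text[:200])
```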
If you encounter issues with Ollama connectivity:
1. Check if Ollama is running on the host:

   ```bash
   curl http://host.docker.internal:11434/api/tags
   ```

2. Verify the container can access the host:

   ```bash
   python startup_check.py
   ```

3. Check the application logs for detailed error messages about Ollama connectivity issues.

4. Ensure the required model is available (a sketch that checks this programmatically appears after this list):

   ```bash
   ollama list
   ```

5. Docker network issues: if the container cannot access host.docker.internal:11434, try:

   - Using the `--network host` flag
   - Or mapping the port: `-p 11434:11434`
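The model check from step 4 can also be done over HTTP against Ollama's `/api/tags` endpoint. A minimal sketch (the model name is the one pulled earlier):

```python
import json
import urllib.request

OLLAMA_TAGS = "http://host.docker.internal:11434/api/tags"
REQUIRED_MODEL = "llama3:8b"

# /api/tags returns a JSON object with a "models" list
with urllib.request.urlopen(OLLAMA_TAGS, timeout=5) as resp:
    models = [m["name"] for m in json.load(resp)["models"]]

print("Installed models:", models)
if REQUIRED_MODEL not in models:
    print(f"Missing {REQUIRED_MODEL}; run: ollama pull {REQUIRED_MODEL}")
```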
If you encounter authentication issues:
1. Verify API tokens are configured:

   ```bash
   python -c "from config import get_api_keys; print(get_api_keys())"
   ```

2. Check that the Authorization header is included in your requests.

3. Ensure the token matches one of the keys configured in the API_KEYS environment variable.
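To narrow down an authentication problem, it can help to compare an authenticated and an unauthenticated request side by side. A small sketch (URL and token are placeholders):

```python
import requests

URL = "http://localhost:5000/query"  # adjust as needed
TOKEN = "your-token-here"
payload = {"query": "ping"}

# Without a token: expect a 401-style rejection if auth is enforced.
no_auth = requests.post(URL, json=payload, timeout=30)
print("no token:", no_auth.status_code)

# With a token: a 200 means the token matches a configured key.
with_auth = requests.post(
    URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
print("with token:", with_auth.status_code)
```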