Add OpenAI-compatible backends (closes #1) #2
- OpenAICompatibleBackend: generic backend for OpenRouter, Groq, Grok, Cerebras, Gemini, and any custom OpenAI-compatible endpoint. Provider presets wire up base_url, API key env var, and default model.
- GeminiServiceAccountBackend: exchanges a service account JSON via google-auth for a Bearer token, then delegates to OpenAICompatibleBackend at Gemini's OpenAI-compat endpoint.
- get_backend() now handles openrouter, groq, grok, cerebras, gemini-sa, openai-compat.
- cli.py: 6 new --backend choices + --base-url flag for custom endpoints.
- watch.py: threads base_url through to get_backend() via backend_kwargs.
- pyproject.toml: new google-auth optional group, added to [all].
- .env.example: documented all new provider env vars.
- 30 new tests in test_openai_compat_backend.py (all mocked).
- README: updated backends table and CLI usage examples.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
mnvsk97
left a comment
Review: Doesn't fit the codebase yet — needs a rethink
The idea is solid (OpenAI-compat API is the universal adapter), but the execution duplicates existing code and has a runtime bug. Here's what needs to change:
1. OpenAICompatibleBackend is a copy-paste of OpenAIBackend
analyze_image, analyze_video, analyze_audio, generate are line-for-line identical to OpenAIBackend. The only difference is the constructor.
Fix: Add base_url and api_key_env params to the existing OpenAIBackend + the provider preset dict. That's ~20 lines, not a new 100-line class.
2. GeminiServiceAccountBackend is redundant
GeminiBackend already handles service accounts via _load_service_account() — it checks GOOGLE_APPLICATION_CREDENTIALS, loads credentials, and creates a Vertex AI client with the native Gemini SDK (which supports direct video upload + audio).
GeminiServiceAccountBackend re-implements this but worse — it goes through the OpenAI-compat endpoint, losing supports_video=True and native audio. It's a strict downgrade. Drop the whole class + gemini-sa backend name.
3. supports_audio = True is a bug for most providers
analyze_audio calls client.audio.transcriptions.create(model="whisper-1") — that's OpenAI's Whisper endpoint. Groq, Cerebras, Grok, and OpenRouter don't have this. This will crash at runtime when the orchestrator tries audio transcription for these backends.
4. "gemini" in the provider preset map is confusing
_OPENAI_COMPAT_PROVIDERS includes "gemini", but there's already a GeminiBackend using the native SDK. If someone hits this path they get degraded Gemini (no video upload, no native audio). Remove it — native GeminiBackend is strictly better.
5. 593 lines for what should be ~40-60
The minimal version:
- Add base_url, api_key_env params to OpenAIBackend.__init__
- Add the _OPENAI_COMPAT_PROVIDERS dict (~10 lines)
- Wire the factory + CLI (~10 lines)
- Set supports_audio = False for providers without Whisper
- Drop GeminiServiceAccountBackend entirely
The core insight — "most vision APIs speak OpenAI protocol, so one backend + a preset dict covers them all" — is correct and welcome. The execution just needs to build on what's already there instead of duplicating it.
- Drop OpenAICompatibleBackend and GeminiServiceAccountBackend (the latter is already handled better by the native GeminiBackend + GOOGLE_APPLICATION_CREDENTIALS)
- Extend OpenAIBackend with base_url/api_key_env/has_whisper params so all OpenAI-compatible providers share one class without duplication
- Add _OPENAI_COMPAT_PROVIDERS preset dict (4 providers, ~10 lines)
- has_whisper=False for all compat providers; only OpenAI proper has Whisper
- Remove "gemini" from compat presets (native GeminiBackend is strictly better)
- Drop google-auth optional dependency group (no longer needed)
- Update tests and regex matches for new error messages

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- OpenAICompatibleBackend: generic backend for any OpenAI-compatible API. Provider presets (openrouter, groq, grok, cerebras, gemini) wire up the correct base URL, API key env var, and default model automatically. Also supports arbitrary custom endpoints via base_url=.
- GeminiServiceAccountBackend: exchanges a service account JSON (GOOGLE_APPLICATION_CREDENTIALS) for a short-lived Bearer token via google-auth, then delegates to OpenAICompatibleBackend at Gemini's OpenAI-compat endpoint.
- get_backend() updated to handle all new names: openrouter, groq, grok, cerebras, gemini-sa, openai-compat.
- cli.py: 6 new --backend choices + --base-url flag for custom endpoints (auto-selects openai-compat when only --base-url is given).
- watch.py: threads base_url through to get_backend().
- pyproject.toml: new google-auth optional group, added to [all].
- .env.example: documents all new provider env vars.
- tests/test_openai_compat_backend.py: all mocked, no real API calls.
- README.md: updated backends table and CLI usage examples.

Test plan
- pytest tests/test_openai_compat_backend.py: 30 new tests, all passing
- pytest tests/test_backend.py: 33 existing backend tests, all still passing
- pytest: full suite, 188 passed, 0 failed

🤖 Generated with Claude Code