
Add OpenAI-compatible backends (closes #1) #2

Merged
mnvsk97 merged 2 commits into main from feature/openai-compatible-backends
Apr 13, 2026

Conversation

@diveshjain2019-dev
Collaborator

Summary

  • OpenAICompatibleBackend: generic backend for any OpenAI-compatible API. Provider presets (openrouter, groq, grok, cerebras, gemini) wire up the correct base URL, API key env var, and default model automatically. Also supports arbitrary custom endpoints via base_url=.
  • GeminiServiceAccountBackend: exchanges a service account JSON (GOOGLE_APPLICATION_CREDENTIALS) for a short-lived Bearer token via google-auth, then delegates to OpenAICompatibleBackend at Gemini's OpenAI-compat endpoint.
  • get_backend() updated to handle all new names: openrouter, groq, grok, cerebras, gemini-sa, openai-compat.
  • cli.py: 6 new --backend choices + --base-url flag for custom endpoints (auto-selects openai-compat when only --base-url is given).
  • watch.py: threads base_url through to get_backend().
  • pyproject.toml: new google-auth optional group, added to [all].
  • .env.example: documents all new provider env vars.
  • 30 new tests in tests/test_openai_compat_backend.py — all mocked, no real API calls.
  • README.md: updated backends table and CLI usage examples.

Test plan

  • pytest tests/test_openai_compat_backend.py — 30 new tests, all passing
  • pytest tests/test_backend.py — 33 existing backend tests, all still passing
  • pytest — full suite: 188 passed, 0 failed

🤖 Generated with Claude Code

- OpenAICompatibleBackend: generic backend for OpenRouter, Groq, Grok,
  Cerebras, Gemini, and any custom OpenAI-compatible endpoint.
  Provider presets wire up base_url, API key env var, and default model.
- GeminiServiceAccountBackend: exchanges a service account JSON via
  google-auth for a Bearer token, then delegates to OpenAICompatibleBackend
  at Gemini's OpenAI-compat endpoint.
- get_backend() now handles openrouter, groq, grok, cerebras, gemini-sa,
  openai-compat.
- cli.py: 6 new --backend choices + --base-url flag for custom endpoints.
- watch.py: threads base_url through to get_backend() via backend_kwargs.
- pyproject.toml: new google-auth optional group, added to [all].
- .env.example: documented all new provider env vars.
- 30 new tests in test_openai_compat_backend.py (all mocked).
- README: updated backends table and CLI usage examples.
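The "auto-selects openai-compat when only --base-url is given" behavior from the cli.py bullet can be sketched with stdlib argparse. The exact choice list and flag handling in the real cli.py may differ; this only illustrates the fallback rule.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    # Sketch of the cli.py change: if --base-url is given without an
    # explicit --backend, fall back to the generic openai-compat backend.
    parser = argparse.ArgumentParser()
    parser.add_argument("--backend", choices=[
        "openai", "openrouter", "groq", "grok", "cerebras",
        "gemini-sa", "openai-compat",
    ])
    parser.add_argument("--base-url", dest="base_url")
    args = parser.parse_args(argv)
    if args.backend is None and args.base_url is not None:
        args.backend = "openai-compat"
    return args
```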

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mnvsk97 (Owner) left a comment


Review: Doesn't fit the codebase yet — needs a rethink

The idea is solid (OpenAI-compat API is the universal adapter), but the execution duplicates existing code and has a runtime bug. Here's what needs to change:


1. OpenAICompatibleBackend is a copy-paste of OpenAIBackend

analyze_image, analyze_video, analyze_audio, generate are line-for-line identical to OpenAIBackend. The only difference is the constructor.

Fix: Add base_url and api_key_env params to the existing OpenAIBackend + the provider preset dict. That's ~20 lines, not a new 100-line class.

2. GeminiServiceAccountBackend is redundant

GeminiBackend already handles service accounts via _load_service_account() — it checks GOOGLE_APPLICATION_CREDENTIALS, loads credentials, and creates a Vertex AI client with the native Gemini SDK (which supports direct video upload + audio).

GeminiServiceAccountBackend re-implements this but worse — it goes through the OpenAI-compat endpoint, losing supports_video=True and native audio. It's a strict downgrade. Drop the whole class + gemini-sa backend name.

3. supports_audio = True is a bug for most providers

analyze_audio calls client.audio.transcriptions.create(model="whisper-1") — that's OpenAI's Whisper endpoint. Groq, Cerebras, Grok, and OpenRouter don't have this. This will crash at runtime when the orchestrator tries audio transcription for these backends.
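One way to turn that runtime crash into a clear, early error is a per-provider capability flag, roughly as below. The class and flag names are illustrative, and `_transcribe` stands in for the real Whisper call.

```python
class AudioNotSupported(RuntimeError):
    """Raised when a backend has no transcription endpoint."""


class CompatAudioMixin:
    # Hypothetical capability flag replacing an unconditional
    # supports_audio = True; most OpenAI-compatible providers lack Whisper.
    has_whisper = False

    def analyze_audio(self, path: str) -> str:
        if not self.has_whisper:
            raise AudioNotSupported(
                f"{type(self).__name__} exposes no audio transcription "
                "endpoint; only OpenAI proper serves whisper-1")
        return self._transcribe(path)  # real code: client.audio.transcriptions
```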

4. "gemini" in the provider preset map is confusing

_OPENAI_COMPAT_PROVIDERS includes "gemini", but there's already a GeminiBackend using the native SDK. If someone hits this path they get degraded Gemini (no video upload, no native audio). Remove it — native GeminiBackend is strictly better.

5. 593 lines for what should be ~40-60

The minimal version:

  • Add base_url, api_key_env params to OpenAIBackend.__init__
  • Add the _OPENAI_COMPAT_PROVIDERS dict (~10 lines)
  • Wire the factory + CLI (~10 lines)
  • Set supports_audio = False for providers without Whisper
  • Drop GeminiServiceAccountBackend entirely

The core insight — "most vision APIs speak OpenAI protocol, so one backend + a preset dict covers them all" — is correct and welcome. The execution just needs to build on what's already there instead of duplicating it.

- Drop OpenAICompatibleBackend and GeminiServiceAccountBackend (the latter
  is already handled better by the native GeminiBackend + GOOGLE_APPLICATION_CREDENTIALS)
- Extend OpenAIBackend with base_url/api_key_env/has_whisper params so all
  OpenAI-compatible providers share one class without duplication
- Add _OPENAI_COMPAT_PROVIDERS preset dict (4 providers, ~10 lines)
- has_whisper=False for all compat providers — only OpenAI proper has Whisper
- Remove "gemini" from compat presets (native GeminiBackend is strictly better)
- Drop google-auth optional dependency group (no longer needed)
- Update tests and regex matches for new error messages

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mnvsk97 mnvsk97 merged commit bcc1233 into main Apr 13, 2026
4 checks passed