fix(vertex_ai): add 'audio' to supported OpenAI params for TTS #21869
edwiniac wants to merge 2 commits into BerriAI:main
The `audio` parameter was missing from `VertexGeminiConfig.get_supported_openai_params()`, causing it to be filtered out before reaching `map_openai_params()`. This broke TTS functionality via `/v1/audio/speech` for vertex_ai Gemini models. The mapping logic in `map_openai_params()` already correctly transforms `audio` → `speechConfig`, but the parameter was never reaching that code because it wasn't in the supported list. Fixes BerriAI#21702
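The filter-then-map behaviour described above can be illustrated with a minimal sketch. The function names `filter_supported` and `map_params`, the sample request, and the pass-through mapping are hypothetical stand-ins chosen for illustration; they are not LiteLLM's actual implementation.

```python
from typing import Any, Dict, List


# Hypothetical stand-in for the "supported params" filtering step.
def filter_supported(params: Dict[str, Any], supported: List[str]) -> Dict[str, Any]:
    """Keep only the parameters whose names appear in the supported list."""
    return {name: value for name, value in params.items() if name in supported}


# Hypothetical stand-in for the mapping step. It only renames the key;
# the real provider payload shape is not modelled here.
def map_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Rename `audio` to `speechConfig`, passing the value through unchanged."""
    mapped: Dict[str, Any] = {}
    if "audio" in params:
        mapped["speechConfig"] = params["audio"]
    return mapped


# Illustrative request; the voice value is a placeholder.
request = {"audio": {"voice": "example-voice"}, "modalities": ["audio"]}

# Before the fix: "audio" is not in the supported list, so it is dropped
# before the mapping step ever sees it.
before_fix = map_params(filter_supported(request, ["modalities"]))

# After the fix: "audio" survives the filter and is mapped.
after_fix = map_params(filter_supported(request, ["modalities", "audio"]))

print(before_fix)  # {}
print(after_fix)   # {'speechConfig': {'voice': 'example-voice'}}
```

In this toy version the mapping code is identical in both runs; only the contents of the supported list decide whether the parameter reaches it.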
Greptile Summary
This PR fixes a bug where the
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py | Adds "audio" to the supported OpenAI params list. The fix is correct for TTS models but is applied unconditionally, unlike the sibling GoogleAIStudioGeminiConfig which gates it on TTS models only. |
| tests/test_litellm/test_utils.py | Adds a unit test asserting "audio" is in supported params for the TTS model. Test is local-only (no network calls) and follows existing patterns in the test file. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Client sends /v1/audio/speech\nwith audio param"] --> B["get_optional_params()"]
B --> C["get_supported_openai_params()"]
C --> D{{"'audio' in\nsupported_params?"}}
D -->|"No (before fix)"| E["audio param filtered out"]
E --> F["map_openai_params() never\nsees audio param"]
F --> G["400 INVALID_ARGUMENT"]
D -->|"Yes (after fix)"| H["audio param kept"]
H --> I["map_openai_params()\naudio → speechConfig"]
I --> J["Vertex AI TTS succeeds"]
Last reviewed commit: 890b56e
    "logprobs",
    "top_logprobs",
    "modalities",
    "audio",
audio added unconditionally for all models
The sibling GoogleAIStudioGeminiConfig (in litellm/llms/gemini/chat/transformation.py:98) conditionally adds "audio" only for TTS models via is_model_gemini_audio_model(model) (which checks "tts" in model). This PR adds "audio" unconditionally to VertexGeminiConfig for all Gemini models, which is inconsistent.
While this won't cause a runtime error (the mapping in map_openai_params handles it safely), it means get_supported_openai_params("gemini-1.5-pro") will report "audio" as supported even for non-TTS models, which is misleading.
Consider guarding this the same way GoogleAIStudioGeminiConfig does:
    - "audio",
    + "audio" if "tts" in model else None,
Or better yet, add it conditionally after the list like the penalty params:
    if "tts" in model:
        supported_params.append("audio")

Address review feedback: make audio param addition consistent with GoogleAIStudioGeminiConfig by only adding it when 'tts' is in model name.
Thanks for the review feedback! Updated to add
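A rough sketch of the guarded approach the review asks for, assuming a simplified free function rather than the real `VertexGeminiConfig` method. The name `get_supported_params`, the shortened base list, and the TTS model name are illustrative assumptions; only `gemini-1.5-pro` and the `"tts" in model` check come from the review.

```python
from typing import List


# Hypothetical, simplified stand-in for get_supported_openai_params().
def get_supported_params(model: str) -> List[str]:
    """Return the supported parameter names, adding "audio" only for TTS models."""
    # Shortened base list for illustration; the real list is longer.
    supported = ["logprobs", "top_logprobs", "modalities"]
    # Same substring check the review describes for the sibling config.
    if "tts" in model:
        supported.append("audio")
    return supported


# "example-gemini-tts" is a made-up model name used only to trigger the check.
print("audio" in get_supported_params("example-gemini-tts"))  # True
print("audio" in get_supported_params("gemini-1.5-pro"))      # False
```

Appending after the list is built also avoids a side effect of the inline form `"audio" if "tts" in model else None`, which would leave a `None` entry in the list for non-TTS models.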
Problem
`/v1/audio/speech` fails with `400 INVALID_ARGUMENT` when using `vertex_ai/` Gemini TTS models.
Root Cause
`VertexGeminiConfig.get_supported_openai_params()` does not include `"audio"` in its return list. The `audio` parameter is silently filtered out before reaching `map_openai_params()`, which already has the correct `audio` → `speechConfig` mapping.
Fix
Add `"audio"` to the list of supported OpenAI params in `VertexGeminiConfig.get_supported_openai_params()`.
Changes
- `litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py`: Added `"audio"` to supported params list
- `tests/test_litellm/test_utils.py`: Added test case to verify audio param is supported
Testing
The mapping logic already exists and works; this fix just ensures the param reaches it.
Fixes #21702