Commit ca00270

fix: Update Google Gemini pipeline for enhanced model compatibility and thinking level validation
1 parent 3ccac34 commit ca00270

2 files changed: 142 additions & 79 deletions

docs/google-gemini-integration.md

Lines changed: 57 additions & 51 deletions
@@ -167,9 +167,9 @@ GOOGLE_INCLUDE_THOUGHTS=true
 GOOGLE_THINKING_BUDGET=-1

 # Thinking level for Gemini 3 models only
-# Valid values: "low", "high", or empty string for model default
-# - "low": Minimizes latency and cost, suitable for simple tasks
-# - "high": Maximizes reasoning depth, ideal for complex problem-solving
+# Most Gemini 3 models accept "low" or "high"
+# gemini-3.1-flash-image-preview accepts "minimal" or "high"
+# The pipeline automatically maps unsupported values to the closest supported level
 # Default: "" (empty, uses model default)
 # Note: This setting is ignored for non-Gemini 3 models
 GOOGLE_THINKING_LEVEL=""
@@ -229,10 +229,10 @@ VERTEX_AI_RAG_STORE="projects/your-project/locations/global/collections/default_

 ## Image Generation Configuration

-The Google Gemini pipeline supports configurable aspect ratios and resolutions for image generation with **Gemini 3 image models** (e.g., `gemini-3-pro-image-preview`, `gemini-3-flash-image-preview`).
+The Google Gemini pipeline supports configurable aspect ratios and resolutions for image generation with **Gemini 3/3.1 image models** (e.g., `gemini-3.1-flash-image-preview`, `gemini-3-pro-image-preview`, `gemini-3-flash-image-preview`).

 > [!IMPORTANT]
-> **Model Compatibility**: The `aspect_ratio` and `image_size` parameters (ImageConfig) are **only supported by Gemini 3 image models**. Gemini 2.5 image models (e.g., `gemini-2.5-flash-image-preview`) support image generation but do not support these configuration parameters. When using Gemini 2.5 image models, default aspect ratio and resolution will be used automatically.
+> **Model Compatibility**: The `aspect_ratio` and `image_size` parameters (ImageConfig) are **only supported by Gemini 3/3.1 image models**. Gemini 2.5 image models (e.g., `gemini-2.5-flash-image-preview`) support image generation but do not support these configuration parameters. When using Gemini 2.5 image models, default aspect ratio and resolution will be used automatically.

 ### Aspect Ratio

@@ -334,13 +334,14 @@ for part in response.parts:

 ### Model Compatibility

-| Model | ImageConfig Support (aspect_ratio, image_size) |
-| ------------------------- | ----------------------------------------------- |
-| gemini-3-pro-image-\* | ✅ Supported |
-| gemini-3-flash-image-\* | ✅ Supported |
-| gemini-2.5-flash-image-\* | ❌ Not supported (uses defaults) |
-| Other gemini-3-\* models | ❌ Not image generation models |
-| Other models | ❌ Not image generation models |
+| Model | ImageConfig Support (aspect_ratio, image_size) |
+| --------------------------- | ---------------------------------------------- |
+| gemini-3.1-flash-image-\* | ✅ Supported |
+| gemini-3-pro-image-\* | ✅ Supported |
+| gemini-3-flash-image-\* | ✅ Supported |
+| gemini-2.5-flash-image-\* | ❌ Not supported (uses defaults) |
+| Other gemini-3 / gemini-3.1 | ❌ Not image generation models |
+| Other models | ❌ Not image generation models |

 ## Video Generation Configuration

@@ -351,32 +352,32 @@ The Google Gemini pipeline supports video generation using **Google Veo models**

 ### Supported Models

-| Model ID | Description |
-| --------------------------------- | ------------------------------------- |
-| `veo-3.1-generate-preview` | Veo 3.1 — highest quality, 4k, reference images |
-| `veo-3.1-fast-generate-preview` | Veo 3.1 Fast — faster generation |
-| `veo-3-generate-preview` | Veo 3 — balanced quality |
-| `veo-3.0-fast-generate-001` | Veo 3 Fast |
-| `veo-2.0-generate-001` | Veo 2 — legacy model |
+| Model ID | Description |
+| ------------------------------- | ----------------------------------------------- |
+| `veo-3.1-generate-preview` | Veo 3.1 — highest quality, 4k, reference images |
+| `veo-3.1-fast-generate-preview` | Veo 3.1 Fast — faster generation |
+| `veo-3-generate-preview` | Veo 3 — balanced quality |
+| `veo-3.0-fast-generate-001` | Veo 3 Fast |
+| `veo-2.0-generate-001` | Veo 2 — legacy model |

 ### Per-Model Feature Support

 Not all parameters are supported by every Veo model. The pipeline automatically gates features based on the model used. Unsupported parameters are silently skipped to avoid API errors.

-| Feature | Veo 3.1 | Veo 3.1 Fast | Veo 3 | Veo 3 Fast | Veo 2 |
-| -------------------- | ----------------- | ----------------- | ----------------- | ----------------- | ----------------- |
-| Aspect Ratio | 16:9, 9:16 | 16:9, 9:16 | 16:9, 9:16 | 16:9, 9:16 | 16:9, 9:16 |
-| Resolution | 720p, 1080p, 4k | 720p, 1080p, 4k | 720p, 1080p | 720p, 1080p | |
-| Duration (seconds) | 4, 6, 8 | 4, 6, 8 | 8 only | 8 only | 5, 6, 8 |
-| Negative Prompt | Yes | Yes | Yes | Yes | Yes |
-| Person Generation | Yes | Yes | Yes | Yes | Yes |
-| Enhance Prompt | Yes | | Yes | | |
-| Image-to-Video | Yes | Yes | Yes | Yes | Yes |
-| Reference Images | ⚠️ API only¹ | ⚠️ API only¹ | | | |
-| Last Frame (interp.) | ⚠️ Not yet² | ⚠️ Not yet² | ⚠️ Not yet² | ⚠️ Not yet² | ⚠️ Not yet² |
-| Video Extension | ⚠️ Not yet² | ⚠️ Not yet² | | | |
-| Audio | Native | Native | Native | Native | Silent only |
-| Max Videos/Request | 1 | 1 | 1 | 1 | 2 |
+| Feature | Veo 3.1 | Veo 3.1 Fast | Veo 3 | Veo 3 Fast | Veo 2 |
+| -------------------- | --------------- | --------------- | ----------- | ----------- | ----------- |
+| Aspect Ratio | 16:9, 9:16 | 16:9, 9:16 | 16:9, 9:16 | 16:9, 9:16 | 16:9, 9:16 |
+| Resolution | 720p, 1080p, 4k | 720p, 1080p, 4k | 720p, 1080p | 720p, 1080p | |
+| Duration (seconds) | 4, 6, 8 | 4, 6, 8 | 8 only | 8 only | 5, 6, 8 |
+| Negative Prompt | Yes | Yes | Yes | Yes | Yes |
+| Person Generation | Yes | Yes | Yes | Yes | Yes |
+| Enhance Prompt | Yes | | Yes | | |
+| Image-to-Video | Yes | Yes | Yes | Yes | Yes |
+| Reference Images | ⚠️ API only¹ | ⚠️ API only¹ | | | |
+| Last Frame (interp.) | ⚠️ Not yet² | ⚠️ Not yet² | ⚠️ Not yet² | ⚠️ Not yet² | ⚠️ Not yet² |
+| Video Extension | ⚠️ Not yet² | ⚠️ Not yet² | | | |
+| Audio | Native | Native | Native | Native | Silent only |
+| Max Videos/Request | 1 | 1 | 1 | 1 | 2 |

 > ¹ The Veo API supports up to 3 reference images for Veo 3.1, but the pipeline currently only forwards a single attached image via the `image` parameter.
 >
@@ -568,8 +569,9 @@ When enabled, sources and google queries from the search used by Gemini will be
 The pipeline supports **Enterprise Web Search** for grounding, which provides organization-level management of search results.

 To enable Enterprise Search:
-1. Set `GOOGLE_USE_ENTERPRISE_SEARCH=true` (or toggle the Valve in the UI).
-2. Ensure `GOOGLE_GENAI_USE_VERTEXAI=true` (Enterprise Search is a Vertex AI feature).
+
+1. Set `GOOGLE_USE_ENTERPRISE_SEARCH=true` (or toggle the Valve in the UI).
+2. Ensure `GOOGLE_GENAI_USE_VERTEXAI=true` (Enterprise Search is a Vertex AI feature).

 When enabled, the pipeline will use the `enterprise_web_search` tool instead of the standard `google_search` tool whenever grounding is requested.

@@ -672,25 +674,30 @@ The Google Gemini pipeline supports advanced thinking configuration to control h

 Gemini 3 models support the `thinking_level` parameter, which controls the depth of reasoning:

-- **`"low"`**: Minimizes latency and cost, suitable for simple tasks, chat, or high-throughput APIs.
-- **`"high"`**: Maximizes reasoning depth, ideal for complex problem-solving, code analysis, and agentic workflows.
+- **Most Gemini 3 models**: support **`"low"`** and **`"high"`**.
+- **`gemini-3.1-flash-image-preview`**: supports **`"minimal"`** and **`"high"`**.

 > [!Note]
 > Gemini 3 models use `thinking_level` and do **not** use `thinking_budget`. The thinking budget setting is ignored for Gemini 3 models.

+If you configure an unsupported value for a specific model, the pipeline automatically falls back to the closest supported thinking level instead of sending an invalid API request.
+
 Set via environment variable:

 ```bash
-# Use low thinking level for faster responses
+# Use low thinking level for most Gemini 3 models
 GOOGLE_THINKING_LEVEL="low"

 # Use high thinking level for complex reasoning
 GOOGLE_THINKING_LEVEL="high"
+
+# Use minimal thinking level for gemini-3.1-flash-image-preview
+GOOGLE_THINKING_LEVEL="minimal"
 ```

 #### Per-Chat Override (Reasoning Effort)

-The per-chat `reasoning_effort` value can override the environment-level `GOOGLE_THINKING_LEVEL` setting. When a chat specifies a `reasoning_effort` value (e.g., "low" or "high"), it takes precedence over the global environment setting. This allows users to customize reasoning depth on a per-conversation basis.
+The per-chat `reasoning_effort` value can override the environment-level `GOOGLE_THINKING_LEVEL` setting. When a chat specifies a `reasoning_effort` value (for example, `"low"`, `"minimal"`, or `"high"`), it takes precedence over the global environment setting. This allows users to customize reasoning depth on a per-conversation basis.

 **Example API Usage:**

@@ -784,11 +791,11 @@ The pipeline automatically extracts token usage metadata from every Gemini respo

 ### What is tracked

-| Field | Description |
-| --- | --- |
-| `prompt_tokens` | Number of tokens in the input (messages + system prompt) |
-| `completion_tokens` | Number of tokens generated by the model |
-| `total_tokens` | Sum of prompt and completion tokens |
+| Field | Description |
+| ------------------- | -------------------------------------------------------- |
+| `prompt_tokens` | Number of tokens in the input (messages + system prompt) |
+| `completion_tokens` | Number of tokens generated by the model |
+| `total_tokens` | Sum of prompt and completion tokens |

 ### How it works

@@ -800,11 +807,10 @@ No additional configuration is required. Token usage is tracked automatically fo
 > [!NOTE]
 > Thinking tokens consumed during internal reasoning are **not** included in `completion_tokens` — they are captured separately by the Gemini API in `thoughts_token_count` but are not forwarded to Open WebUI at this time.

-### Model Compatibility
+### Thinking Compatibility

-| Model | thinking_level | thinking_budget |
-| ------------------------- | ---------------------------- | ---------------------- |
-| gemini-3-\* | ✅ Supported ("low", "high") | ❌ Not used |
-| gemini-2.5-\* | ❌ Not used | ✅ Supported (0-32768) |
-| gemini-2.5-flash-image-\* | ❌ Not supported | ❌ Not supported |
-| Other models | ❌ Not used | ✅ May be supported |
+- **`gemini-3.1-flash-image-*`**: `thinking_level` supports `"minimal"` and `"high"`; `thinking_budget` is not used.
+- **Other `gemini-3-*` models**: `thinking_level` supports `"low"` and `"high"`; `thinking_budget` is not used.
+- **`gemini-2.5-*` models**: `thinking_level` is not used; `thinking_budget` supports `0-32768`.
+- **`gemini-2.5-flash-image-*`**: neither `thinking_level` nor `thinking_budget` is supported.
+- **Other models**: `thinking_level` is not used; `thinking_budget` may be supported depending on the model.
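
The docs changes above say the pipeline maps an unsupported `thinking_level` to the closest supported one, but the pipeline code itself is not part of this file's diff. The following is a minimal Python sketch of how such a fallback could work, not the commit's actual implementation. The function names `supported_thinking_levels` and `normalize_thinking_level`, the prefix matching on model IDs, and the ordering `minimal < low < medium < high` (including the `medium` entry) are all assumptions made for illustration; only the per-model level sets come from the documentation text.

```python
from typing import Optional, Tuple

# Assumed ordering of levels from shallowest to deepest reasoning.
# "medium" is included only to illustrate the fallback; it is hypothetical here.
_LEVEL_ORDER = ("minimal", "low", "medium", "high")


def supported_thinking_levels(model_id: str) -> Tuple[str, ...]:
    """Return the thinking levels the docs list for a model (empty if none)."""
    model = model_id.lower()
    if model.startswith("gemini-3.1-flash-image"):
        return ("minimal", "high")
    if model.startswith("gemini-3"):
        return ("low", "high")
    # Non-Gemini 3 models: thinking_level is ignored per the docs.
    return ()


def normalize_thinking_level(model_id: str, requested: Optional[str]) -> Optional[str]:
    """Map a requested level to the closest supported one, or None to omit it."""
    levels = supported_thinking_levels(model_id)
    value = (requested or "").strip().lower()
    if not levels or not value:
        return None  # use the model default
    if value in levels:
        return value
    if value not in _LEVEL_ORDER:
        return None  # unknown value: fall back to the model default
    wanted = _LEVEL_ORDER.index(value)
    # Pick the supported level with the smallest distance in the assumed ordering.
    # On a tie, min() keeps the first (shallower) level.
    return min(levels, key=lambda lvl: abs(_LEVEL_ORDER.index(lvl) - wanted))
```

Under these assumptions, `normalize_thinking_level("gemini-3.1-flash-image-preview", "low")` yields `"minimal"`, and `"minimal"` on another Gemini 3 model yields `"low"`, so neither request would send a value the model rejects.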
