Fix Gemini official tool history thought signatures#581
Conversation
📝 WalkthroughWalkthroughThis PR enables a new ChangesGemini-native executor and tool-call flow
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86e7251a1a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| const configuredGeminiRequest = applyConfiguredPayloadRules(geminiRequest); | ||
| const action = input.stream ? 'streamGenerateContent' : 'generateContent'; | ||
| return { | ||
| path: resolveGeminiNativeEndpointPath(input.stream), |
There was a problem hiding this comment.
Preserve Gemini native authentication when bypassing compat
When this branch switches official Gemini chat requests with assistant tool-call history to the native generateContent path, it only returns a relative /v1beta/models/... path with the OpenAI-compat Authorization: Bearer ... header. The existing Gemini native surface builds these calls through the Gemini URL resolver with the API key in the native query string, so default Gemini API-key sites will authenticate normal OpenAI-compat chat calls but get 401/403 for the new tool-history path; the same relative path is also appended under preserved /v1beta/openai bases. Build the native Gemini URL/auth consistently before dispatching this fallback.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/server/proxy-core/surfaces/chatSurface.ts (2)
127-130: 💤 Low valueUnused
streamContextcreation.Lines 127-130 create and populate a
streamContext, but it's never used afterward. ThebuildSyntheticChunkscall on line 134 only usesnormalizedFinal, which already containsid,model, andcreated.♻️ Remove unused code
const geminiFinal = geminiGenerateContentTransformer.outbound.serializeAggregateResponse(aggregate); const normalizedFinal = openAiChatTransformer.transformFinalResponse(geminiFinal, modelName, rawText); - const streamContext = openAiChatTransformer.createStreamContext(modelName); - streamContext.id = normalizedFinal.id; - streamContext.model = normalizedFinal.model; - streamContext.created = normalizedFinal.created; return { finalPayload: geminiFinal,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/server/proxy-core/surfaces/chatSurface.ts` around lines 127 - 130, The code creates and populates a streamContext via openAiChatTransformer.createStreamContext (assigned to streamContext and setting id, model, created) but never uses it; either remove the unused streamContext creation/update or actually pass it into the downstream logic (e.g. into buildSyntheticChunks) so it’s consumed. Find the streamContext creation near where buildSyntheticChunks is called and either delete the three lines that set streamContext (streamContext = openAiChatTransformer.createStreamContext(...); streamContext.id = ...; streamContext.model = ...; streamContext.created = ...) or modify the consumer (buildSyntheticChunks) to accept and use streamContext instead of relying solely on normalizedFinal.
688-717: ⚖️ Poor tradeoffGemini native streaming buffers entire response.
Line 689 reads the complete upstream response into memory before converting and sending it downstream. This breaks true streaming behavior and can cause:
- Increased latency: Users wait for the entire upstream response before seeing any output.
- Memory pressure: Large responses are fully buffered.
This is a tradeoff for format conversion correctness (Gemini
functionCall→ OpenAItool_calls), but consider whether incremental streaming with progressive conversion is feasible for future optimization. The normal streaming path (lines 878-943) does true incremental streaming; this special case deviates from that pattern.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/server/proxy-core/surfaces/chatSurface.ts` around lines 688 - 717, The current Gemini native runtime path buffers the entire upstream response via readRuntimeResponseText and then converts/sends it (buildOpenAiStreamLinesFromGeminiNativeSse), causing latency and memory pressure; change this to incremental streaming by consuming the upstream stream chunk-by-chunk, passing each chunk into a streaming converter (or refactor buildOpenAiStreamLinesFromGeminiNativeSse into a streaming variant) and calling writeLines for each converted chunk as it arrives, while incrementally updating upstreamUsagePresent and parsedUsage (using hasProxyUsagePayload/parseProxyUsage on each partial payload), and only calling streamResponse.end, recordStreamSuccess, finalizeDebugSuccess and bindSurfaceStickyChannel after the upstream stream closes; keep the same success/error handling and debug payload logic but produce and merge partial usage and debug chunks rather than buffering rawText.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/server/proxy-core/surfaces/chatSurface.ts`:
- Around line 127-130: The code creates and populates a streamContext via
openAiChatTransformer.createStreamContext (assigned to streamContext and setting
id, model, created) but never uses it; either remove the unused streamContext
creation/update or actually pass it into the downstream logic (e.g. into
buildSyntheticChunks) so it’s consumed. Find the streamContext creation near
where buildSyntheticChunks is called and either delete the three lines that set
streamContext (streamContext = openAiChatTransformer.createStreamContext(...);
streamContext.id = ...; streamContext.model = ...; streamContext.created = ...)
or modify the consumer (buildSyntheticChunks) to accept and use streamContext
instead of relying solely on normalizedFinal.
- Around line 688-717: The current Gemini native runtime path buffers the entire
upstream response via readRuntimeResponseText and then converts/sends it
(buildOpenAiStreamLinesFromGeminiNativeSse), causing latency and memory
pressure; change this to incremental streaming by consuming the upstream stream
chunk-by-chunk, passing each chunk into a streaming converter (or refactor
buildOpenAiStreamLinesFromGeminiNativeSse into a streaming variant) and calling
writeLines for each converted chunk as it arrives, while incrementally updating
upstreamUsagePresent and parsedUsage (using hasProxyUsagePayload/parseProxyUsage
on each partial payload), and only calling streamResponse.end,
recordStreamSuccess, finalizeDebugSuccess and bindSurfaceStickyChannel after the
upstream stream closes; keep the same success/error handling and debug payload
logic but produce and merge partial usage and debug chunks rather than buffering
rawText.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c6f4aeef-da32-4ebd-8788-c35c5cbac22b
📒 Files selected for processing (10)
src/server/proxy-core/executors/types.tssrc/server/proxy-core/orchestration/endpointFlow.tssrc/server/proxy-core/providers/types.tssrc/server/proxy-core/surfaces/chatSurface.tssrc/server/services/upstreamRequestBuilder.test.tssrc/server/services/upstreamRequestBuilder.tssrc/server/transformers/gemini/generate-content/requestBridge.thoughtSignature.test.tssrc/server/transformers/gemini/generate-content/requestBridge.tssrc/server/transformers/shared/chatFormatsCore.tssrc/server/transformers/shared/normalized.test.ts
Summary
Fixes Gemini official chat-completions tool-history failures by routing OpenAI chat requests with assistant
tool_callsthrough the native GeminigenerateContentbridge.Closes #580.
What changed
/v1/chat/completionsrequests that replay assistant tool-call history.generateContent/streamGenerateContentrequests for that case instead of sending the history to/v1beta/openai/chat/completions.thoughtSignaturevalues for Gemini 3 function-call parts when downstream OpenAI clients cannot provide Gemini provider metadata.functionCallresponses back into OpenAI-compatibletool_callsfor non-stream and stream chat responses.gemini-native.Why
Gemini 3.x official upstreams can reject OpenAI-compatible chat requests with prior assistant
tool_calls:PR #135 fixed the Gemini CLI/native conversion path, but the Gemini official OpenAI-compatible chat route could still hit
/v1beta/openai/chat/completionswith unsigned tool-call history. This change keeps the downstream OpenAI API stable while using Gemini's native format where the required tool-history metadata can be represented.Verification
npx vitest run --root . src/server/services/upstreamRequestBuilder.test.ts src/server/transformers/gemini/generate-content/requestBridge.thoughtSignature.test.ts src/server/transformers/shared/normalized.test.tsnpx vitest run --root . src/server/transformers/gemini/generate-content/index.test.ts src/server/transformers/gemini/generate-content/streamBridge.test.ts src/server/transformers/openai/chat/streamBridge.test.ts src/server/routes/proxy/architecture-boundaries.test.tsnpm run typechecknpm testNote:
npm testneeds permission to bind local127.0.0.1test servers; under a restricted sandbox it fails withlisten EPERM, but it passes when run with normal local permissions.Summary by CodeRabbit
New Features
Tests