Fix Gemini official tool history thought signatures#581

Open
FIELA wants to merge 1 commit into
cita-777:mainfrom
FIELA:fix/gemini-official-tool-history-native
Conversation

@FIELA FIELA commented Jun 10, 2026

Summary

Fixes Gemini official chat-completions tool-history failures by routing OpenAI chat requests with assistant tool_calls through the native Gemini generateContent bridge.

Closes #580.

What changed

  • Detect Gemini official /v1/chat/completions requests that replay assistant tool-call history.
  • Build native Gemini generateContent / streamGenerateContent requests for that case instead of sending the history to /v1beta/openai/chat/completions.
  • Add safe dummy thoughtSignature values for Gemini 3 function-call parts when downstream OpenAI clients cannot provide Gemini provider metadata.
  • Convert native Gemini functionCall responses back into OpenAI-compatible tool_calls for non-stream and stream chat responses.
  • Extend runtime executor typing with gemini-native.
  • Add regression coverage for request routing, thought-signature injection, and response normalization.
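
The routing decision in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `ChatMessage` and `ToolCall` shapes and the helper names `hasAssistantToolCallHistory` and `chooseUpstreamRoute` are hypothetical, and the real check in `upstreamRequestBuilder.ts` may consider more conditions (for example, whether the site is a Gemini official upstream).

```typescript
// Hypothetical minimal shapes for OpenAI chat messages; the PR's real types may differ.
interface ToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content?: string | null;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
}

// True when any assistant message replays earlier tool calls.
function hasAssistantToolCallHistory(messages: ChatMessage[]): boolean {
  return messages.some(
    (m) => m.role === 'assistant' && Array.isArray(m.tool_calls) && m.tool_calls.length > 0,
  );
}

// Picks the upstream route: native generateContent for tool history,
// the OpenAI-compatible endpoint otherwise.
function chooseUpstreamRoute(messages: ChatMessage[]): 'gemini-native' | 'openai-compat' {
  return hasAssistantToolCallHistory(messages) ? 'gemini-native' : 'openai-compat';
}
```

Requests without replayed tool calls keep using the OpenAI-compatible endpoint under this sketch, so only the failing case changes route.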

Why

Gemini 3.x official upstreams can reject OpenAI-compatible chat requests with prior assistant tool_calls:

Function call is missing a thought_signature in functionCall parts.

PR #135 fixed the Gemini CLI/native conversion path, but the Gemini official OpenAI-compatible chat route could still hit /v1beta/openai/chat/completions with unsigned tool-call history. This change keeps the downstream OpenAI API stable while using Gemini's native format where the required tool-history metadata can be represented.
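
To make the missing metadata concrete, here is a hedged sketch of turning OpenAI `tool_calls` into native Gemini `functionCall` parts with a signature attached. The helper names, the model-name check, and the placeholder string are all assumptions for illustration; the actual dummy value and detection logic used in `requestBridge.ts` are not shown in this thread.

```typescript
// Placeholder only: NOT the value the PR injects (that value is not shown here).
const PLACEHOLDER_THOUGHT_SIGNATURE = 'dummy-signature-placeholder';

interface OpenAiToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface GeminiFunctionCallPart {
  functionCall: { name: string; args: Record<string, unknown> };
  thoughtSignature?: string;
}

// Hypothetical check: treats any "gemini-3", "gemini-3.x" or "gemini-3-..." name as Gemini 3.
function isGemini3Model(model: string): boolean {
  return /^gemini-3([.-]|$)/.test(model.toLowerCase().replace(/^models\//, ''));
}

// OpenAI sends arguments as a JSON string; Gemini expects an object.
function parseArgs(raw: string): Record<string, unknown> {
  try {
    const parsed: unknown = JSON.parse(raw);
    return parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)
      ? (parsed as Record<string, unknown>)
      : {};
  } catch {
    return {}; // malformed arguments fall back to an empty object
  }
}

// Simplification: signs every part whenever the model looks like Gemini 3.
function toFunctionCallParts(model: string, toolCalls: OpenAiToolCall[]): GeminiFunctionCallPart[] {
  const needsSignature = isGemini3Model(model);
  return toolCalls.map((call) => {
    const part: GeminiFunctionCallPart = {
      functionCall: { name: call.function.name, args: parseArgs(call.function.arguments) },
    };
    if (needsSignature) part.thoughtSignature = PLACEHOLDER_THOUGHT_SIGNATURE;
    return part;
  });
}
```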

Verification

  • npx vitest run --root . src/server/services/upstreamRequestBuilder.test.ts src/server/transformers/gemini/generate-content/requestBridge.thoughtSignature.test.ts src/server/transformers/shared/normalized.test.ts
  • npx vitest run --root . src/server/transformers/gemini/generate-content/index.test.ts src/server/transformers/gemini/generate-content/streamBridge.test.ts src/server/transformers/openai/chat/streamBridge.test.ts src/server/routes/proxy/architecture-boundaries.test.ts
  • npm run typecheck
  • npm test

Note: npm test needs permission to bind local 127.0.0.1 test servers; under a restricted sandbox it fails with listen EPERM, but it passes when run with normal local permissions.

Summary by CodeRabbit

  • New Features

    • Added support for Gemini-native runtime execution path.
    • Enabled tool call handling and format conversion for native Gemini requests.
    • Added streaming and non-streaming response bridging between Gemini and OpenAI formats.
  • Tests

    • Added comprehensive test coverage for Gemini-native request transformation and response normalization.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Walkthrough

This PR enables a new 'gemini-native' executor that routes Gemini requests with tool-call history through the native generateContent/streamGenerateContent API. The implementation spans type definitions, upstream request routing, request transformation with thought signature injection, response normalization, and streaming/response bridging in the chat surface.

Changes

Gemini-native executor and tool-call flow

| Layer / File(s) | Summary |
| --- | --- |
| Executor type contracts: src/server/proxy-core/executors/types.ts, src/server/proxy-core/orchestration/endpointFlow.ts, src/server/proxy-core/providers/types.ts | The executor type union is extended across runtime descriptor types to include 'gemini-native'. |
| Upstream request building for Gemini-native: src/server/services/upstreamRequestBuilder.ts, src/server/services/upstreamRequestBuilder.test.ts | Detects OpenAI assistant tool calls and constructs Gemini-native generateContent/streamGenerateContent requests with proper endpoint paths, streaming headers, and gemini-native executor tagging. Test validates the generated request structure and signed function calls. |
| Gemini request transformation with thoughtSignature: src/server/transformers/gemini/generate-content/requestBridge.ts, src/server/transformers/gemini/generate-content/requestBridge.thoughtSignature.test.ts | A helper detects Gemini 3.x models and injects dummy thoughtSignature into function calls. The conditional logic triggers signature injection when thinking is enabled or when the model requires it, for safe models. Test verifies injection without explicit thinking config. |
| Gemini tool-call response normalization: src/server/transformers/shared/chatFormatsCore.ts, src/server/transformers/shared/normalized.test.ts | Extracts Gemini functionCall parts into normalized tool calls with stable ids and serialized arguments. Sets content to empty and finish reason to 'tool_calls' for tool-bearing responses. Test validates full normalization and OpenAI serialization. |
| Chat surface Gemini-native streaming and response bridging: src/server/proxy-core/surfaces/chatSurface.ts | Detects Gemini-native upstream paths and applies Gemini-to-OpenAI bridging. For streamed responses, reads full upstream text, converts to OpenAI SSE lines with [DONE] marker, and extracts usage. For non-streamed responses, uses Gemini-native-aware final response construction. |
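
The response normalization summarized above might look roughly like this. It is a sketch under stated assumptions: the type shapes are trimmed to the fields used, `toOpenAiChoice` is a hypothetical name, and the `call_<index>` id scheme is illustrative rather than the stable-id scheme in `chatFormatsCore.ts`.

```typescript
// Trimmed, hypothetical shapes for a native Gemini generateContent response.
interface GeminiPart {
  text?: string;
  functionCall?: { name: string; args?: Record<string, unknown> };
}

interface GeminiResponse {
  candidates?: Array<{ content?: { parts?: GeminiPart[] }; finishReason?: string }>;
}

interface OpenAiToolCallOut {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

interface OpenAiChoice {
  message: { role: 'assistant'; content: string; tool_calls?: OpenAiToolCallOut[] };
  finish_reason: 'tool_calls' | 'stop';
}

// Converts the first candidate into an OpenAI-style choice.
function toOpenAiChoice(response: GeminiResponse): OpenAiChoice {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  const toolCalls: OpenAiToolCallOut[] = [];
  let text = '';
  for (const part of parts) {
    if (part.functionCall) {
      toolCalls.push({
        id: `call_${toolCalls.length}`, // illustrative id scheme, deterministic per position
        type: 'function',
        function: {
          name: part.functionCall.name,
          arguments: JSON.stringify(part.functionCall.args ?? {}), // OpenAI expects a JSON string
        },
      });
    } else if (typeof part.text === 'string') {
      text += part.text;
    }
  }
  if (toolCalls.length > 0) {
    // Tool-bearing responses: empty content and finish reason 'tool_calls'.
    return { message: { role: 'assistant', content: '', tool_calls: toolCalls }, finish_reason: 'tool_calls' };
  }
  return { message: { role: 'assistant', content: text }, finish_reason: 'stop' };
}
```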

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • cita-777/metapi#135: Both PRs update OpenAI→Gemini conversion for functionCall parts to include thoughtSignature injection, with similar approach in the requestBridge.ts generate-content flow.
  • cita-777/metapi#147: Both PRs extend the proxy runtime executor plumbing; the retrieved PR adds the infrastructure while this PR updates the executor union to include 'gemini-native' and routes through it.

Suggested labels

area: server, size: M

Poem

🐰 A native Gemini path, swift and true,
Tool calls bridge from OpenAI's crew,
Thought signatures signed, responses blessed,
The proxy now serves Gemini's best! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding Gemini native tool history thought signature support to handle tool-call routing and response transformation. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


github-actions Bot added the area: server (Server-side API and backend changes) and size: M (200 to 499 lines changed) labels Jun 10, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 86e7251a1a

const configuredGeminiRequest = applyConfiguredPayloadRules(geminiRequest);
const action = input.stream ? 'streamGenerateContent' : 'generateContent';
return {
path: resolveGeminiNativeEndpointPath(input.stream),

P1: Preserve Gemini native authentication when bypassing compat

When this branch switches official Gemini chat requests with assistant tool-call history to the native generateContent path, it only returns a relative /v1beta/models/... path with the OpenAI-compat Authorization: Bearer ... header. The existing Gemini native surface builds these calls through the Gemini URL resolver with the API key in the native query string, so default Gemini API-key sites will authenticate normal OpenAI-compat chat calls but get 401/403 for the new tool-history path; the same relative path is also appended under preserved /v1beta/openai bases. Build the native Gemini URL/auth consistently before dispatching this fallback.
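
One way to read this suggestion, as a sketch only: resolve an absolute native URL from the configured base (dropping a preserved `/v1beta/openai` suffix) and carry the API key in the query string instead of reusing the `Authorization: Bearer` header. `buildGeminiNativeTarget` and its return shape are hypothetical and do not represent the repository's existing Gemini URL resolver; the Gemini API also accepts the key in an `x-goog-api-key` header, which would be an alternative to the query parameter.

```typescript
interface NativeRequestTarget {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds a native Gemini URL with query-string authentication.
function buildGeminiNativeTarget(
  baseUrl: string,
  model: string,
  apiKey: string,
  stream: boolean,
): NativeRequestTarget {
  // Strip trailing slashes, then a preserved OpenAI-compat or bare version suffix.
  const root = baseUrl
    .replace(/\/+$/, '')
    .replace(/\/v1beta\/openai$/, '')
    .replace(/\/v1beta$/, '');
  const action = stream ? 'streamGenerateContent' : 'generateContent';
  const query: string[] = [];
  if (stream) query.push('alt=sse'); // request SSE framing for streamed responses
  query.push(`key=${encodeURIComponent(apiKey)}`);
  return {
    url: `${root}/v1beta/models/${encodeURIComponent(model)}:${action}?${query.join('&')}`,
    // No Authorization header: the key travels in the query string in this sketch.
    headers: { 'Content-Type': 'application/json' },
  };
}
```

With this shape, a base of `https://generativelanguage.googleapis.com/v1beta/openai` and a bare host both resolve to the same native path, which addresses the case where the relative path would otherwise be appended under the compat base.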

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
src/server/proxy-core/surfaces/chatSurface.ts (2)

127-130: 💤 Low value

Unused streamContext creation.

Lines 127-130 create and populate a streamContext, but it's never used afterward. The buildSyntheticChunks call on line 134 only uses normalizedFinal, which already contains id, model, and created.

♻️ Remove unused code
   const geminiFinal = geminiGenerateContentTransformer.outbound.serializeAggregateResponse(aggregate);
   const normalizedFinal = openAiChatTransformer.transformFinalResponse(geminiFinal, modelName, rawText);
-  const streamContext = openAiChatTransformer.createStreamContext(modelName);
-  streamContext.id = normalizedFinal.id;
-  streamContext.model = normalizedFinal.model;
-  streamContext.created = normalizedFinal.created;
   return {
     finalPayload: geminiFinal,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/server/proxy-core/surfaces/chatSurface.ts` around lines 127 - 130, The
code creates and populates a streamContext via
openAiChatTransformer.createStreamContext (assigned to streamContext and setting
id, model, created) but never uses it; either remove the unused streamContext
creation/update or actually pass it into the downstream logic (e.g. into
buildSyntheticChunks) so it’s consumed. Find the streamContext creation near
where buildSyntheticChunks is called and either delete the three lines that set
streamContext (streamContext = openAiChatTransformer.createStreamContext(...);
streamContext.id = ...; streamContext.model = ...; streamContext.created = ...)
or modify the consumer (buildSyntheticChunks) to accept and use streamContext
instead of relying solely on normalizedFinal.

688-717: ⚖️ Poor tradeoff

Gemini native streaming buffers entire response.

Line 689 reads the complete upstream response into memory before converting and sending it downstream. This breaks true streaming behavior and can cause:

  1. Increased latency: Users wait for the entire upstream response before seeing any output.
  2. Memory pressure: Large responses are fully buffered.

This is a tradeoff for format conversion correctness (Gemini functionCall → OpenAI tool_calls), but consider whether incremental streaming with progressive conversion is feasible for future optimization. The normal streaming path (lines 878-943) does true incremental streaming; this special case deviates from that pattern.
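
As a rough sketch of what progressive conversion could look like, the converter below accepts raw upstream text in arbitrary chunks and emits OpenAI-style SSE frames as soon as each complete Gemini `data:` line arrives. Everything here is an assumption for illustration: `createGeminiSseConverter`, the trimmed event shape, and the `call_<n>` ids are hypothetical, and usage extraction and error handling are omitted.

```typescript
// Trimmed, hypothetical shape of one Gemini streamGenerateContent SSE event.
interface GeminiStreamEvent {
  candidates?: Array<{
    content?: { parts?: Array<{ text?: string; functionCall?: { name: string; args?: unknown } }> };
    finishReason?: string;
  }>;
}

function createGeminiSseConverter(id: string, model: string, created: number) {
  let buffer = ''; // holds an incomplete trailing line between chunks
  let toolCount = 0;

  // Serializes one OpenAI chat.completion.chunk as an SSE frame.
  const frame = (delta: Record<string, unknown>, finishReason: string | null): string =>
    `data: ${JSON.stringify({
      id,
      object: 'chat.completion.chunk',
      created,
      model,
      choices: [{ index: 0, delta, finish_reason: finishReason }],
    })}\n\n`;

  const convert = (event: GeminiStreamEvent): string[] => {
    const out: string[] = [];
    const candidate = event.candidates?.[0];
    for (const part of candidate?.content?.parts ?? []) {
      if (part.functionCall) {
        const call = {
          index: toolCount,
          id: `call_${toolCount}`, // illustrative id scheme
          type: 'function',
          function: { name: part.functionCall.name, arguments: JSON.stringify(part.functionCall.args ?? {}) },
        };
        out.push(frame({ tool_calls: [call] }, null));
        toolCount += 1;
      } else if (typeof part.text === 'string') {
        out.push(frame({ content: part.text }, null));
      }
    }
    if (candidate?.finishReason) out.push(frame({}, toolCount > 0 ? 'tool_calls' : 'stop'));
    return out;
  };

  return {
    // Feed raw upstream text; returns the frames that became complete.
    push(chunk: string): string[] {
      buffer += chunk;
      const out: string[] = [];
      let newline = buffer.indexOf('\n');
      while (newline >= 0) {
        const line = buffer.slice(0, newline).trim();
        buffer = buffer.slice(newline + 1);
        if (line.startsWith('data:')) {
          const payload = line.slice(5).trim();
          if (payload) out.push(...convert(JSON.parse(payload) as GeminiStreamEvent));
        }
        newline = buffer.indexOf('\n');
      }
      return out;
    },
    // Call once after the upstream stream closes.
    finish(): string[] {
      return ['data: [DONE]\n\n'];
    },
  };
}
```

A caller would write each array returned by `push` downstream immediately and write the result of `finish` only after the upstream closes, so nothing beyond a partial line is ever buffered.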

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/server/proxy-core/surfaces/chatSurface.ts` around lines 688 - 717, The
current Gemini native runtime path buffers the entire upstream response via
readRuntimeResponseText and then converts/sends it
(buildOpenAiStreamLinesFromGeminiNativeSse), causing latency and memory
pressure; change this to incremental streaming by consuming the upstream stream
chunk-by-chunk, passing each chunk into a streaming converter (or refactor
buildOpenAiStreamLinesFromGeminiNativeSse into a streaming variant) and calling
writeLines for each converted chunk as it arrives, while incrementally updating
upstreamUsagePresent and parsedUsage (using hasProxyUsagePayload/parseProxyUsage
on each partial payload), and only calling streamResponse.end,
recordStreamSuccess, finalizeDebugSuccess and bindSurfaceStickyChannel after the
upstream stream closes; keep the same success/error handling and debug payload
logic but produce and merge partial usage and debug chunks rather than buffering
rawText.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c6f4aeef-da32-4ebd-8788-c35c5cbac22b

📥 Commits

Reviewing files that changed from the base of the PR and between e72d19e and 86e7251.

📒 Files selected for processing (10)
  • src/server/proxy-core/executors/types.ts
  • src/server/proxy-core/orchestration/endpointFlow.ts
  • src/server/proxy-core/providers/types.ts
  • src/server/proxy-core/surfaces/chatSurface.ts
  • src/server/services/upstreamRequestBuilder.test.ts
  • src/server/services/upstreamRequestBuilder.ts
  • src/server/transformers/gemini/generate-content/requestBridge.thoughtSignature.test.ts
  • src/server/transformers/gemini/generate-content/requestBridge.ts
  • src/server/transformers/shared/chatFormatsCore.ts
  • src/server/transformers/shared/normalized.test.ts

Development

Successfully merging this pull request may close these issues.

Gemini official chat rejects tool-call history without thought_signature
