
gptel-openai: Merge tool calls from all choices in non-streaming responses #1284

Open
phrb wants to merge 1 commit into karthink:master from phrb:fix/parse-response-multi-choices

Conversation


@phrb phrb commented Mar 11, 2026

Problem

gptel--parse-response for gptel-openai hardcodes (map-nested-elt response '(:choices 0)), only reading the first entry in the choices array. This silently drops tool calls when an OpenAI-compatible API returns them in separate choices entries.

Affected setup

GitHub Copilot backend (gptel--gh) proxying Claude models (e.g. claude-opus-4.6). Only non-streaming requests are affected: sub-agents (researcher, introspector, executor via gptel-agent) always use stream: false, and they are completely broken by this.

What happens

The Copilot API proxy reformats Claude's non-streaming responses, placing text content and each tool call in separate choices entries:

Standard OpenAI format (what gptel expects):

{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "message": {
        "content": "I'll search the source code...",
        "tool_calls": [
          {"function": {"name": "Glob", "arguments": "..."}, "id": "..."},
          {"function": {"name": "Grep", "arguments": "..."}, "id": "..."}
        ]
      }
    }
  ]
}

Copilot format (Claude via proxy, stream: false):

{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "message": {
        "content": "I'll search the source code...",
        "role": "assistant"
      }
    },
    {
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "tool_calls": [
          {"function": {"name": "Glob", "arguments": "..."}, "id": "...", "type": "function"}
        ]
      }
    },
    {
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "tool_calls": [
          {"function": {"name": "Grep", "arguments": "..."}, "id": "...", "type": "function"}
        ]
      }
    }
  ]
}

Since the parser only reads choices[0], it finds the text content but no tool_calls. As a result, tool-use is never set in info, so the FSM transitions to DONE instead of TOOL and the tool call loop never executes.

For sub-agents this means every researcher/introspector/executor returns only its preamble text (e.g. "I'll systematically search the gptel source code...") without ever running any tools.
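The failing lookup can be sketched as follows (an illustration of the access pattern, assuming `response' is the parsed response plist; variable names are mine, not the exact gptel source):

(let* ((choice0 (map-nested-elt response '(:choices 0)))
       (message (plist-get choice0 :message)))
  ;; With the Copilot-shaped response above, this returns nil:
  ;; the tool calls live in choices[1..n], which are never read.
  (plist-get message :tool_calls))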

Streaming is not affected

The streaming parser (gptel-curl--parse-stream) works correctly because:

  1. It uses a different accumulation strategy via choices[0].delta.tool_calls with index fields
  2. Copilot's streaming format already keeps everything in choices[0]
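For illustration, streaming tool calls arrive as SSE delta chunks roughly like the following (ids and arguments abridged and illustrative; exact chunk boundaries vary by provider). The index field inside choices[0].delta.tool_calls is what the streaming parser keys its accumulation on:

data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_1","function":{"name":"Glob","arguments":""}}]}}]}

data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"pattern\":\"*.el\"}"}}]}}]}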

Fix

Iterate over all choices entries, collecting tool_calls and content from whichever entry has them, then merge into a single message structure before the existing processing code runs.

When there is only one choice (standard OpenAI format), the merge block is guarded by (when (> (length choices) 1) ...) and skipped entirely, so there is no overhead for the common case.
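The merge can be sketched as follows (a minimal illustration, assuming `response' is the parsed response plist with a :choices vector; names are illustrative, not the exact patch):

(let* ((choices (plist-get response :choices))
       (message (map-nested-elt choices '(0 :message))))
  (when (> (length choices) 1)
    (let ((tool-calls
           ;; Collect tool calls from every choices entry
           (cl-loop for choice across choices
                    vconcat (map-nested-elt choice '(:message :tool_calls)))))
      (when (> (length tool-calls) 0)
        ;; Graft the merged tool calls onto the first message, so the
        ;; existing single-choice processing code sees all of them
        (plist-put message :tool_calls tool-calls))))
  message)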

Testing

Tested with GitHub Copilot backend + claude-opus-4.6:

  • Before fix: All sub-agent tool call loops fail silently (3/3 agents return preamble only)
  • After fix: All sub-agent tool call loops complete successfully (3/3 agents execute tools and return substantive results across multiple TOOL→WAIT→TYPE cycles)
  • Streaming requests: Unaffected, work as before



phrb commented Mar 11, 2026

This fixes the sub-agent failure reported in karthink/gptel-agent#66 for users on the GitHub Copilot backend with Claude models.

@karthink karthink force-pushed the fix/parse-response-multi-choices branch from 583b1d8 to c1a59ac on March 13, 2026 01:49

karthink commented Mar 13, 2026

I'm assuming that no backend returns a "content" field outside of choices[0].message. Why not just:

;; OpenAI returns either non-blank text content or a tool call, not both.
;; However, OpenAI-compatible APIs like llama.cpp can include both (#819), so
;; we check for both tool calls and responses independently.
;; Some OpenAI-compatible APIs return tool calls in separate `choices'
;; entries instead of in choices[0].message.tool_calls, so we merge them.
(when-let* ((tool-calls
             (cl-loop
              for choice across choices
              vconcat (map-nested-elt choice '(:message :tool_calls))))
            ((not (eq tool-calls :null))))
  (when (> (length choices) 1)
    (plist-put message :tool_calls tool-calls))
  (gptel--inject-prompt        ; First add the tool call to the prompts list
   (plist-get info :backend) (plist-get info :data) message)
  ;; ... (rest of the existing tool-call handling continues here)
  )


phrb commented Mar 18, 2026

@karthink Do you want your suggestion to be added to this branch, or did you already implement it elsewhere?

@karthink (Owner) commented

I meant it as a more concise alternative to your implementation in this PR, as part of this PR. I also wanted to know if you see any problem with this implementation.

@phrb phrb force-pushed the fix/parse-response-multi-choices branch 2 times, most recently from 69dc410 to 9c3af0f on April 1, 2026 15:06
…onses

GitHub Copilot's API proxy (when forwarding Claude model responses)
returns non-streaming tool call responses with each tool call in a
separate `choices' entry, rather than combining them all under
choices[0].message.tool_calls as the standard OpenAI format does:

  Standard OpenAI format:
    choices[0].message.content = "I'll search..."
    choices[0].message.tool_calls = [{Glob}, {Grep}, {Grep}]

  Copilot format (Claude via proxy):
    choices[0].message = {content: "I'll search..."}
    choices[1].message = {tool_calls: [{Glob}]}
    choices[2].message = {tool_calls: [{Grep}]}
    choices[3].message = {tool_calls: [{Grep}]}

Since gptel--parse-response hardcoded (map-nested-elt response
'(:choices 0)), it only saw the text content in choices[0] and
missed all tool calls entirely.  This caused the FSM to transition
to DONE instead of TOOL, silently breaking any non-streaming tool
call loop -- most visibly gptel-agent sub-agents (researcher,
introspector, executor), which always use stream: false.

Fix: iterate over all choices entries, collecting tool_calls and
content from whichever entry has them, then merge into a single
message structure before the existing processing code runs.  When
there is only one choice (standard format), the merge is skipped
entirely, so there is no overhead for the common case.

Streaming responses are unaffected: the streaming parser
(gptel-curl--parse-stream) uses a different accumulation strategy
via choices[0].delta.tool_calls with index fields, and Copilot's
streaming format already puts everything in choices[0].

Ref: karthink#819 (related: content + tool_calls coexistence)
@phrb phrb force-pushed the fix/parse-response-multi-choices branch from 9c3af0f to 3c061ea on April 2, 2026 16:20
@carl-reverb commented

Thank you, @phrb for digging into this! I just tested your branch since I'm on the same API/model stack and was frustrated by subagents constantly popping up and failing. This works like a charm!


phrb commented Apr 3, 2026

I've been rebasing it onto the latest main. I haven't yet revisited it with karthink's suggestions, but I plan to do so.


karthink commented Apr 4, 2026

I can do it if you're busy, but I need to be sure that my version isn't missing something important.

