Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 16 additions & 8 deletions docs/docs/providers/openai_responses_limitations.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -262,14 +262,6 @@ OpenAI provides a [prompt caching](https://platform.openai.com/docs/guides/promp

---

### Parallel Tool Calls

**Status:** Rumored Issue

There are reports that `parallel_tool_calls` may not work correctly. This needs verification and a ticket should be opened if confirmed.

---

## Resolved Issues

The following limitations have been addressed in recent releases:
Expand Down Expand Up @@ -297,3 +289,19 @@ The `require_approval` parameter for MCP tools in the Responses API now works co
**Fixed in:** [#3003](https://github.com/llamastack/llama-stack/pull/3003) (Agent API), [#3602](https://github.com/llamastack/llama-stack/pull/3602) (Responses API)

MCP tools now correctly handle array-type arguments in both the Agent API and Responses API.

---

### Parallel tool calls

**Status:** ✅ Resolved

The [`parallel_tool_calls` parameter](https://platform.openai.com/docs/api-reference/responses/create#responses_create-parallel_tool_calls) controls turn-based function calling workflows, _not_ parallelism or concurrency. See the [related function calling documentation](https://platform.openai.com/docs/guides/function-calling#parallel-function-calling).

If `parallel_tool_calls=false`, the intended behavior is that multiple generated functional calls will be executed once per turn until done; the client is responsible for executing them one at a time and returning the result, in the expected format, in order to proceed.

For example, with a custom tool generation request with a `get_weather` function definition, the input of "What is the weather in Tokyo and New York?" will, by default, cause two function calls to be generated - a `get_weather` function call definition for each of `Paris` and `New York`. With `parallel_tool_calls = false`, however, only one of these will be generated initially; the client is then responsible for executing that function call and appending the results to the message history, after which the conversation will proceed with the model-generated second function tool call definition.

| parallel_tool_calls=true | parallel_tool_calls=false |
|------|-------|
| <img width="1134" height="1330" alt="Image" src="https://github.com/user-attachments/assets/68b5d6f0-0407-4926-9634-228512aa420d" /> | <img width="1236" height="1868" alt="Image" src="https://github.com/user-attachments/assets/42a1243c-4268-40d0-abcf-ad1bf9abc9c0" /> |
Loading