
chat: fix Responses API tool-chaining, context limits, and streaming correctness#1080

Merged
BenjaminMichaelis merged 4 commits into main from agents/ai-chat-review-guidelines-opus-gpt on May 9, 2026

Conversation

@BenjaminMichaelis
Member

Summary

Three rounds of review against the Azure OpenAI Responses API docs (using Claude Opus 4.6 and GPT-5.5 as parallel review subagents) identified and fixed several correctness bugs and robustness gaps in the AI chat implementation.


Critical fixes

C1 — Parallel streaming tool calls created forked conversation branches

Previously, ProcessStreamingUpdatesAsync fired a separate API continuation per tool call inline during the stream. With N parallel tool calls this created N independent conversation branches — each continuation only saw its own tool result, not the others.

Fix: Two-phase pattern — buffer all FunctionCallResponseItems during the stream, execute them sequentially after the stream completes, then make one continuation with all outputs. Matches the non-streaming behavior.
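The two-phase shape can be sketched as a runnable simulation; the tuple updates and `executeTool` delegate are hypothetical stand-ins for the SDK's streaming updates and `ExecuteFunctionCallAsync`, not the PR's actual code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Phase 1 buffers tool calls while the stream drains; phase 2 executes them
// and returns the outputs that ONE continuation request would carry.
static List<string> RunStreamingTurn(
    IEnumerable<(string Kind, string Payload)> stream,
    Func<string, string> executeTool)
{
    var bufferedCalls = new List<string>();
    foreach (var update in stream)
    {
        if (update.Kind == "tool_call")
            bufferedCalls.Add(update.Payload); // buffer; do NOT continue inline
        // "text" updates would be forwarded to the client here
    }

    // After the stream completes: execute sequentially, then build the single
    // continuation payload. The old code made one continuation per call, so
    // each continuation saw only its own result.
    return bufferedCalls.Select(executeTool).ToList();
}

var outputs = RunStreamingTurn(
    new[] { ("text", "hi"), ("tool_call", "a"), ("tool_call", "b") },
    arg => $"result({arg})");
Console.WriteLine(string.Join(", ", outputs)); // prints "result(a), result(b)"
```

With N parallel tool calls, all N results land in one conversation leg instead of N forked branches.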

C2 — Context length errors not caught in streaming path

ClientResultException from a context-window overflow propagates through await foreach, but C# prohibits yield return inside a try block that has a catch clause (CS1626).

Fix: Added RethrowContextLengthErrors helper that puts try/catch only around MoveNextAsync, with yield return outside — the only CS1626-safe pattern for exception remapping in async iterators.
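A minimal sketch of that shape, with generic stand-ins (`InvalidOperationException` in place of `ClientResultException`, `ApplicationException` in place of the domain exception):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// try/catch wraps ONLY MoveNextAsync; the yield stays outside it, which is
// the one arrangement CS1626 permits inside an async iterator.
static async IAsyncEnumerable<T> RethrowContextLengthErrors<T>(IAsyncEnumerable<T> source)
{
    await using var enumerator = source.GetAsyncEnumerator();
    while (true)
    {
        bool moved;
        try
        {
            moved = await enumerator.MoveNextAsync();
        }
        catch (InvalidOperationException ex)
        {
            // Remap to the domain exception (stand-in type here).
            throw new ApplicationException("context window exceeded", ex);
        }
        if (!moved) yield break;
        yield return enumerator.Current;
    }
}

static async IAsyncEnumerable<int> FaultyStream()
{
    yield return 1;
    await Task.Yield();
    throw new InvalidOperationException("context_length_exceeded");
}

try
{
    await foreach (var item in RethrowContextLengthErrors(FaultyStream()))
        Console.WriteLine(item); // prints 1 before the fault is remapped
}
catch (ApplicationException ex)
{
    Console.WriteLine(ex.Message); // prints "context window exceeded"
}
```

Items already produced flow through untouched; only the advance that faults gets remapped.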


High fixes

H1 — SSE error event lost on mid-stream exception

The second catch (Exception ex) block in StreamMessage was missing its try { await Response.WriteAsync(...) } catch {} body, so clients received no signal when the stream was interrupted.
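The restored pattern, as a runnable simulation — the delegates stand in for the stream body and Response.WriteAsync, which are assumptions, not the controller's code:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// The outer catch keeps the ORIGINAL exception; the inner try/catch makes the
// SSE error write best-effort so a dead connection can't mask that exception.
static async Task<string> HandleStream(Func<Task> streamBody, Func<string, Task> writeAsync)
{
    try
    {
        await streamBody();
        return "ok";
    }
    catch (Exception ex)
    {
        var original = ex.Message; // in the real controller: log ex here
        try
        {
            await writeAsync("event: error\ndata: {\"error\":\"stream interrupted\"}\n\n");
        }
        catch (Exception)
        {
            // Client may already be gone (IOException, OperationCanceledException,
            // ObjectDisposedException); swallow so the original error survives.
        }
        return original;
    }
}

var result = await HandleStream(
    streamBody: () => throw new InvalidOperationException("boom"),
    writeAsync: _ => throw new IOException("connection reset"));
Console.WriteLine(result); // prints "boom": the write failure did not mask it
```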

H2 — CloneOptionsWithPreviousResponseId copied only 3 of 14 fields

Tool-call continuation legs inherited Instructions, PreviousResponseId, and Tools — but not EndUserId, Temperature, TopP, TruncationMode, MaxOutputTokenCount, TextOptions, ParallelToolCallsEnabled, StoredOutputEnabled, ToolChoice, ServiceTier, or Metadata. Continuations were silently using defaults for all of those.

H3 — Context length detection relied on substring matching only

Fix: IsContextLengthError now parses the structured JSON error body via TryExtractErrorCode and checks the error.code field first (context_length_exceeded, token_limit_exceeded). Also handles HTTP 413. Falls back to message substring matching.
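A sketch of the structured-first detection. The real helper reads status and body from ClientResultException.GetRawResponse(); this stand-in takes them directly, and the error-code strings follow the PR text:

```csharp
#nullable enable
using System;
using System.Text.Json;

static string? TryExtractErrorCode(string body)
{
    try
    {
        using var doc = JsonDocument.Parse(body);
        if (doc.RootElement.TryGetProperty("error", out var error)
            && error.TryGetProperty("code", out var code)
            && code.ValueKind == JsonValueKind.String)
            return code.GetString();
    }
    catch (JsonException)
    {
        // Malformed body: caller falls back to substring matching.
    }
    return null;
}

static bool IsContextLengthError(int status, string body)
{
    if (status == 413) return true; // request entity too large
    var code = TryExtractErrorCode(body);
    if (code is "context_length_exceeded" or "token_limit_exceeded") return true;
    // Last resort: substring match on the raw message text.
    return body.Contains("context_length_exceeded", StringComparison.OrdinalIgnoreCase);
}

Console.WriteLine(IsContextLengthError(
    400, "{\"error\":{\"code\":\"context_length_exceeded\",\"message\":\"x\"}}"));
// prints True
```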


Medium fixes (from second review pass)

  • ParseToolArguments unguarded in non-streaming path — a JsonException on malformed model-generated arguments would bubble up as a 500. Now caught identically to the streaming path, returning an error FunctionCallOutputResponseItem so the model can recover.
  • strictModeEnabled: false for MCP tool registration — the MCP .NET SDK generates schemas that don't satisfy OpenAI strict-mode constraints (optional params not in required, no additionalProperties: false, nullable union types). A single non-conforming schema with strictModeEnabled: true causes a 400 that blocks all tools.
  • currentLegResponseId null guard — if the API never emits StreamingResponseCreatedUpdate, the continuation would fire with PreviousResponseId = null, causing a confusing 400. Now throws InvalidOperationException with a diagnostic message.
  • Log original exception in mid-stream context-limit handler — was logging a freshly constructed InvalidOperationException, discarding the original stack trace and PreviousResponseId.
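The guarded argument parse from the first bullet can be sketched like this; the tuple return is a stand-in for the real flow, which sends an error FunctionCallOutputResponseItem back to the model:

```csharp
#nullable enable
using System;
using System.Text.Json;

static (JsonElement? Args, string? Error) ParseToolArguments(string rawArguments)
{
    try
    {
        using var doc = JsonDocument.Parse(rawArguments);
        return (doc.RootElement.Clone(), null); // Clone detaches from the disposed doc
    }
    catch (JsonException ex)
    {
        // Malformed model-generated arguments: surface as a tool error the
        // model can recover from, instead of a 500 that aborts the turn.
        return (null, $"Invalid JSON arguments: {ex.Message}");
    }
}

var ok = ParseToolArguments("{\"query\":\"records\"}");
var bad = ParseToolArguments("{\"query\": ");
Console.WriteLine($"{ok.Error is null}, {bad.Error is not null}"); // prints "True, True"
```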

Bonus

  • EndUserId wired up: CreateResponseOptionsAsync now sets options.EndUserId (the SDK exposes it; the previous code discarded it). Enables Azure OpenAI abuse monitoring and Defender prompt-shield correlation.
  • LogMcpToolArgumentParseError logger message added.
  • ConversationContextLimitExceededException domain exception with PreviousResponseId property.

Files changed

  • EssentialCSharp.Chat.Shared/Services/AIChatService.cs: all service-level fixes
  • EssentialCSharp.Chat.Shared/Services/ConversationContextLimitExceededException.cs: new domain exception
  • EssentialCSharp.Web/Controllers/ChatController.cs: context-limit handling, SSE error write, logging

- Fix streaming tool continuations: track currentLegResponseId from
  StreamingResponseCreatedUpdate and clone options with that ID before
  calling ExecuteFunctionCallAsync, so the continuation correctly
  references the response that produced the function call (not the
  outer conversation root).

- Fix non-streaming loop token waste: after each tool-call response,
  advance PreviousResponseId to the new responseId and reset
  responseItems to [] so subsequent iterations only send new tool
  outputs — not the growing full history. This prevents quadratic
  token re-processing.

- Remove redundant functionCallItem from continuation inputs: when
  using previous_response_id chaining, the function call is already
  stored server-side in the referenced response; only the
  FunctionCallOutputResponseItem is new content.

- Add ConversationContextLimitExceededException domain exception and
  IsContextLengthError helper to detect context window overflow from
  ClientResultException (HTTP 400 with context_length_exceeded text).

- Handle context limit in ChatController: both /api/chat/message and
  /api/chat/stream return 400 with errorCode 'context_limit_exceeded'.

- Add CloneOptionsWithPreviousResponseId helper to avoid mutating
  shared options across concurrent streaming legs.

- Add ParseToolArguments helper to deduplicate JSON argument parsing
  in both streaming and non-streaming paths.

- Fix OPENAI001 pragma: extend the disable region to cover the full
  loop body including the ClientResult<OpenAIResponse> declaration.
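The token-waste fix in the second bullet above can be modeled as a toy loop (no SDK calls; a hypothetical counter shows what each request would carry):

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Each round: pretend the API returns a response id and one tool output.
// Advancing previousResponseId and resetting responseItems keeps every request
// linear in new content; without the reset, request i resends all prior outputs.
static List<int> ItemsSentPerRequest(int rounds)
{
    var sentCounts = new List<int>();
    string? previousResponseId = null;
    var responseItems = new List<string> { "user message" };
    for (var i = 0; i < rounds; i++)
    {
        sentCounts.Add(responseItems.Count);          // payload of this request
        previousResponseId = $"resp_{i}";             // advance the chain cursor...
        responseItems = new List<string> { $"tool output {i}" }; // ...then send only new items
    }
    return sentCounts;
}

Console.WriteLine(string.Join(",", ItemsSentPerRequest(4))); // prints "1,1,1,1"
```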
C1 - Two-phase streaming tool call pattern: buffer all function calls
during the stream, execute sequentially after stream completes, then
make ONE continuation with all outputs. Previously, N parallel tool
calls created N forked conversation branches.

C2 - Context length errors in streaming path: added RethrowContextLengthErrors
helper wrapper that puts try/catch only around MoveNextAsync, keeping
yield return outside. C# CS1626 prohibits yield inside try/catch, so
the wrapper is the only valid pattern for exception remapping in async
iterators.

H1 - Restore stream-write body in second catch in StreamMessage:
the SSE error event was lost in a prior edit.

H2 - Clone all relevant ResponseCreationOptions fields in
CloneOptionsWithPreviousResponseId: EndUserId, MaxOutputTokenCount,
TextOptions, TruncationMode, ParallelToolCallsEnabled,
StoredOutputEnabled, ToolChoice, Temperature, TopP, ServiceTier,
Metadata.

H3 - IsContextLengthError now parses structured JSON error code via
TryExtractErrorCode: handles status 413, code token_limit_exceeded,
and context_length_exceeded. Falls back to message substring match.

Bonus: wire EndUserId into CreateResponseOptionsAsync (SDK supports it).
Add LogMcpToolArgumentParseError logger message.
- strictModeEnabled: false for MCP tool registration — external MCP
  schemas are not guaranteed to satisfy OpenAI strict-mode constraints
  (all properties required, additionalProperties: false everywhere).
  A single non-conforming schema with strictModeEnabled: true would
  cause a 400 at registration time for ALL tools in the request.

- Guard ParseToolArguments in non-streaming path — JsonException on
  malformed model-generated arguments would previously bubble up as
  a 500. Now caught identically to the streaming path: logs warning
  and sends an error FunctionCallOutputResponseItem so the model can
  recover without aborting the loop.

- Guard currentLegResponseId null before tool-call continuation —
  if the API never emits StreamingResponseCreatedUpdate, proceeding
  with PreviousResponseId = null causes a confusing 400 from the API.
  Now throws InvalidOperationException with a diagnostic message.

- Log original ConversationContextLimitExceededException in mid-stream
  handler — previously wrapped in a new InvalidOperationException,
  losing the stack trace, PreviousResponseId, and inner exception.
Comment thread EssentialCSharp.Chat.Shared/Services/AIChatService.cs Fixed
Comment thread EssentialCSharp.Web/Controllers/ChatController.cs Fixed
@BenjaminMichaelis BenjaminMichaelis marked this pull request as ready for review May 9, 2026 07:11
Copilot AI review requested due to automatic review settings May 9, 2026 07:11
Replace bare catch with catch (Exception) in two best-effort
error-handling sites, with clarifying comments explaining why a
broad catch is correct.

- TryExtractErrorCode: Azure SDK response internals are outside
  our control; a narrow typed-catch list risks a latent crash in
  production error handling. catch (Exception) is intentional.

- StreamMessage SSE error-write guards (both): Kestrel throws
  IOException (ConnectionResetException), OperationCanceledException,
  or ObjectDisposedException on abrupt client disconnect. The bot's
  OperationCanceledException-only suggestion would miss IOException
  and cause an unhandled exception that masks the original error.
  Both catch blocks updated (bot only flagged one, the other is
  identical).
Contributor

Copilot AI left a comment


Pull request overview

Updates the Essential C# chat integration to align more closely with the Azure OpenAI Responses API’s tool-calling and streaming behaviors, focusing on correctness under tool-chaining, context-limit failures, and mid-stream error signaling.

Changes:

  • Reworked streaming tool-call handling to buffer tool calls and submit a single continuation with all tool outputs (avoids branched conversations).
  • Added a domain exception for context-window overflows and mapped Responses API context-limit failures (streaming + non-streaming) to that exception.
  • Improved controller behavior for context-limit errors (400 before response start; SSE error event mid-stream) and ensured responseId ownership is recorded for each streaming leg.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

  • EssentialCSharp.Chat.Shared/Services/AIChatService.cs: fixes streaming tool-call continuation logic; adds context-limit error detection/remapping; improves continuation option cloning and tool-argument parsing robustness.
  • EssentialCSharp.Chat.Shared/Services/ConversationContextLimitExceededException.cs: introduces a domain exception used to signal conversation context-window overflow (optionally includes the previous response id).
  • EssentialCSharp.Web/Controllers/ChatController.cs: handles the new domain exception in both non-streaming and streaming endpoints, including mid-stream SSE error writes and logging.

Comment on lines +651 to +656
catch (Exception)
{
// Best-effort extraction inside an error handler — catch all to guarantee we never
// throw from error-parsing logic. The Azure SDK response internals are outside our
// control and the exception surface can change across SDK versions.
}
Member Author


Keeping catch (Exception) here intentionally — this is a documented architectural decision, not a code smell.

Two reasons the proposed typed list can't safely replace it:

  1. FormatException is factually incorrect: JsonDocument.Parse() throws JsonException (not FormatException), TryGetProperty() returns a bool without throwing, and GetString() throws InvalidOperationException. FormatException would be dead code, which undermines confidence in the rest of the list.

  2. The list is inherently incomplete: ex.GetRawResponse() reaches into Azure SDK internals whose exception surface can change across SDK versions (as our comment notes). The realistic surface includes at minimum JsonException, InvalidOperationException, ObjectDisposedException, and potentially ArgumentException on certain buffer edge cases — but that's exactly the problem: we can't enumerate it reliably. A typed list that looks complete but isn't is more dangerous than an honest broad catch.

The method's contract is "never throw from inside an error handler." catch (Exception) is the correct tool for that contract.

Comment on lines +154 to +160
catch (Exception)
{
// Best-effort write to an already-streaming response. Kestrel can throw
// IOException (connection reset), OperationCanceledException, or
// ObjectDisposedException on abrupt client disconnect — swallow all to
// avoid masking the original exception.
}
Member Author


Keeping catch (Exception) here intentionally — the when filter would introduce a concrete regression risk in this pattern.

This inner try/catch sits inside an outer catch block that has already logged the original exception. If the when filter doesn't match (say, because Kestrel introduces a new transport exception type in a future ASP.NET Core version; ConnectionAbortedException and HttpProtocolException are both real types in newer ASP.NET Core), the unhandled exception propagates out of the outer catch block. In C#, that unwinds the stack with the new exception, potentially:

  • Losing the SSE response mid-frame
  • Creating confusing telemetry where the original (already-logged) exception is overshadowed

"Best-effort" here means unconditional: any write failure must be swallowed. The comment enumerates IOException, OperationCanceledException, and ObjectDisposedException as examples of what Kestrel throws, not as a complete contract. That's the right level of documentation. The when filter would enforce a false completeness claim.

Comment on lines +177 to +183
catch (Exception)
{
// Best-effort write to an already-streaming response. Kestrel can throw
// IOException (connection reset), OperationCanceledException, or
// ObjectDisposedException on abrupt client disconnect — swallow all to
// avoid masking the original exception.
}
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reasoning as the other SSE block above — keeping catch (Exception) intentionally.

This pattern appears in both mid-stream catch blocks. If the when filter misses an exception type (e.g., HttpProtocolException, ConnectionAbortedException, or any other type ASP.NET Core may throw on connection teardown in future versions), the exception would propagate out of the outer catch (Exception ex) block, masking the original error that was already logged. The best-effort semantics require unconditional swallow. catch (Exception) is correct here.

@BenjaminMichaelis BenjaminMichaelis merged commit de6d616 into main May 9, 2026
7 checks passed
@BenjaminMichaelis BenjaminMichaelis deleted the agents/ai-chat-review-guidelines-opus-gpt branch May 9, 2026 08:08