chat: fix Responses API tool-chaining, context limits, and streaming correctness #1080
Conversation
- Fix streaming tool continuations: track `currentLegResponseId` from `StreamingResponseCreatedUpdate` and clone options with that ID before calling `ExecuteFunctionCallAsync`, so the continuation correctly references the response that produced the function call (not the outer conversation root).
- Fix non-streaming loop token waste: after each tool-call response, advance `PreviousResponseId` to the new `responseId` and reset `responseItems` to `[]` so subsequent iterations only send new tool outputs — not the growing full history. This prevents quadratic token re-processing.
- Remove redundant `functionCallItem` from continuation inputs: when using `previous_response_id` chaining, the function call is already stored server-side in the referenced response; only the `FunctionCallOutputResponseItem` is new content.
- Add a `ConversationContextLimitExceededException` domain exception and an `IsContextLengthError` helper to detect context window overflow from `ClientResultException` (HTTP 400 with `context_length_exceeded` text).
- Handle context limit in `ChatController`: both `/api/chat/message` and `/api/chat/stream` return 400 with errorCode `context_limit_exceeded`.
- Add a `CloneOptionsWithPreviousResponseId` helper to avoid mutating shared options across concurrent streaming legs.
- Add a `ParseToolArguments` helper to deduplicate JSON argument parsing in both streaming and non-streaming paths.
- Fix OPENAI001 pragma: extend the disable region to cover the full loop body, including the `ClientResult<OpenAIResponse>` declaration.
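The second bullet's token-waste fix can be sketched as follows. This is a hypothetical simulation, not the real service code: `FakeResponse`, `Send`, and the item counts stand in for the Azure OpenAI Responses client, and each input item is counted as one "token" to show why resetting the input list after advancing `PreviousResponseId` keeps transmitted input linear rather than quadratic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SDK response type (not the real Azure OpenAI type).
public record FakeResponse(string Id, List<string> ToolCalls);

public static class ToolLoopDemo
{
    // Returns the total number of input items transmitted across all legs.
    public static int RunLoop()
    {
        int itemsSent = 0;
        int leg = 0;

        // Pretend server: prior legs are stored server-side, so with
        // previous_response_id chaining only NEW items are transmitted.
        // The first two legs request a tool call; the third is the final answer.
        FakeResponse Send(List<string> items, string? previousResponseId)
        {
            itemsSent += items.Count;
            leg++;
            return new FakeResponse($"resp_{leg}",
                leg < 3 ? new List<string> { "tool_call" } : new List<string>());
        }

        var responseItems = new List<string> { "system", "user" };
        string? previousResponseId = null;

        while (true)
        {
            var response = Send(responseItems, previousResponseId);
            if (response.ToolCalls.Count == 0) break;   // final answer, done

            previousResponseId = response.Id;       // chain off the latest response
            responseItems = new List<string>();     // send only NEW items next leg
            foreach (var call in response.ToolCalls)
                responseItems.Add($"output_of_{call}");
        }

        return itemsSent; // 2 + 1 + 1 = 4, not the quadratic 2 + 3 + 4 = 9
    }

    public static void Main() => Console.WriteLine(ToolLoopDemo.RunLoop());
}
```

Without the reset, each leg would resend the full accumulated history, so the cost of an N-tool conversation grows with the square of N.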
- C1 — Two-phase streaming tool call pattern: buffer all function calls during the stream, execute them sequentially after the stream completes, then make ONE continuation with all outputs. Previously, N parallel tool calls created N forked conversation branches.
- C2 — Context length errors in streaming path: added a `RethrowContextLengthErrors` helper wrapper that puts `try/catch` only around `MoveNextAsync`, keeping `yield return` outside. C# error CS1626 prohibits `yield return` inside `try/catch`, so the wrapper is the only valid pattern for exception remapping in async iterators.
- H1 — Restore the stream-write body in the second catch in `StreamMessage`: the SSE error event was lost in a prior edit.
- H2 — Clone all relevant `ResponseCreationOptions` fields in `CloneOptionsWithPreviousResponseId`: `EndUserId`, `MaxOutputTokenCount`, `TextOptions`, `TruncationMode`, `ParallelToolCallsEnabled`, `StoredOutputEnabled`, `ToolChoice`, `Temperature`, `TopP`, `ServiceTier`, `Metadata`.
- H3 — `IsContextLengthError` now parses the structured JSON error code via `TryExtractErrorCode`: handles status 413 and the `token_limit_exceeded` and `context_length_exceeded` codes. Falls back to message substring match.
- Bonus: wire `EndUserId` into `CreateResponseOptionsAsync` (the SDK supports it). Add a `LogMcpToolArgumentParseError` logger message.
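The CS1626 constraint in C2 can be illustrated with plain types. This is a sketch under stated assumptions: `ContextLimitException`, the message-based `when` filter, and the `int`-yielding source are illustrative stand-ins for the real `ClientResultException`, domain exception, and SDK streaming updates. The key shape is that `try/catch` wraps only `MoveNextAsync`, while `yield return` stays outside it, which is exactly what the compiler requires of an async iterator.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StreamRemapDemo
{
    // Stand-in for the real domain exception.
    public sealed class ContextLimitException : Exception
    {
        public ContextLimitException(string message) : base(message) { }
    }

    public static async IAsyncEnumerable<int> RethrowContextLengthErrors(
        IAsyncEnumerable<int> source)
    {
        await using var enumerator = source.GetAsyncEnumerator();
        while (true)
        {
            bool hasNext;
            try
            {
                // Only the advance is guarded; CS1626 forbids yielding in here.
                hasNext = await enumerator.MoveNextAsync();
            }
            catch (InvalidOperationException ex)
                when (ex.Message.Contains("context_length_exceeded"))
            {
                // Remap the transport-level failure to a domain exception.
                throw new ContextLimitException(
                    "conversation exceeded the context window");
            }
            if (!hasNext) yield break;
            yield return enumerator.Current; // legal: outside any try/catch
        }
    }

    // A source that yields two updates, then fails like an overflowing request.
    public static async IAsyncEnumerable<int> FailingSource()
    {
        yield return 1;
        yield return 2;
        await Task.Yield();
        throw new InvalidOperationException("HTTP 400: context_length_exceeded");
    }
}
```

Items already yielded before the failure still reach the consumer; only the terminal exception is remapped.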
- `strictModeEnabled: false` for MCP tool registration — external MCP schemas are not guaranteed to satisfy OpenAI strict-mode constraints (all properties required, `additionalProperties: false` everywhere). A single non-conforming schema with `strictModeEnabled: true` would cause a 400 at registration time for ALL tools in the request.
- Guard `ParseToolArguments` in the non-streaming path — a `JsonException` on malformed model-generated arguments would previously bubble up as a 500. Now caught identically to the streaming path: logs a warning and sends an error `FunctionCallOutputResponseItem` so the model can recover without aborting the loop.
- Guard against a null `currentLegResponseId` before the tool-call continuation — if the API never emits `StreamingResponseCreatedUpdate`, proceeding with `PreviousResponseId = null` causes a confusing 400 from the API. Now throws `InvalidOperationException` with a diagnostic message.
- Log the original `ConversationContextLimitExceededException` in the mid-stream handler — it was previously wrapped in a new `InvalidOperationException`, losing the stack trace, `PreviousResponseId`, and inner exception.
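The `ParseToolArguments` guard above can be sketched with `System.Text.Json`. The helper name follows the PR description, but the return shape (a plain dictionary, with `null` signaling malformed input) and `BuildToolOutput` are illustrative stand-ins for the SDK's response-item types.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ToolArgumentParsing
{
    // Returns the parsed arguments, or null when the model emitted malformed JSON.
    public static Dictionary<string, JsonElement>? ParseToolArguments(string rawArguments)
    {
        try
        {
            return JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(rawArguments);
        }
        catch (JsonException)
        {
            // Malformed model output: don't abort the tool loop with a 500.
            return null;
        }
    }

    // Stand-in for building the FunctionCallOutputResponseItem content:
    // on parse failure, return an error payload the model can see and retry from.
    public static string BuildToolOutput(string rawArguments)
    {
        var args = ParseToolArguments(rawArguments);
        return args is null
            ? "error: tool arguments were not valid JSON"
            : $"parsed {args.Count} argument(s)";
    }
}
```

Sending the error text back as a tool output keeps the loop alive, letting the model re-emit corrected arguments instead of surfacing an internal server error to the client.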
Replace bare `catch` with `catch (Exception)` in two best-effort error-handling sites, with clarifying comments explaining why a broad catch is correct.
- `TryExtractErrorCode`: Azure SDK response internals are outside our control; a narrow typed-catch list risks a latent crash in production error handling. `catch (Exception)` is intentional.
- `StreamMessage` SSE error-write guards (both): Kestrel throws `IOException` (`ConnectionResetException`), `OperationCanceledException`, or `ObjectDisposedException` on an abrupt client disconnect. The bot's `OperationCanceledException`-only suggestion would miss `IOException` and cause an unhandled exception that masks the original error. Both catch blocks updated (the bot only flagged one; the other is identical).
Pull request overview
Updates the Essential C# chat integration to align more closely with the Azure OpenAI Responses API’s tool-calling and streaming behaviors, focusing on correctness under tool-chaining, context-limit failures, and mid-stream error signaling.
Changes:
- Reworked streaming tool-call handling to buffer tool calls and submit a single continuation with all tool outputs (avoids branched conversations).
- Added a domain exception for context-window overflows and mapped Responses API context-limit failures (streaming + non-streaming) to that exception.
- Improved controller behavior for context-limit errors (400 before response start; SSE error event mid-stream) and ensured responseId ownership is recorded for each streaming leg.
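The buffer-then-continue reshaping in the first bullet can be sketched with stand-in types. This is illustrative only: `FunctionCall`, the `object` update stream, and the string outputs replace the SDK's streaming-update and response-item types, and the stream is synchronous here for brevity.

```csharp
using System;
using System.Collections.Generic;

public static class TwoPhaseDemo
{
    // Stand-in for the SDK's function-call response item.
    public record FunctionCall(string Name, string Arguments);

    public static List<string> HandleStream(
        IEnumerable<object> streamUpdates,
        Func<FunctionCall, string> executeTool)
    {
        // Phase 1: consume the stream, buffering tool calls instead of
        // firing a continuation per call (which forked the conversation).
        var buffered = new List<FunctionCall>();
        foreach (var update in streamUpdates)
        {
            if (update is FunctionCall call)
                buffered.Add(call);
        }

        // Phase 2: execute sequentially after the stream completes and
        // collect every output for a SINGLE continuation request.
        var continuationInputs = new List<string>();
        foreach (var call in buffered)
            continuationInputs.Add(executeTool(call));
        return continuationInputs;
    }
}
```

Because all outputs travel in one continuation, the model sees every tool result on a single conversation branch, matching the non-streaming behavior.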
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| EssentialCSharp.Chat.Shared/Services/AIChatService.cs | Fixes streaming tool-call continuation logic; adds context-limit error detection/remapping; improves continuation option cloning and tool-argument parsing robustness. |
| EssentialCSharp.Chat.Shared/Services/ConversationContextLimitExceededException.cs | Introduces a domain exception used to signal conversation context-window overflow (optionally includes previous response id). |
| EssentialCSharp.Web/Controllers/ChatController.cs | Handles the new domain exception in both non-streaming and streaming endpoints, including mid-stream SSE error writes and logging. |
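The controller-side mapping described in the table can be sketched as a plain function. This is a hypothetical reduction: the real code lives in `ChatController` and returns ASP.NET Core results (400 responses and SSE error events), while here the mapping is shown as exception to status code plus error code, with the exception class shape inferred from the PR description.

```csharp
using System;

// Domain exception shape per the PR description: carries the previous response id.
public sealed class ConversationContextLimitExceededException : Exception
{
    public string? PreviousResponseId { get; }

    public ConversationContextLimitExceededException(
        string message, string? previousResponseId = null, Exception? inner = null)
        : base(message, inner)
        => PreviousResponseId = previousResponseId;
}

public static class ChatErrorMapping
{
    // Map a failure to the HTTP status and errorCode the endpoints return.
    public static (int StatusCode, string ErrorCode) Map(Exception ex) =>
        ex is ConversationContextLimitExceededException
            ? (400, "context_limit_exceeded")   // client should start a new conversation
            : (500, "internal_error");
}
```

Keeping the mapping in one place means `/api/chat/message` and `/api/chat/stream` agree on the error code the front end keys off.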
```csharp
catch (Exception)
{
    // Best-effort extraction inside an error handler — catch all to guarantee we never
    // throw from error-parsing logic. The Azure SDK response internals are outside our
    // control and the exception surface can change across SDK versions.
}
```
Keeping `catch (Exception)` here intentionally — this is a documented architectural decision, not a code smell.
Two reasons the proposed typed list can't safely replace it:
- `FormatException` is factually incorrect — `JsonDocument.Parse()` throws `JsonException` (not `FormatException`), `TryGetProperty()` returns a `bool` without throwing, and `GetString()` throws `InvalidOperationException`. `FormatException` would be dead code, which undermines confidence in the rest of the list.
- The list is inherently incomplete — `ex.GetRawResponse()` reaches into Azure SDK internals whose exception surface can change across SDK versions (as our comment notes). The realistic surface includes at minimum `JsonException`, `InvalidOperationException`, `ObjectDisposedException`, and potentially `ArgumentException` on certain buffer edge cases — but that's exactly the problem: we can't enumerate it reliably. A typed list that looks complete but isn't is more dangerous than an honest broad catch.
The method's contract is "never throw from inside an error handler." `catch (Exception)` is the correct tool for that contract.
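The extraction this thread defends can be sketched with `System.Text.Json`. The JSON envelope (`{"error":{"code":...}}`) matches the shape the Azure OpenAI service returns, but the helper names and the exact code list follow this PR's description rather than any SDK API; the raw body is taken as a string here instead of being pulled from `ex.GetRawResponse()`.

```csharp
using System;
using System.Text.Json;

public static class ErrorCodeExtraction
{
    public static bool TryExtractErrorCode(string rawBody, out string? code)
    {
        code = null;
        try
        {
            using var doc = JsonDocument.Parse(rawBody);
            if (doc.RootElement.TryGetProperty("error", out var error) &&
                error.TryGetProperty("code", out var codeElement) &&
                codeElement.ValueKind == JsonValueKind.String)
            {
                code = codeElement.GetString();
                return code is not null;
            }
        }
        catch (Exception)
        {
            // Best-effort parsing inside an error handler: never throw from here.
        }
        return false;
    }

    public static bool IsContextLengthError(int status, string rawBody) =>
        status == 413
        || (TryExtractErrorCode(rawBody, out var code)
            && code is "context_length_exceeded" or "token_limit_exceeded")
        || rawBody.Contains("context_length_exceeded"); // substring fallback
}
```

Checking the structured `error.code` first keeps detection robust when the human-readable message wording changes, with the substring match retained only as a last resort.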
```csharp
catch (Exception)
{
    // Best-effort write to an already-streaming response. Kestrel can throw
    // IOException (connection reset), OperationCanceledException, or
    // ObjectDisposedException on abrupt client disconnect — swallow all to
    // avoid masking the original exception.
}
```
Keeping `catch (Exception)` here intentionally — the `when` filter would introduce a concrete regression risk in this pattern.
This inner try/catch sits inside an outer catch block that has already logged the original exception. If the `when` filter doesn't match (e.g., because Kestrel introduces a new transport exception type in a future ASP.NET Core version — `ConnectionAbortedException` and `HttpProtocolException` are both real types in newer ASP.NET Core), the unhandled exception propagates out of the outer catch block. In C#, that unwinds the stack with the new exception, potentially:
- Losing the SSE response mid-frame
- Creating confusing telemetry where the original (already-logged) exception is overshadowed
"Best-effort" here means unconditional: any write failure must be swallowed. The comment enumerates `IOException`, `OperationCanceledException`, and `ObjectDisposedException` as examples of what Kestrel throws, not as a complete contract. That's the right level of documentation. The `when` filter would enforce a false completeness claim.
```csharp
catch (Exception)
{
    // Best-effort write to an already-streaming response. Kestrel can throw
    // IOException (connection reset), OperationCanceledException, or
    // ObjectDisposedException on abrupt client disconnect — swallow all to
    // avoid masking the original exception.
}
```
Same reasoning as the other SSE block above — keeping `catch (Exception)` intentionally.
This pattern appears in both mid-stream catch blocks. If the `when` filter misses an exception type (e.g., `HttpProtocolException`, `ConnectionAbortedException`, or any other type ASP.NET Core may throw on connection teardown in future versions), the exception would propagate out of the outer `catch (Exception ex)` block, masking the original error that was already logged. The best-effort semantics require unconditional swallow. `catch (Exception)` is correct here.
Summary
Three rounds of review against the Azure OpenAI Responses API docs (using Claude Opus 4.6 and GPT-5.5 as parallel review subagents) identified and fixed several correctness bugs and robustness gaps in the AI chat implementation.
Critical fixes
C1 — Parallel streaming tool calls created forked conversation branches
Previously, `ProcessStreamingUpdatesAsync` fired a separate API continuation per tool call inline during the stream. With N parallel tool calls this created N independent conversation branches — each continuation only saw its own tool result, not the others.

Fix: Two-phase pattern — buffer all `FunctionCallResponseItem`s during the stream, execute them sequentially after the stream completes, then make one continuation with all outputs. Matches the non-streaming behavior.

C2 — Context length errors not caught in streaming path

A `ClientResultException` from a context-window overflow propagates through `await foreach`, but C# prohibits `yield return` inside a `try/catch` block (CS1626).

Fix: Added a `RethrowContextLengthErrors` helper that puts `try/catch` only around `MoveNextAsync`, with `yield return` outside — the only CS1626-safe pattern for exception remapping in async iterators.

High fixes
H1 — SSE error event lost on mid-stream exception
The second `catch (Exception ex)` block in `StreamMessage` was missing its `try { await Response.WriteAsync(...) } catch {}` body, so clients received no signal when the stream was interrupted.

H2 — `CloneOptionsWithPreviousResponseId` copied only 3 of 14 fields

Tool-call continuation legs inherited `Instructions`, `PreviousResponseId`, and `Tools` — but not `EndUserId`, `Temperature`, `TopP`, `TruncationMode`, `MaxOutputTokenCount`, `TextOptions`, `ParallelToolCallsEnabled`, `StoredOutputEnabled`, `ToolChoice`, `ServiceTier`, or `Metadata`. Continuations were silently using defaults for all of those.

H3 — Context length detection relied on substring matching only

Fix: `IsContextLengthError` now parses the structured JSON error body via `TryExtractErrorCode` and checks the `error.code` field first (`context_length_exceeded`, `token_limit_exceeded`). Also handles HTTP 413. Falls back to message substring matching.

Medium fixes (from second review pass)
- `ParseToolArguments` unguarded in non-streaming path — a `JsonException` on malformed model-generated arguments would bubble up as a 500. Now caught identically to the streaming path, returning an error `FunctionCallOutputResponseItem` so the model can recover.
- `strictModeEnabled: false` for MCP tool registration — the MCP .NET SDK generates schemas that don't satisfy OpenAI strict-mode constraints (optional params not in `required`, no `additionalProperties: false`, nullable union types). A single non-conforming schema with `strictModeEnabled: true` causes a 400 that blocks all tools.
- `currentLegResponseId` null guard — if the API never emits `StreamingResponseCreatedUpdate`, the continuation would fire with `PreviousResponseId = null`, causing a confusing 400. Now throws `InvalidOperationException` with a diagnostic message.
- Mid-stream handler now logs the original `ConversationContextLimitExceededException` — it was previously wrapped in a new `InvalidOperationException`, discarding the original stack trace and `PreviousResponseId`.
- `EndUserId` wired up — `CreateResponseOptionsAsync` now sets `options.EndUserId` (the SDK exposes it; the previous code discarded it). Enables Azure OpenAI abuse monitoring and Defender prompt-shield correlation.
- `LogMcpToolArgumentParseError` logger message added.
- `ConversationContextLimitExceededException` domain exception with `PreviousResponseId` property.

Files changed
- EssentialCSharp.Chat.Shared/Services/AIChatService.cs
- EssentialCSharp.Chat.Shared/Services/ConversationContextLimitExceededException.cs
- EssentialCSharp.Web/Controllers/ChatController.cs