fix: replace goroutine+channel with DoDeadline in MakeRequestWithContext#1925
fix: replace goroutine+channel with DoDeadline in MakeRequestWithContext#1925davidrudduck wants to merge 3 commits intomaximhq:mainfrom
Conversation
Fixes goroutine leak and potential use-after-free when contexts are cancelled before the fasthttp request completes. The previous pattern spawned a goroutine that continued writing to resp after the caller returned, which could corrupt the response buffer. Uses client.DoDeadline with the context's deadline instead. Falls back to a 5-minute safety-net timeout when no deadline is set (rare in practice since callers always set deadlines via NetworkConfig).
📝 WalkthroughSummary by CodeRabbit
WalkthroughReplaces goroutine-based request execution with a deadline-aware approach: uses Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
core/providers/utils/utils.go
Outdated
| // defaultRequestTimeout is a safety-net fallback used only when the context has no deadline. | ||
| // In normal operation, callers always set a deadline via NetworkConfig.DefaultRequestTimeoutInSeconds | ||
| // (default 30s), so this 5-minute ceiling should rarely be reached. | ||
| const defaultRequestTimeout = 5 * time.Minute |
There was a problem hiding this comment.
We dont need to/want to hardcode this. As we have a few users who have 10 mins as timeout in some cases. Can we nuke this somehow?
|
❤️ for the PR @davidrudduck |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@core/providers/utils/utils.go`:
- Around line 125-146: The two cancellation pathways in the error handling (the
ctx.Err() branch and the errors.Is(err, context.Canceled) branch) are confusing
and set differing IsBifrostError values; either consolidate into a single
canonical cancellation handling or add a clarifying comment and ensure both
branches set the same IsBifrostError value consistent with orchestrator retry
semantics (see BifrostError and executeRequestWithRetries); also change the
message built with fmt.Sprintf("Request cancelled or timed out by context: %v",
ctx.Err()) to a lowercase, punctuation-free error string per Go conventions
(e.g., "request cancelled or timed out by context: %v").
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 91af49c4-aeec-450d-9400-55d16f9fd31a
📒 Files selected for processing (1)
core/providers/utils/utils.go
| if err != nil { | ||
| // Check if the context was cancelled while the request was in flight | ||
| if ctx.Err() != nil { | ||
| return latency, &schemas.BifrostError{ | ||
| IsBifrostError: true, | ||
| Error: &schemas.ErrorField{ | ||
| Type: schemas.Ptr(schemas.RequestCancelled), | ||
| Message: fmt.Sprintf("Request cancelled or timed out by context: %v", ctx.Err()), | ||
| Error: ctx.Err(), | ||
| }, | ||
| } | ||
| // Check for DNS lookup and network errors after timeout checks | ||
| var opErr *net.OpError | ||
| var dnsErr *net.DNSError | ||
| if errors.As(err, &opErr) || errors.As(err, &dnsErr) { | ||
| return latency, &schemas.BifrostError{ | ||
| IsBifrostError: false, | ||
| Error: &schemas.ErrorField{ | ||
| Message: schemas.ErrProviderNetworkError, | ||
| Error: err, | ||
| }, | ||
| } | ||
| } | ||
| if errors.Is(err, context.Canceled) { | ||
| return latency, &schemas.BifrostError{ | ||
| IsBifrostError: false, | ||
| Error: &schemas.ErrorField{ | ||
| Type: schemas.Ptr(schemas.RequestCancelled), | ||
| Message: schemas.ErrRequestCancelled, | ||
| Error: err, | ||
| }, | ||
| } | ||
| // The HTTP request itself failed (e.g., connection error, fasthttp timeout). | ||
| } |
There was a problem hiding this comment.
Potential redundancy and inconsistent IsBifrostError values.
Two observations:
-
Redundant cancellation check: If
ctx.Err() != nil(line 127) returns aRequestCancellederror, the subsequenterrors.Is(err, context.Canceled)check (line 137) may be unreachable for the same scenario. However, there's a subtle difference:ctx.Err()catches context cancellation that happened during the request, whileerrors.Is(err, context.Canceled)catches when fasthttp itself returnscontext.Canceled(e.g., from a wrapped context). If this distinction is intentional, consider adding a comment to clarify. -
IsBifrostErrorinconsistency: Line 129 setsIsBifrostError: truewhile line 139 setsIsBifrostError: falsefor similar cancellation scenarios. Per theBifrostErrorsemantics and the orchestrator's retry logic inexecuteRequestWithRetries, this affects how errors are processed. Verify this is intentional—typically cancellation errors should have consistent classification. -
Error message capitalization: Line 132's message
"Request cancelled or timed out by context: %v"starts with uppercase. Go convention requires error strings to be lowercase with no trailing punctuation.
Suggested fix for error message capitalization
Error: &schemas.ErrorField{
Type: schemas.Ptr(schemas.RequestCancelled),
- Message: fmt.Sprintf("Request cancelled or timed out by context: %v", ctx.Err()),
+ Message: fmt.Sprintf("request cancelled or timed out by context: %v", ctx.Err()),
Error: ctx.Err(),
},As per coding guidelines: "Error strings must be lowercase with no trailing punctuation per Go conventions."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if err != nil { | |
| // Check if the context was cancelled while the request was in flight | |
| if ctx.Err() != nil { | |
| return latency, &schemas.BifrostError{ | |
| IsBifrostError: true, | |
| Error: &schemas.ErrorField{ | |
| Type: schemas.Ptr(schemas.RequestCancelled), | |
| Message: fmt.Sprintf("Request cancelled or timed out by context: %v", ctx.Err()), | |
| Error: ctx.Err(), | |
| }, | |
| } | |
| // Check for DNS lookup and network errors after timeout checks | |
| var opErr *net.OpError | |
| var dnsErr *net.DNSError | |
| if errors.As(err, &opErr) || errors.As(err, &dnsErr) { | |
| return latency, &schemas.BifrostError{ | |
| IsBifrostError: false, | |
| Error: &schemas.ErrorField{ | |
| Message: schemas.ErrProviderNetworkError, | |
| Error: err, | |
| }, | |
| } | |
| } | |
| if errors.Is(err, context.Canceled) { | |
| return latency, &schemas.BifrostError{ | |
| IsBifrostError: false, | |
| Error: &schemas.ErrorField{ | |
| Type: schemas.Ptr(schemas.RequestCancelled), | |
| Message: schemas.ErrRequestCancelled, | |
| Error: err, | |
| }, | |
| } | |
| // The HTTP request itself failed (e.g., connection error, fasthttp timeout). | |
| } | |
| if err != nil { | |
| // Check if the context was cancelled while the request was in flight | |
| if ctx.Err() != nil { | |
| return latency, &schemas.BifrostError{ | |
| IsBifrostError: true, | |
| Error: &schemas.ErrorField{ | |
| Type: schemas.Ptr(schemas.RequestCancelled), | |
| Message: fmt.Sprintf("request cancelled or timed out by context: %v", ctx.Err()), | |
| Error: ctx.Err(), | |
| }, | |
| } | |
| } | |
| if errors.Is(err, context.Canceled) { | |
| return latency, &schemas.BifrostError{ | |
| IsBifrostError: false, | |
| Error: &schemas.ErrorField{ | |
| Type: schemas.Ptr(schemas.RequestCancelled), | |
| Message: schemas.ErrRequestCancelled, | |
| Error: err, | |
| }, | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@core/providers/utils/utils.go` around lines 125 - 146, The two cancellation
pathways in the error handling (the ctx.Err() branch and the errors.Is(err,
context.Canceled) branch) are confusing and set differing IsBifrostError values;
either consolidate into a single canonical cancellation handling or add a
clarifying comment and ensure both branches set the same IsBifrostError value
consistent with orchestrator retry semantics (see BifrostError and
executeRequestWithRetries); also change the message built with
fmt.Sprintf("Request cancelled or timed out by context: %v", ctx.Err()) to a
lowercase, punctuation-free error string per Go conventions (e.g., "request
cancelled or timed out by context: %v").
Per maintainer feedback, remove the 5-minute hardcoded timeout. When the context has a deadline, use DoDeadline (goroutine-safe). When no deadline is set, fall back to client.Do which respects the client's own timeout configuration.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
core/providers/utils/utils.go (1)
118-137: Clarify the intentional distinction between the two cancellation paths.The two cancellation branches have different
IsBifrostErrorvalues:
- Line 120:
IsBifrostError: true— context was directly cancelled/timed out- Line 130:
IsBifrostError: false— error chain containscontext.Canceledbutctx.Err()is nilBoth paths set
Type: RequestCancelled, so retry and fallback logic will behave identically (break retry loop, skip fallbacks). However, theIsBifrostErrordistinction is subtle and could cause confusion during debugging.Consider adding a brief comment explaining when the second path (
errors.Is(err, context.Canceled)withctx.Err() == nil) would be reached—e.g., if a nested/derived context was cancelled while the outer context remains valid.📝 Suggested clarifying comment
if ctx.Err() != nil { return latency, &schemas.BifrostError{ IsBifrostError: true, Error: &schemas.ErrorField{ Type: schemas.Ptr(schemas.RequestCancelled), Message: fmt.Sprintf("request cancelled or timed out by context: %v", ctx.Err()), Error: ctx.Err(), }, } } + // This branch handles edge cases where the error chain contains context.Canceled + // but the parent context is still valid (e.g., a nested context was cancelled). + // IsBifrostError is false since this may indicate a transient provider-side issue. if errors.Is(err, context.Canceled) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/providers/utils/utils.go` around lines 118 - 137, Add a brief clarifying comment above the second cancellation branch (the block that checks errors.Is(err, context.Canceled)) explaining why IsBifrostError is set to false there versus true for the ctx.Err() check: note that the first branch (ctx.Err() != nil) is hit when the current context itself was cancelled/timed out, while the second branch is reached when the error chain contains context.Canceled (e.g., a nested/derived context or upstream operation was cancelled) even though the outer ctx remains non-nil; reference the symbols ctx.Err(), errors.Is(err, context.Canceled), IsBifrostError and schemas.BifrostError in the comment so future readers understand the intent and retry/fallback behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@core/providers/utils/utils.go`:
- Around line 118-137: Add a brief clarifying comment above the second
cancellation branch (the block that checks errors.Is(err, context.Canceled))
explaining why IsBifrostError is set to false there versus true for the
ctx.Err() check: note that the first branch (ctx.Err() != nil) is hit when the
current context itself was cancelled/timed out, while the second branch is
reached when the error chain contains context.Canceled (e.g., a nested/derived
context or upstream operation was cancelled) even though the outer ctx remains
non-nil; reference the symbols ctx.Err(), errors.Is(err, context.Canceled),
IsBifrostError and schemas.BifrostError in the comment so future readers
understand the intent and retry/fallback behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8561dd1e-30ee-49dc-9c85-58e53ed000c6
📒 Files selected for processing (1)
core/providers/utils/utils.go
|
Updated per your feedback — removed the hardcoded |
Summary
Fixes a goroutine leak and potential use-after-free in
MakeRequestWithContextby replacing the goroutine+channel pattern withclient.DoDeadline.The previous implementation spawned a goroutine calling
client.Do(req, resp)and usedselectto race it againstctx.Done(). When the context cancelled first, the goroutine continued running and could write torespafter the caller had already returned — causing use-after-free of the response buffer.Changes
client.Doin a goroutine withclient.DoDeadline(req, resp, deadline)— synchronous, no goroutine neededctx.Deadline(), falling back to a 5-minute safety-net timeoutctx.Err()check afterDoDeadlinereturns to detect in-flight cancellationType of change
Affected areas
How to test
The fix is in the request execution path — all existing provider tests exercise this code:
To verify goroutine leak is fixed, send requests with short timeouts and confirm goroutine count stabilizes (previously it would grow unbounded under cancellation).
Screenshots/Recordings
N/A
Breaking changes
Security considerations
Eliminates use-after-free of fasthttp response buffers when contexts cancel during in-flight requests.
Checklist