feat: add custom key selector and request latency tracking #449
Conversation
📝 Walkthrough (Summary by CodeRabbit)

Adds a pluggable key selector (configurable, defaults to weighted random), stores the selected key in request context, migrates provider queues and ChannelMessage handling to pointer channels, and standardizes per-request latency propagation as int64 milliseconds across providers and streaming paths.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant Client
participant Bifrost
participant KeySelector
participant ProviderQueue
participant Provider
Note over Client,Bifrost: Incoming request (model, provider)
Client->>Bifrost: handleRequest(ctx, req)
Bifrost->>Bifrost: determine eligible keys (apply provider constraints)
alt multiple eligible keys
Bifrost->>KeySelector: KeySelector(ctx, keys, provider, model)
KeySelector-->>Bifrost: selectedKey
else single eligible key
Bifrost-->>Bifrost: selectedKey = sole key
end
Bifrost->>Bifrost: ctx = context.WithValue(BifrostContextKeySelectedKey, selectedKey)
Bifrost->>ProviderQueue: enqueue *ChannelMessage(ctx, req)
ProviderQueue-->>Provider: dequeue *ChannelMessage
    Provider->>Provider: makeRequestWithContext(...) (measures latency)
Provider-->>Bifrost: response, latency
Bifrost-->>Client: BifrostResponse{ ExtraFields.Latency = latency.Milliseconds() }
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60–90 minutes
Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
This stack of pull requests is managed by Graphite. Learn more about stacking.
Force-pushed ccd9a7a to 07d817e (Compare)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/bifrost.go (1)
1535-1557: Fix potential panic and precision loss in WeightedRandomKeySelector

Current logic:

- Truncates weights via `int(key.Weight*100)`, causing precision loss and zeroing small weights.
- Panics if `totalWeight <= 0` (`rand.Intn(0)`).
- Doesn't handle negative/zero weights explicitly.

Refactor to float64 accumulation, ignore non-positive weights, and fall back to uniform selection if all weights are non-positive:
```diff
-func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
-	// Use a weighted random selection based on key weights
-	totalWeight := 0
-	for _, key := range keys {
-		totalWeight += int(key.Weight * 100) // Convert float to int for better performance
-	}
-
-	// Use a fast random number generator
-	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	randomValue := randomSource.Intn(totalWeight)
-
-	// Select key based on weight
-	currentWeight := 0
-	for _, key := range keys {
-		currentWeight += int(key.Weight * 100)
-		if randomValue < currentWeight {
-			return key, nil
-		}
-	}
-
-	// Fallback to first key if something goes wrong
-	return keys[0], nil
-}
+func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
+	// Accumulate only positive weights
+	var sum float64
+	weights := make([]float64, len(keys))
+	for i, k := range keys {
+		w := k.Weight
+		if w > 0 {
+			weights[i] = w
+			sum += w
+		}
+	}
+
+	rng := rand.New(rand.NewSource(time.Now().UnixNano()))
+
+	// If all weights are non-positive, choose uniformly
+	if sum <= 0 {
+		return keys[rng.Intn(len(keys))], nil
+	}
+
+	// Weighted draw in float space
+	target := rng.Float64() * sum
+	var acc float64
+	for i, w := range weights {
+		acc += w
+		if target < acc {
+			return keys[i], nil
+		}
+	}
+	// Numerical safety fallback
+	return keys[len(keys)-1], nil
+}
```
🧹 Nitpick comments (6)
core/schemas/bifrost.go (3)
15-16: Public KeySelector API: clarify ctx mutability intent

The pointer to context is consistent with existing interfaces here, but it's atypical in Go. If the intent is to allow selectors to enrich ctx, keep it; otherwise, prefer `context.Context` (non-pointer). Add a brief doc comment stating the expectation.
24-28: Config hook looks good; add short doc/example

`KeySelector` on BifrostConfig is clear. Consider a brief example in README/docs to show custom selector usage and defaults.
105-107: Selected key context key is useful; verify visibility in non-stream flows

Setting `BifrostContextKeySelectedKey` enables downstream access. Note: in non-stream requests, PostHooks run in `tryRequest` with the original ctx, not the worker-updated one, so plugins may not see this value. See the companion comment in `core/bifrost.go` for options.

core/bifrost.go (3)
93-95: Sane defaulting

Falling back to `WeightedRandomKeySelector` when not supplied is good. Optional: log at Info once during init when defaulting, to aid config debugging.
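A minimal sketch of that logging, assuming the config field name and the logger's printf-style Info signature mentioned elsewhere in this review:

```go
// Sketch only: config.KeySelector and the logger's Info method are
// assumptions based on snippets in this review, not verified API.
if config.KeySelector == nil {
	logger.Info("no KeySelector configured; defaulting to WeightedRandomKeySelector")
	config.KeySelector = WeightedRandomKeySelector
}
```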
1184-1185: Selected key ID is set on ctx inside worker; PostHooks for non-stream won't see it

Non-stream PostHooks in `tryRequest` use the pre-worker ctx, so plugins can't read `bifrost-key-selected`. Options (pick one):

- Move key selection earlier (prior to enqueuing) and set the ctx there; or
- Return the selected key ID alongside the response and attach it to ctx before PostHooks; or
- Add `SelectedKeyID *string` to `BifrostResponseExtraFields` and populate it here, so PostHooks can read it from `resp`.

I can draft a small patch for option 3 if you prefer; a rough sketch follows this list.
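For illustration only (`SelectedKeyID` is the proposed field from this comment, not an existing one, and the JSON tag is assumed):

```go
package schemas

// Sketch of option 3. Existing fields elided; SelectedKeyID and its JSON tag
// are proposals from this review comment, not current API.
type BifrostResponseExtraFields struct {
	// ... existing fields (Provider, Params, Latency, ...) ...
	SelectedKeyID *string `json:"selected_key_id,omitempty"`
}
```

In the worker, once a key is chosen, something like `result.ExtraFields.SelectedKeyID = &selectedKey.ID` (assuming `Key` exposes an ID) would make the value visible to PostHooks via the response.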
1526-1533: Delegation to pluggable selector: good; add guardrails

If a custom selector returns an empty/zero-value key or error, upstream handling is fine. Consider documenting that selectors must return a key from the provided `keys` slice and may not mutate it.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- core/bifrost.go (5 hunks)
- core/schemas/bifrost.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
core/schemas/bifrost.go (2)
- core/schemas/account.go (2): Key (8-16), Account (47-63)
- core/schemas/plugin.go (1): Plugin (43-63)

core/bifrost.go (2)
- core/schemas/account.go (2): Account (47-63), Key (8-16)
- core/schemas/bifrost.go (3): KeySelector (15-15), BifrostContextKeySelectedKey (106-106), ModelProvider (42-42)
🔇 Additional comments (1)
core/bifrost.go (1)
35-49: Struct wiring LGTM

Adding `keySelector` to Bifrost is straightforward and keeps responsibilities local.
Force-pushed 07d817e to 34c87f1 (Compare)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
core/schemas/bifrost.go (1)
729-737: Potential breaking change: Latency type/semantics changed

Latency changed from pointer float64 to int64 and from pointer semantics (nil-omit) to value (0-omit). This alters the public Go API and may break downstream code that imports schemas directly. The PR declares "No breaking changes," so consider keeping the original field and adding a new ms field instead, or reverting.

Option A (non-breaking): keep old field and add a new one

```diff
 type BifrostResponseExtraFields struct {
 	Provider    ModelProvider      `json:"provider"`
 	Params      ModelParameters    `json:"model_params"`
-	Latency     int64              `json:"latency,omitempty"` // in milliseconds
+	Latency     *float64           `json:"latency,omitempty"`    // preserved for BC (ms)
+	LatencyMs   int64              `json:"latency_ms,omitempty"` // preferred going forward
 	ChatHistory *[]BifrostMessage  `json:"chat_history,omitempty"`
 	BilledUsage *BilledLLMUsage    `json:"billed_usage,omitempty"`
 	ChunkIndex  int                `json:"chunk_index"`
 	RawResponse interface{}        `json:"raw_response,omitempty"`
 	CacheDebug  *BifrostCacheDebug `json:"cache_debug,omitempty"`
 }
```

Option B (revert to preserve BC):

```diff
-	Latency int64 `json:"latency,omitempty"` // in milliseconds
+	Latency *float64 `json:"latency,omitempty"` // in milliseconds
```

If you pick A, please populate both fields for one release cycle; if B, update providers to assign a float64 pointer.
core/providers/utils.go (2)
138-178: Critical: data race/use-after-free on ctx cancellation in makeRequestWithContext

Early return on ctx.Done() while client.Do is still running in a goroutine can race with callers' deferred fasthttp.ReleaseRequest/ReleaseResponse, leading to use-after-free and memory corruption. Call sites acquire resp/req and defer release before invoking this function; returning early frees the objects while the goroutine continues writing into them.

Fix by avoiding the background goroutine and using DoDeadline (respecting ctx.Deadline) so the call remains synchronous with a well-defined object lifetime.

Apply this diff:

```diff
 func makeRequestWithContext(ctx context.Context, client *fasthttp.Client, req *fasthttp.Request, resp *fasthttp.Response) (time.Duration, *schemas.BifrostError) {
-	startTime := time.Now()
-	errChan := make(chan error, 1)
-
-	go func() {
-		// client.Do is a blocking call.
-		// It will send an error (or nil for success) to errChan when it completes.
-		errChan <- client.Do(req, resp)
-	}()
-
-	select {
-	case <-ctx.Done():
-		// Context was cancelled (e.g., deadline exceeded or manual cancellation).
-		// Calculate latency even for cancelled requests
-		latency := time.Since(startTime)
-		return latency, &schemas.BifrostError{
-			IsBifrostError: true,
-			Error: schemas.ErrorField{
-				Type:    Ptr(schemas.RequestCancelled),
-				Message: fmt.Sprintf("Request cancelled or timed out by context: %v", ctx.Err()),
-				Error:   ctx.Err(),
-			},
-		}
-	case err := <-errChan:
-		// The fasthttp.Do call completed.
-		// Calculate latency for both successful and failed requests
-		latency := time.Since(startTime)
-		if err != nil {
-			// The HTTP request itself failed (e.g., connection error, fasthttp timeout).
-			return latency, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: schemas.ErrorField{
-					Message: schemas.ErrProviderRequest,
-					Error:   err,
-				},
-			}
-		}
-		// HTTP request was successful from fasthttp's perspective (err is nil).
-		// The caller should check resp.StatusCode() for HTTP-level errors (4xx, 5xx).
-		return latency, nil
-	}
+	startTime := time.Now()
+	var err error
+	if deadline, ok := ctx.Deadline(); ok {
+		err = client.DoDeadline(req, resp, deadline)
+	} else {
+		err = client.Do(req, resp)
+	}
+	latency := time.Since(startTime)
+	if err != nil {
+		return latency, &schemas.BifrostError{
+			IsBifrostError: false,
+			Error: schemas.ErrorField{
+				Message: schemas.ErrProviderRequest,
+				Error:   err,
+			},
+		}
+	}
+	return latency, nil
 }
```

Optional alternative (if early return on cancel is required): allocate/copy local fasthttp.Request/Response inside this function and never touch caller-owned req/resp from the goroutine, then switch to DoDeadline for safety. That change is larger; the above is the simplest safe fix.
138-178: Critical: don't release fasthttp Request/Response while client.Do may still be using them

makeRequestWithContext spawns client.Do(req, resp) in a goroutine and may return on ctx.Done while that goroutine is still running. Call sites AcquireRequest()/AcquireResponse() then defer ReleaseRequest()/ReleaseResponse() before calling it; if ctx cancels, the caller's deferred releases can free pooled objects still in use -> data race / use-after-free.

Fix options (choose one; a sketch of the first follows this list):

- Make makeRequestWithContext wait for the client.Do goroutine to finish before returning on ctx.Done (read errChan even after ctx.Done) so call-site defers are safe.
- OR transfer ownership: have makeRequestWithContext own and Release the req/resp after client.Do completes, and remove call-site defers (update all providers).

Examples of the affected pattern: core/providers/mistral.go:112–115, core/providers/openai.go:128–131, core/providers/azure.go:175–178 (the pattern appears across many providers). Do not leave Acquire/Release paired around an async call that can outlive the caller.
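A minimal sketch of the first option, with error wrapping into schemas.BifrostError elided; the point is simply that the function never returns while client.Do still owns req/resp:

```go
package providers

import (
	"context"
	"time"

	"github.com/valyala/fasthttp"
)

// Sketch of fix option 1: on cancellation, wait for client.Do to finish
// before returning, so the caller's deferred ReleaseRequest/ReleaseResponse
// can never free pooled objects that are still in use.
func makeRequestWithContextSafe(ctx context.Context, client *fasthttp.Client, req *fasthttp.Request, resp *fasthttp.Response) (time.Duration, error) {
	start := time.Now()
	errChan := make(chan error, 1)
	go func() { errChan <- client.Do(req, resp) }()

	select {
	case <-ctx.Done():
		<-errChan // block until the goroutine releases ownership of req/resp
		return time.Since(start), ctx.Err()
	case err := <-errChan:
		return time.Since(start), err
	}
}
```

The trade-off is that a cancelled request still waits for the in-flight HTTP call to finish; that is the cost of keeping the pooled objects' lifetime well defined without touching every call site.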
core/providers/azure.go (1)
164-172: Update default Azure OpenAI API version

File: core/providers/azure.go (lines 164–172). The default "2024-02-01" is outdated; change to "2024-10-21" or switch to the /openai/v1 GA endpoints (no api-version) to ensure compatibility.
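If it helps, a hedged sketch of applying the bumped default; the `APIVersion` field name on `AzureKeyConfig` is an assumption for illustration, not verified against this codebase:

```go
// Sketch only: prefer an explicitly configured api-version, else the newer
// default. Field names on AzureKeyConfig are assumed.
apiVersion := "2024-10-21"
if key.AzureKeyConfig != nil && key.AzureKeyConfig.APIVersion != nil {
	apiVersion = *key.AzureKeyConfig.APIVersion
}
url := fmt.Sprintf("%s/openai/deployments/%s/chat/completions?api-version=%s",
	endpoint, deployment, apiVersion)
```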
core/bifrost.go (4)

127-138: Fix: `for range <int>` doesn't compile before Go 1.22.

Use a counted loop for pool prewarm and worker spin-up.

Apply these diffs:

```diff
-	for range config.InitialPoolSize {
+	for i := 0; i < config.InitialPoolSize; i++ {
```

```diff
-	for range providerConfig.ConcurrencyAndBufferSize.Concurrency {
+	for i := 0; i < providerConfig.ConcurrencyAndBufferSize.Concurrency; i++ {
```

```diff
-	for range providerConfig.ConcurrencyAndBufferSize.Concurrency {
+	for i := 0; i < providerConfig.ConcurrencyAndBufferSize.Concurrency; i++ {
```

Also applies to: 422-427, 733-738
1227-1231: Bug: non-streaming path never retries.

Unconditional `break` prevents retries for retriable server/network errors.

Apply this diff to align with the streaming path:

```diff
-	result, bifrostError = handleProviderRequest(provider, req, key, req.Type)
-	if bifrostError != nil {
-		break // Don't retry client errors
-	}
+	result, bifrostError = handleProviderRequest(provider, req, key, req.Type)
+	if bifrostError != nil && !bifrostError.IsBifrostError {
+		break // Client error: don't retry
+	}
```
394-399: Risk: send on closed channel during concurrency update.

Closing `oldQueue` before swapping the map entry can panic senders that still hold `oldQueue`. Swap first, then close.

Apply this minimal, safer ordering:

```diff
-	// Step 3: Close the old queue to signal workers to stop
-	close(oldQueue)
-
-	// Step 4: Atomically replace the queue
-	bifrost.requestQueues.Store(providerKey, newQueue)
+	// Step 3: Atomically replace the queue so new requests use the new queue
+	bifrost.requestQueues.Store(providerKey, newQueue)
+
+	// Step 4: Close the old queue to signal workers to stop
+	close(oldQueue)
```

Follow-up: consider counting/migrating late senders or gating sends via an indirection (e.g., a thin queue wrapper) to fully eliminate this class of race.
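One shape such an indirection could take, as a hypothetical sketch (not code from this PR); the non-blocking send mirrors the drop-on-full behavior noted elsewhere in this review:

```go
package main

import "sync"

// gatedQueue is a hypothetical thin wrapper: all sends go through Send, so
// the queue can be swapped and closed without a send-on-closed-channel panic.
type gatedQueue[T any] struct {
	mu     sync.RWMutex
	ch     chan T
	closed bool
}

// Send reports false if the queue is closed or full; the caller can retry
// against the replacement queue or drop the message.
func (q *gatedQueue[T]) Send(msg T) bool {
	q.mu.RLock()
	defer q.mu.RUnlock()
	if q.closed {
		return false
	}
	select {
	case q.ch <- msg:
		return true
	default:
		return false
	}
}

// Close marks the queue closed before closing the channel. In-flight Send
// calls hold the read lock, so close(q.ch) cannot race with them.
func (q *gatedQueue[T]) Close() {
	q.mu.Lock()
	defer q.mu.Unlock()
	if !q.closed {
		q.closed = true
		close(q.ch)
	}
}
```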
1537-1559: Harden weighted selection: zero/negative/very small weights can panic.

`Intn(totalWeight)` panics if `totalWeight <= 0`, and `int(weight*100)` can round small weights to 0.

Apply this robust, float-based selector:

```diff
-func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
-	// Use a weighted random selection based on key weights
-	totalWeight := 0
-	for _, key := range keys {
-		totalWeight += int(key.Weight * 100) // Convert float to int for better performance
-	}
-	// Use a fast random number generator
-	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	randomValue := randomSource.Intn(totalWeight)
-	// Select key based on weight
-	currentWeight := 0
-	for _, key := range keys {
-		currentWeight += int(key.Weight * 100)
-		if randomValue < currentWeight {
-			return key, nil
-		}
-	}
-	// Fallback to first key if something goes wrong
-	return keys[0], nil
-}
+func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
+	// Treat non-positive weights as 1 to avoid starvation and panics.
+	total := 0.0
+	for _, k := range keys {
+		w := k.Weight
+		if w <= 0 {
+			w = 1
+		}
+		total += w
+	}
+	rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
+	if total <= 0 {
+		return keys[rnd.Intn(len(keys))], nil // uniform fallback
+	}
+	r := rnd.Float64() * total
+	acc := 0.0
+	for _, k := range keys {
+		w := k.Weight
+		if w <= 0 {
+			w = 1
+		}
+		acc += w
+		if r < acc {
+			return k, nil
+		}
+	}
+	return keys[len(keys)-1], nil
+}
```
🧹 Nitpick comments (12)
core/schemas/bifrost.go (1)
15-16: Public KeySelector API: clarify contract (empty keys, determinism) and doc comment

Please add a brief Godoc on KeySelector covering the points below; a sketch follows the list.

- Behavior when keys is empty (must return error).
- Whether selection should be deterministic given the same inputs (recommended for reproducibility).

Also confirm we consistently pass a non-nil ctx pointer at call sites.
core/providers/ollama.go (1)

159-162: Align assignment with BC if Latency remains a pointer

If you keep Latency as a pointer float64 (BC), update the assignment:

```diff
-	response.ExtraFields.Latency = latency.Milliseconds()
+	latMs := float64(latency.Milliseconds())
+	response.ExtraFields.Latency = &latMs
```

transports/bifrost-http/main.go (1)
416-416: Reduce log verbosity for plugin init

These are Info-level and will print on every start; consider Debug to avoid noisy logs in prod:

```diff
-	logger.Info("initializing plugin: %s", plugin.Name)
+	logger.Debug("initializing plugin: %s", plugin.Name)
-	logger.Info("successfully initialized maxim plugin")
+	logger.Debug("successfully initialized maxim plugin")
```

Also applies to: 432-433
core/providers/groq.go (2)
120-131: Guard Authorization header to avoid sending empty bearer

If key.Value can be empty, skip the header to avoid sending "Bearer " with no token:

```diff
-	req.Header.Set("Authorization", "Bearer "+key.Value)
+	if key.Value != "" {
+		req.Header.Set("Authorization", "Bearer "+key.Value)
+	}
```
156-159: Align latency assignment with pointer-based field (if reverting for BC)

```diff
-	response.ExtraFields.Latency = latency.Milliseconds()
+	latMs := float64(latency.Milliseconds())
+	response.ExtraFields.Latency = &latMs
```

core/providers/sgl.go (1)
165-167: Align latency assignment with pointer-based field (if reverting for BC)

```diff
-	response.ExtraFields.Latency = latency.Milliseconds()
+	latMs := float64(latency.Milliseconds())
+	response.ExtraFields.Latency = &latMs
```

core/providers/utils.go (1)
133-139: Update comment to reflect new synchronous behavior

After removing the goroutine, this function no longer "stops waiting and returns early on ctx cancellation." It should instead document that it respects ctx.Deadline via DoDeadline and does not preemptively return on cancel without a deadline.
core/providers/mistral.go (1)
138-141: Avoid shadowing bifrostErr in error paths.

The short variable declaration `bifrostErr := ...` shadows the outer `bifrostErr`, which is easy to misread.

```diff
-	bifrostErr := handleProviderAPIError(resp, &errorResp)
-	bifrostErr.Error.Message = fmt.Sprintf("Mistral error: %v", errorResp)
-	return nil, bifrostErr
+	apiErr := handleProviderAPIError(resp, &errorResp)
+	apiErr.Error.Message = fmt.Sprintf("Mistral error: %v", errorResp)
+	return nil, apiErr
```

Apply the same rename in the Embedding block.

Also applies to: 235-238
core/providers/bedrock.go (1)
1028-1034: Use url.PathEscape instead of url.QueryEscape for path segments.

QueryEscape is intended for query strings and encodes spaces as '+'. For path components (ARN/inference profile), PathEscape is a better fit and matches usage elsewhere (e.g., embeddings/stream).

```diff
-encodedModelIdentifier := url.QueryEscape(fmt.Sprintf("%s/%s", *key.BedrockKeyConfig.ARN, inferenceProfileId))
+encodedModelIdentifier := url.PathEscape(fmt.Sprintf("%s/%s", *key.BedrockKeyConfig.ARN, inferenceProfileId))
```

core/bifrost.go (3)
1450-1471: Avoid retaining request context between pooled messages.

`releaseChannelMessage` doesn't clear `msg.Context` or the request payload, risking accidental retention.

Apply:

```diff
 func (bifrost *Bifrost) releaseChannelMessage(msg *ChannelMessage) {
 	// Put channels back in pools
 	bifrost.responseChannelPool.Put(msg.Response)
 	bifrost.errorChannelPool.Put(msg.Err)
@@
 	if msg.ResponseStream != nil {
@@
 	}
 	// Clear references and return to pool
+	msg.Context = nil
+	msg.BifrostRequest = schemas.BifrostRequest{}
 	msg.Response = nil
 	msg.ResponseStream = nil
 	msg.Err = nil
 	bifrost.channelMessagePool.Put(msg)
 }
```
1484-1523: Key selection + context propagation LGTM, with one ask.

Selector integration and storing `BifrostContextKeySelectedKey` in context look correct. Please document the context key contract in schemas for plugin/consumer use; a reading sketch follows.

Also applies to: 1528-1535, 1186-1187
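For plugin authors, reading the value back could look like this sketch (it assumes, per this review, that the selected Key's ID is stored as a string under the context key, and that PostHooks receive a `*context.Context`; confirm against core/schemas/bifrost.go):

```go
// Inside a plugin PostHook (sketch; the hook's exact signature is elided).
// ctx is the *context.Context that Bifrost passes to PostHooks.
if v := (*ctx).Value(schemas.BifrostContextKeySelectedKey); v != nil {
	if keyID, ok := v.(string); ok {
		fmt.Printf("request served with key %s\n", keyID)
	}
}
```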
316-316: Logger interface supports varargs — optional cleanup recommended.

Logger methods are defined as `(msg string, args ...any)`, so `fmt.Sprintf(...)` is redundant (harmless); prefer `logger.<Level>("format %v", args...)` for consistency (e.g. core/bifrost.go:316, 332-333, 391-392, 417-421, 442).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- core/bifrost.go (20 hunks)
- core/providers/anthropic.go (7 hunks)
- core/providers/azure.go (10 hunks)
- core/providers/bedrock.go (16 hunks)
- core/providers/cerebras.go (4 hunks)
- core/providers/cohere.go (4 hunks)
- core/providers/gemini.go (4 hunks)
- core/providers/groq.go (2 hunks)
- core/providers/mistral.go (4 hunks)
- core/providers/ollama.go (2 hunks)
- core/providers/openai.go (8 hunks)
- core/providers/openrouter.go (4 hunks)
- core/providers/parasail.go (2 hunks)
- core/providers/sgl.go (2 hunks)
- core/providers/utils.go (5 hunks)
- core/schemas/bifrost.go (5 hunks)
- transports/bifrost-http/main.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (15)
core/providers/ollama.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/parasail.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/utils.go (1)
- core/schemas/bifrost.go (1): BifrostError (773-782)

core/providers/gemini.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/mistral.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/openrouter.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/cerebras.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/azure.go (3)
- core/schemas/account.go (2): Key (8-16), AzureKeyConfig (20-24)
- core/schemas/bifrost.go (2): BifrostError (773-782), Azure (46-46)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (24-24)

core/providers/cohere.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/groq.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/sgl.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/schemas/bifrost.go (2)
- core/schemas/account.go (2): Key (8-16), Account (47-63)
- core/schemas/plugin.go (1): Plugin (43-63)

core/bifrost.go (2)
- core/schemas/account.go (2): Account (47-63), Key (8-16)
- core/schemas/bifrost.go (3): KeySelector (15-15), ModelProvider (42-42), BifrostContextKeySelectedKey (108-108)
- core/schemas/provider.go (3): ConcurrencyAndBufferSize (51-54), Provider (202-221), ProviderConfig (143-151)

core/providers/bedrock.go (2)
- core/schemas/account.go (1): BedrockKeyConfig (36-43)
- core/schemas/bifrost.go (2): BifrostError (773-782), ErrorField (790-797)

core/providers/anthropic.go (2)
- core/schemas/bifrost.go (1): BifrostError (773-782)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (24-24)
🔇 Additional comments (16)
core/schemas/bifrost.go (2)
24-28: Config wiring looks good

KeySelector on BifrostConfig is clear and aligns with the defaulting to WeightedRandomKeySelector. LGTM.

107-113: Selected-key context key: verify propagation across all paths

Storing only the Key.ID (not the secret) is good. Please verify this is set for both streaming and non-streaming flows and is observable in hooks/telemetry as intended.

core/providers/parasail.go (1)

128-128: Latency capture and propagation — LGTM

Capturing the duration from makeRequestWithContext and surfacing it via ExtraFields.Latency is correct and consistent with the new schema.

Also applies to: 158-158

core/providers/cohere.go (1)

246-246: Latency propagation — LGTM

Both ChatCompletion and Embedding correctly capture latency and expose it as milliseconds in ExtraFields.

Also applies to: 336-336, 666-666, 717-717

core/providers/cerebras.go (1)

149-149: Latency propagation — LGTM

Both text and chat paths correctly capture latency and expose it as milliseconds in ExtraFields.

Also applies to: 212-213, 289-289

core/providers/openrouter.go (1)

145-145: Latency propagation — LGTM

Text and chat paths correctly capture and surface latency in ExtraFields.

Also applies to: 196-196, 244-244, 269-269

core/providers/openai.go (1)

145-145: Latency propagation — LGTM

ChatCompletion, Embedding, Speech, and Transcription correctly capture latency and set ExtraFields.Latency (ms). Consistent with schema changes.

Also applies to: 178-179, 308-308, 616-617, 830-830, 865-865

core/providers/mistral.go (1)

128-132: Latency propagation looks correct.

makeRequestWithContext now returns latency and it's stored in ExtraFields.Latency (ms). Matches the pattern used across providers.

Also applies to: 157-158
core/providers/gemini.go (2)

171-175: Latency capture and exposure: LGTM.

latency from makeRequestWithContext is written to ExtraFields.Latency (ms). Implementation is consistent with other providers.

Also applies to: 209-211

1090-1094: Good: completeRequest returns latency and sets it on the base response.

Creates a single place to standardize latency for Gemini non-streaming calls. Looks solid.

Also applies to: 1112-1115

core/providers/bedrock.go (2)

262-263: Latency measurement integrated correctly.

completeRequest now returns (body, latency). Callers attach latency to responses. No correctness issues spotted.

Also applies to: 309-313, 358-359

839-851: Latency attached to responses: LGTM.

Non-streaming Text/Chat and Titan/Cohere embedding paths consistently set ExtraFields.Latency (ms).

Also applies to: 1123-1125, 1296-1299, 1364-1367

core/providers/anthropic.go (1)

386-387: Latency propagation in Text/Chat: LGTM.

ExtraFields.Latency is set from the measured duration in both paths.

Also applies to: 441-442

core/providers/azure.go (1)

233-234: Latency wiring across Text/Chat/Embedding: LGTM.

latency is captured and ExtraFields.Latency is set consistently.

Also applies to: 322-324, 377-379

core/bifrost.go (2)

721-739: Provider queue initialization LGTM.

Channel typing to `chan *ChannelMessage` and worker startup wiring are sound (post counted-loop fix).

1001-1019: Enqueue path LGTM.

Non-blocking fast path, cancellation handling, and drop-on-full behavior are well balanced.

Also applies to: 1119-1137
Force-pushed 34c87f1 to d80798a (Compare)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/bifrost.go (1)
1938-1960: Fix potential panic and improve random number generation.

The implementation has several issues:

- Critical: If all keys have weights that round to 0 (e.g., all weights < 0.01), `totalWeight` will be 0, causing `rand.Intn(0)` to panic on line 1947.
- Major: Creating a new `rand.New()` on each call (line 1946) is inefficient and can produce predictable patterns when multiple requests arrive within the same nanosecond.
- Minor: Converting `key.Weight * 100` to `int` loses precision for small weights (e.g., 0.001 becomes 0).

Apply this diff to address these issues:

```diff
+var (
+	// Package-level random source for key selection
+	keySelectionRand = rand.New(rand.NewSource(time.Now().UnixNano()))
+	keySelectionMu   sync.Mutex
+)
+
 func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
+	// Defensive: handle empty keys
+	if len(keys) == 0 {
+		return schemas.Key{}, fmt.Errorf("no keys provided for selection")
+	}
+
 	// Use a weighted random selection based on key weights
-	totalWeight := 0
+	totalWeight := 0.0
 	for _, key := range keys {
-		totalWeight += int(key.Weight * 100) // Convert float to int for better performance
+		if key.Weight < 0 {
+			return schemas.Key{}, fmt.Errorf("negative weight not allowed: %f", key.Weight)
+		}
+		totalWeight += key.Weight
 	}
 
-	// Use a fast random number generator
-	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	randomValue := randomSource.Intn(totalWeight)
+	// If all weights are 0 or very small, use equal probability
+	if totalWeight < 0.0001 {
+		keySelectionMu.Lock()
+		selectedIndex := keySelectionRand.Intn(len(keys))
+		keySelectionMu.Unlock()
+		return keys[selectedIndex], nil
+	}
+
+	// Use package-level random source with mutex for thread safety
+	keySelectionMu.Lock()
+	randomValue := keySelectionRand.Float64() * totalWeight
+	keySelectionMu.Unlock()
 
 	// Select key based on weight
-	currentWeight := 0
+	currentWeight := 0.0
 	for _, key := range keys {
-		currentWeight += int(key.Weight * 100)
+		currentWeight += key.Weight
 		if randomValue < currentWeight {
 			return key, nil
 		}
 	}
 
 	// Fallback to first key if something goes wrong
 	return keys[0], nil
 }
```
♻️ Duplicate comments (2)
core/providers/anthropic.go (1)
170-174: Copy the response body before releasing fasthttp.Response.

Line 173 returns the slice from `resp.Body()`, but the deferred `fasthttp.ReleaseResponse(resp)` frees that buffer before the caller can use it, leading to use-after-free. Copy into a new slice before returning.

```diff
-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
```

core/providers/azure.go (1)
139-143: Avoid returning resp.Body() after the response is released.

Line 142 hands back the fasthttp buffer directly; once the deferred `fasthttp.ReleaseResponse(resp)` runs, callers see corrupted data. Copy the body before returning.

```diff
-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
```
🧹 Nitpick comments (1)
framework/streaming/audio.go (1)
70-70: Prefer Duration.Milliseconds() for readability

The built-in helper conveys intent and keeps precision handling in one place. Please mirror the same tweak in the transcription/chat accumulators for consistency.

Apply this diff:

```diff
-	data.Latency = accumulator.FinalTimestamp.Sub(accumulator.StartTimestamp).Nanoseconds() / 1e6
+	data.Latency = accumulator.FinalTimestamp.Sub(accumulator.StartTimestamp).Milliseconds()
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- core/bifrost.go (20 hunks)
- core/providers/anthropic.go (9 hunks)
- core/providers/azure.go (10 hunks)
- core/providers/bedrock.go (18 hunks)
- core/providers/cohere.go (11 hunks)
- core/providers/gemini.go (12 hunks)
- core/providers/openai.go (12 hunks)
- core/providers/utils.go (6 hunks)
- core/schemas/bifrost.go (5 hunks)
- core/schemas/providers/bedrock/chat.go (0 hunks)
- framework/streaming/audio.go (1 hunks)
- framework/streaming/chat.go (1 hunks)
- framework/streaming/transcription.go (1 hunks)
- framework/streaming/types.go (2 hunks)
💤 Files with no reviewable changes (1)
- core/schemas/providers/bedrock/chat.go
🧰 Additional context used
🧬 Code graph analysis (11)
core/providers/anthropic.go (2)
- core/schemas/bifrost.go (1): BifrostError (450-459)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (26-26)

core/schemas/bifrost.go (1)
- core/schemas/account.go (1): Key (8-17)

core/providers/openai.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

framework/streaming/types.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

framework/streaming/audio.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

framework/streaming/transcription.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/azure.go (2)
- core/schemas/account.go (2): Key (8-17), AzureKeyConfig (27-31)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (26-26)

core/bifrost.go (3)
- core/schemas/bifrost.go (3): KeySelector (15-15), ModelProvider (46-46), BifrostContextKeySelectedKey (118-118)
- core/schemas/provider.go (1): Provider (194-219)
- core/schemas/account.go (1): Key (8-17)

core/providers/utils.go (2)
- core/schemas/bifrost.go (1): BifrostError (450-459)
- core/schemas/provider.go (1): ErrProviderRequestTimedOut (22-22)

framework/streaming/chat.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/gemini.go (2)
- plugins/mocker/main.go (1): Latency (125-129)
- core/schemas/bifrost.go (1): BifrostError (450-459)
🔇 Additional comments (5)
core/bifrost.go (5)
49-49: LGTM: KeySelector field integration.

The keySelector field is properly initialized with a sensible default (WeightedRandomKeySelector) when not provided in the config. This enables custom key selection strategies while maintaining backward compatibility.

Also applies to: 90-90, 95-97

1876-1936: LGTM: Pluggable key selector integration.

The key selection logic correctly delegates to the configurable `keySelector` when multiple keys support the model, enabling custom selection strategies (round-robin, least-used, etc.). Error handling is appropriate.

1539-1539: LGTM: Selected key tracking via context.

Storing the selected key ID in the request context enables downstream tracking and debugging of which API key was used for each request.

635-635: LGTM: Pointer-based queue refactor.

The refactor from `chan ChannelMessage` to `chan *ChannelMessage` is consistently applied across all queue operations. This reduces memory allocations and improves performance by avoiding value copies when sending messages through channels.

Also applies to: 640-640, 656-656, 1020-1020, 1047-1047, 1053-1053, 1066-1066, 1082-1082, 1345-1345, 1357-1357, 1461-1461, 1473-1473, 1504-1504, 1969-1969

1370-1370: LGTM: Context pointer passed to PostHooks.

Correctly passing `&msg.Context` to `RunPostHooks` as the function expects a context pointer, allowing plugins to potentially modify the context during post-processing.

Also applies to: 1379-1379
Force-pushed d80798a to e13e8a7 (Compare)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/bifrost.go (1)
1939-1960: Guard against zero/empty weights in WeightedRandomKeySelector.

`rand.Intn(totalWeight)` panics when `totalWeight == 0`. This happens if all keys have weight 0 (default) or weights round down after `int(key.Weight * 100)`. Add a fallback (e.g., treat weights as equal when `totalWeight <= 0`) before calling Intn.
♻️ Duplicate comments (1)
core/providers/azure.go (1)
139-143: Copy the fasthttp response body before releasing it.

`resp.Body()` aliases fasthttp's internal buffer, but `defer fasthttp.ReleaseResponse(resp)` frees that storage when `completeRequest` returns. Callers then read from memory that fasthttp may reuse, causing a use-after-free. Copy the bytes before returning. Same bug as flagged previously.

```diff
-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- core/bifrost.go (20 hunks)
- core/providers/anthropic.go (9 hunks)
- core/providers/azure.go (10 hunks)
- core/providers/bedrock.go (18 hunks)
- core/providers/cohere.go (11 hunks)
- core/providers/gemini.go (12 hunks)
- core/providers/openai.go (12 hunks)
- core/providers/utils.go (6 hunks)
- core/schemas/bifrost.go (5 hunks)
- core/schemas/providers/bedrock/chat.go (0 hunks)
- framework/streaming/audio.go (1 hunks)
- framework/streaming/chat.go (1 hunks)
- framework/streaming/transcription.go (1 hunks)
- framework/streaming/types.go (2 hunks)
- transports/bifrost-http/main.go (1 hunks)
💤 Files with no reviewable changes (1)
- core/schemas/providers/bedrock/chat.go
🚧 Files skipped from review as they are similar to previous changes (4)
- core/schemas/bifrost.go
- framework/streaming/chat.go
- core/providers/openai.go
- core/providers/cohere.go
🧰 Additional context used
🧬 Code graph analysis (8)
framework/streaming/audio.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/utils.go (1)
- core/schemas/bifrost.go (1): BifrostError (450-459)

core/providers/anthropic.go (2)
- core/schemas/bifrost.go (1): BifrostError (450-459)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (26-26)

framework/streaming/types.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/bedrock.go (2)
- core/schemas/account.go (1): Key (8-17)
- core/schemas/bifrost.go (2): BifrostError (450-459), ErrorField (467-474)

core/providers/azure.go (1)
- core/schemas/account.go (2): Key (8-17), AzureKeyConfig (27-31)

framework/streaming/transcription.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/bifrost.go (2)
- core/schemas/bifrost.go (3): KeySelector (15-15), ModelProvider (46-46), BifrostContextKeySelectedKey (118-118)
- core/schemas/account.go (1): Key (8-17)
Force-pushed e13e8a7 to c511c26 (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (2)
core/providers/azure.go (1)
139-143: Copy the response body before releasing fasthttp buffers.

`resp.Body()` aliases fasthttp's internal buffer. Once the deferred `fasthttp.ReleaseResponse(resp)` runs, that backing storage is returned to the pool, so callers read freed memory. Copy the bytes before returning. Same bug as flagged previously.

```diff
-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
```

core/providers/gemini.go (1)
resp.Body()aliases fasthttp’s internal buffer. Once the deferredfasthttp.ReleaseResponse(resp)runs, that backing storage is returned to the pool, so callers read freed memory. Copy the bytes before returning. Same bug as flagged previously.- // Read the response body - body := resp.Body() - - return body, latency, nil + // Read the response body + body := append([]byte(nil), resp.Body()...) + + return body, latency, nilcore/providers/gemini.go (1)
835-846: Clone the fasthttp body before releasing the response.

`resp.Body()` points to fasthttp's pooled buffer. Once the deferred `fasthttp.ReleaseResponse(resp)` at line 810 executes, the returned slice becomes invalid. Copy the bytes prior to using them in unmarshal calls and returning rawResponse.

Apply this diff to clone the body:

```diff
-	responseBody := resp.Body()
+	responseBody := append([]byte(nil), resp.Body()...)
```
🧹 Nitpick comments (1)
core/bifrost.go (1)
1944-1966: Consider reusing a rand source or using math/rand/v2 for better performance.

The `WeightedRandomKeySelector` creates a new `rand.New()` source on every invocation. While this works correctly and avoids concurrency issues, it incurs unnecessary allocation overhead. Consider using a shared concurrent-safe random source (e.g., a `sync.Pool` of `*rand.Rand` instances) or migrating to `math/rand/v2`, which provides concurrent-safe global functions.

Example using math/rand/v2 (if available):

```go
import "math/rand/v2"

func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
	totalWeight := 0
	for _, key := range keys {
		totalWeight += int(key.Weight * 100)
	}

	randomValue := rand.IntN(totalWeight)

	currentWeight := 0
	for _, key := range keys {
		currentWeight += int(key.Weight * 100)
		if randomValue < currentWeight {
			return key, nil
		}
	}
	return keys[0], nil
}
```

Alternatively, use a sync.Pool for rand sources:

```go
import (
	cryptorand "crypto/rand"
	"encoding/binary"
	"math/rand"
	"sync"
)

var randPool = sync.Pool{
	New: func() interface{} {
		var seed int64
		binary.Read(cryptorand.Reader, binary.LittleEndian, &seed)
		return rand.New(rand.NewSource(seed))
	},
}

func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
	totalWeight := 0
	for _, key := range keys {
		totalWeight += int(key.Weight * 100)
	}

	r := randPool.Get().(*rand.Rand)
	randomValue := r.Intn(totalWeight)
	randPool.Put(r)

	currentWeight := 0
	for _, key := range keys {
		currentWeight += int(key.Weight * 100)
		if randomValue < currentWeight {
			return key, nil
		}
	}
	return keys[0], nil
}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- core/bifrost.go (20 hunks)
- core/providers/anthropic.go (9 hunks)
- core/providers/azure.go (10 hunks)
- core/providers/bedrock.go (18 hunks)
- core/providers/cohere.go (11 hunks)
- core/providers/gemini.go (12 hunks)
- core/providers/openai.go (12 hunks)
- core/providers/utils.go (6 hunks)
- core/schemas/bifrost.go (5 hunks)
- core/schemas/providers/bedrock/chat.go (0 hunks)
- framework/streaming/audio.go (1 hunks)
- framework/streaming/chat.go (1 hunks)
- framework/streaming/transcription.go (1 hunks)
- framework/streaming/types.go (2 hunks)
- transports/bifrost-http/main.go (1 hunks)
💤 Files with no reviewable changes (1)
- core/schemas/providers/bedrock/chat.go
✅ Files skipped from review due to trivial changes (1)
- transports/bifrost-http/main.go
🚧 Files skipped from review as they are similar to previous changes (4)
- core/providers/openai.go
- framework/streaming/transcription.go
- core/providers/anthropic.go
- core/providers/cohere.go
🧰 Additional context used
🧬 Code graph analysis (9)
framework/streaming/audio.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/azure.go (3)
- core/schemas/account.go (2): Key (8-17), AzureKeyConfig (27-31)
- core/schemas/bifrost.go (2): BifrostError (536-545), Azure (50-50)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (26-26)

core/providers/bedrock.go (3)
- core/schemas/account.go (1): Key (8-17)
- core/schemas/bifrost.go (2): BifrostError (536-545), ErrorField (553-560)
- core/schemas/provider.go (1): ErrProviderRequestTimedOut (22-22)

core/schemas/bifrost.go (2)
- core/schemas/account.go (1): Key (8-17)
- ui/lib/types/config.ts (1): ModelProvider (165-167)

core/providers/utils.go (2)
- core/schemas/bifrost.go (1): BifrostError (536-545)
- core/schemas/provider.go (1): ErrProviderRequestTimedOut (22-22)

core/bifrost.go (2)
- core/schemas/bifrost.go (3): KeySelector (15-15), ModelProvider (46-46), BifrostContextKeySelectedKey (118-118)
- core/schemas/account.go (1): Key (8-17)

core/providers/gemini.go (2)
- plugins/mocker/main.go (1): Latency (125-129)
- core/schemas/bifrost.go (1): BifrostError (536-545)

framework/streaming/types.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

framework/streaming/chat.go (1)
- plugins/mocker/main.go (1): Latency (125-129)
🔇 Additional comments (5)
core/providers/gemini.go (2)
122-122: LGTM!

The latency tracking integration is correct. The `makeRequestWithContext` call now returns latency, which is properly captured and used downstream.

160-160: LGTM!

Latency is correctly converted to milliseconds and stored in `ExtraFields.Latency` for non-streaming chat completion responses.

core/bifrost.go (3)

90-97: LGTM!

The `keySelector` field is properly initialized from the config with a sensible default fallback to `WeightedRandomKeySelector` when not provided. This provides a good balance between flexibility and usability.

641-646: LGTM!

The refactor from `chan ChannelMessage` to `chan *ChannelMessage` is applied consistently across queue types, function signatures, type assertions, and operations. This change reduces memory allocation overhead for message passing.

Also applies to: 1026-1026, 1053-1053, 1510-1510, 1975-1975

1545-1545: LGTM!

Storing the selected key ID in the request context enables tracking and debugging of which key was used for each request. This is a valuable addition for observability.
Force-pushed c511c26 to ebd668c (Compare)
Force-pushed ebd668c to 555c4fc (Compare)
Merge activity
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/providers/openai.go (1)
988-1005: Copy speech audio before releasing fasthttp.Response.

`fasthttp.Response.Body()` points to an internal buffer that is recycled once `fasthttp.ReleaseResponse(resp)` runs (triggered by the deferred release at function exit). Returning that slice leads to use-after-free/corrupted audio. Copy the bytes before releasing the response.

```diff
-	audioData := resp.Body()
+	audioData := append([]byte(nil), resp.Body()...)
```
🧹 Nitpick comments (1)
core/bifrost.go (1)
1944-1966: Inefficient random number generation on every call.

Creating a new `rand.New(rand.NewSource(time.Now().UnixNano()))` on each call is inefficient and can produce poor randomness when called rapidly (same nanosecond). Consider using a package-level random generator or the global `rand` functions.

Apply this diff to use the global random generator:

```diff
-	// Use a fast random number generator
-	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	randomValue := randomSource.Intn(totalWeight)
+	// Use the global random number generator
+	randomValue := rand.Intn(totalWeight)
```

Note: The global `rand` package is already seeded by default in Go 1.20+. For earlier versions, you can seed it once during package initialization.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
- core/bifrost.go (20 hunks)
- core/providers/anthropic.go (9 hunks)
- core/providers/azure.go (10 hunks)
- core/providers/bedrock.go (18 hunks)
- core/providers/cohere.go (11 hunks)
- core/providers/gemini.go (12 hunks)
- core/providers/openai.go (12 hunks)
- core/providers/utils.go (6 hunks)
- core/schemas/bifrost.go (5 hunks)
- core/schemas/providers/bedrock/chat.go (0 hunks)
- framework/streaming/audio.go (1 hunks)
- framework/streaming/chat.go (1 hunks)
- framework/streaming/transcription.go (1 hunks)
- framework/streaming/types.go (2 hunks)
- tests/core-providers/custom_test.go (1 hunks)
- tests/core-providers/scenarios/chat_completion_stream.go (1 hunks)
- tests/core-providers/scenarios/complete_end_to_end.go (1 hunks)
- tests/core-providers/scenarios/cross_provider_scenarios.go (1 hunks)
- tests/core-providers/scenarios/end_to_end_tool_calling.go (1 hunks)
- tests/core-providers/scenarios/multi_turn_conversation.go (1 hunks)
- tests/core-providers/scenarios/multiple_images.go (1 hunks)
- tests/core-providers/scenarios/utils.go (5 hunks)
- transports/bifrost-http/main.go (1 hunks)
💤 Files with no reviewable changes (1)
- core/schemas/providers/bedrock/chat.go
✅ Files skipped from review due to trivial changes (1)
- transports/bifrost-http/main.go
🚧 Files skipped from review as they are similar to previous changes (5)
- framework/streaming/transcription.go
- core/providers/cohere.go
- core/schemas/bifrost.go
- framework/streaming/chat.go
- framework/streaming/audio.go
🧰 Additional context used
🧬 Code graph analysis (9)
framework/streaming/types.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

tests/core-providers/scenarios/utils.go (1)
- ui/lib/types/logs.ts (1): ChatMessageContent (111-111)

core/providers/bedrock.go (2)
- core/schemas/bifrost.go (2): BifrostError (536-545), ErrorField (553-560)
- core/schemas/provider.go (1): ErrProviderRequestTimedOut (22-22)

core/providers/azure.go (3)
- core/schemas/account.go (2): Key (8-17), AzureKeyConfig (27-31)
- core/schemas/bifrost.go (2): BifrostError (536-545), Azure (50-50)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (26-26)

core/providers/anthropic.go (2)
- core/schemas/bifrost.go (1): BifrostError (536-545)
- core/schemas/provider.go (1): ErrProviderJSONMarshaling (26-26)

core/providers/openai.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/providers/utils.go (2)
- core/schemas/bifrost.go (1): BifrostError (536-545)
- core/schemas/provider.go (1): ErrProviderRequestTimedOut (22-22)

core/providers/gemini.go (1)
- plugins/mocker/main.go (1): Latency (125-129)

core/bifrost.go (3)
- core/schemas/bifrost.go (3): KeySelector (15-15), ModelProvider (46-46), BifrostContextKeySelectedKey (118-118)
- core/schemas/provider.go (1): Provider (194-219)
- core/schemas/account.go (1): Key (8-17)
🔇 Additional comments (19)
tests/core-providers/scenarios/utils.go (4)
221-224: LGTM!

The migration to pointer-based `Content` is correct. Creating `&schemas.ChatMessageContent{...}` ensures the Content field is properly initialized as a pointer.

240-245: LGTM!

The pointer-based `Content` initialization is correct for image messages with ContentBlocks.

269-271: LGTM!

The pointer-based `Content` initialization is correct for tool messages.

306-344: LGTM!

The nil-safety checks are comprehensive and correctly handle the pointer-based `Content` field. The logic properly checks for nil before accessing `ContentStr` and `ContentBlocks`, and assembles text from blocks when needed.

tests/core-providers/scenarios/multiple_images.go (1)

31-31: LGTM!

The pointer-based `Content` initialization aligns with the broader migration to pointer types for message content.

tests/core-providers/scenarios/chat_completion_stream.go (1)

150-155: LGTM!

The pointer-based construction of `Message` and `Content` in the consolidated response is correct and safe. This aligns with the pointer-based content model used throughout the test suite.

tests/core-providers/custom_test.go (1)

108-110: LGTM!

The pointer-based `Content` initialization is consistent with the broader test suite migration to pointer types.

core/providers/azure.go (4)

65-143: LGTM! Latency tracking and use-after-free fix correctly implemented.

The function signature correctly returns latency alongside the response body, all error paths return appropriate latency values (0 for early errors, measured latency for later errors), and the response body is copied before releasing the fasthttp response to prevent use-after-free.

156-171: LGTM! Latency correctly captured and propagated.

The latency from `completeRequest` is correctly captured and assigned to `response.ExtraFields.Latency` in milliseconds.

246-263: LGTM! Latency correctly captured and propagated.

The latency from `completeRequest` is correctly captured and assigned to `response.ExtraFields.Latency` in milliseconds.

302-316: LGTM! Latency correctly captured and propagated.

The latency from `completeRequest` is correctly captured and assigned to `response.ExtraFields.Latency` in milliseconds.

core/providers/gemini.go (4)

122-160: LGTM! Latency correctly captured and propagated.

The latency from `makeRequestWithContext` is correctly captured and assigned to `response.ExtraFields.Latency` in milliseconds.

256-267: LGTM! Latency correctly captured and propagated.

The latency from `completeRequest` is correctly captured and assigned to `bifrostResponse.ExtraFields.Latency` in milliseconds.

517-528: LGTM! Latency correctly captured and propagated.

The latency from `completeRequest` is correctly captured and assigned to `bifrostResponse.ExtraFields.Latency` in milliseconds.

804-850: LGTM! Latency tracking and use-after-free fix correctly implemented.

The function correctly captures latency from `makeRequestWithContext`, copies the response body before releasing the fasthttp response to prevent use-after-free, and propagates latency through all return paths.

core/bifrost.go (4)

49-50: LGTM! Custom key selector correctly initialized.

The `keySelector` field is correctly added to the `Bifrost` struct and initialized from config with a sensible default to `WeightedRandomKeySelector`.

Also applies to: 90-97

641-641: LGTM! Queue type migration to pointers is consistent.

All queue type changes from `chan ChannelMessage` to `chan *ChannelMessage` are applied consistently across queue creation, transfers, send operations, and shutdown.

Also applies to: 646-646, 662-662, 1026-1026, 1053-1053, 1059-1059, 1072-1072, 1088-1088, 1351-1351, 1363-1363, 1467-1467, 1479-1479, 1975-1975

1545-1545: LGTM! Selected key ID correctly stored in context.

The selected key ID is correctly stored in the request context for tracking and debugging.

1882-1942: LGTM! Key selector delegation correctly implemented.

The function correctly delegates to the pluggable `keySelector` after filtering keys by model and deployment support.
tests/core-providers/scenarios/complete_end_to_end.go

```diff
 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
-		chatConversationHistory = append(chatConversationHistory, choice.Message)
+		chatConversationHistory = append(chatConversationHistory, *choice.Message)
 	}
 }
```
🛠️ Refactor suggestion | 🟠 Major
Add nil check before dereferencing pointer.
Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.
Apply this diff to add nil safety:
```diff
 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
+		if choice.Message != nil {
 			chatConversationHistory = append(chatConversationHistory, *choice.Message)
+		}
 	}
 }
```
🤖 Prompt for AI Agents
In tests/core-providers/scenarios/complete_end_to_end.go around lines 120 to 124, the loop dereferences choice.Message without checking for nil, which can panic; update the loop to check that choice.Message is not nil before appending (i.e., if choice.Message != nil { chatConversationHistory = append(chatConversationHistory, *choice.Message) }) so only non-nil messages are dereferenced and added.
tests/core-providers/scenarios/cross_provider_scenarios.go

```diff
 // Use Chat API choices
 for _, choice := range response.Choices {
-	conversationHistory = append(conversationHistory, choice.Message)
+	conversationHistory = append(conversationHistory, *choice.Message)
 }
```
🛠️ Refactor suggestion | 🟠 Major
Add nil check before dereferencing pointer.
Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.
Apply this diff to add nil safety:
```diff
 } else {
 	// Use Chat API choices
 	for _, choice := range response.Choices {
+		if choice.Message != nil {
 			conversationHistory = append(conversationHistory, *choice.Message)
+		}
 	}
 }
```

🤖 Prompt for AI Agents
In tests/core-providers/scenarios/cross_provider_scenarios.go around lines 697 to 701, the loop dereferences choice.Message without checking for nil, which can cause a panic; update the loop to check if choice.Message != nil before appending to conversationHistory and skip (or handle) cases where Message is nil so you never dereference a nil pointer.
tests/core-providers/scenarios/end_to_end_tool_calling.go

```diff
 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
-		chatConversationMessages = append(chatConversationMessages, choice.Message)
+		chatConversationMessages = append(chatConversationMessages, *choice.Message)
 	}
 }
```
🛠️ Refactor suggestion | 🟠 Major
Add nil check before dereferencing pointer.
Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.
Apply this diff to add nil safety:
```diff
 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
+		if choice.Message != nil {
 			chatConversationMessages = append(chatConversationMessages, *choice.Message)
+		}
 	}
 }
```
🤖 Prompt for AI Agents
In tests/core-providers/scenarios/end_to_end_tool_calling.go around lines 132 to 136, the loop dereferences choice.Message directly, which can panic if Message is nil; add a nil check (if choice.Message != nil) before appending to chatConversationMessages so only non-nil Message pointers are dereferenced and appended.
tests/core-providers/scenarios/multi_turn_conversation.go

```diff
 if response1.Choices != nil {
 	for _, choice := range response1.Choices {
-		messages2 = append(messages2, choice.Message)
+		messages2 = append(messages2, *choice.Message)
 	}
 }
```
🛠️ Refactor suggestion | 🟠 Major
Add nil check before dereferencing pointer.
Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.
Apply this diff to add nil safety:
```diff
 if response1.Choices != nil {
 	for _, choice := range response1.Choices {
+		if choice.Message != nil {
 			messages2 = append(messages2, *choice.Message)
+		}
 	}
 }
```
🤖 Prompt for AI Agents
In tests/core-providers/scenarios/multi_turn_conversation.go around lines 75 to 79, the loop dereferences choice.Message without checking for nil, which can panic; update the loop to skip nil Message pointers by adding a guard (if choice.Message == nil { continue }) before appending, so only non-nil messages are dereferenced and appended to messages2.

Summary
Implement a customizable key selection strategy for Bifrost by introducing a KeySelector interface that allows users to define their own logic for selecting API keys.
Changes
- New `KeySelector` function type in schemas to allow custom key selection strategies
- New `keySelector` field to the Bifrost struct to store the custom selector
- Default `WeightedRandomKeySelector` function

Type of change
Affected areas
How to test
You can test by implementing a custom key selector and passing it to the BifrostConfig:
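For example, a round-robin selector wired into the config might look like the sketch below. The selector signature mirrors WeightedRandomKeySelector from the review snippets; the import paths and the Init call shape are assumptions to be checked against the actual package:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"

	bifrost "github.com/maximhq/bifrost/core"
	"github.com/maximhq/bifrost/core/schemas"
)

var rrCounter uint64

// RoundRobinKeySelector cycles through the eligible keys in order.
// The signature mirrors WeightedRandomKeySelector from the review snippets.
func RoundRobinKeySelector(ctx *context.Context, keys []schemas.Key, provider schemas.ModelProvider, model string) (schemas.Key, error) {
	if len(keys) == 0 {
		return schemas.Key{}, errors.New("no keys available")
	}
	n := atomic.AddUint64(&rrCounter, 1)
	return keys[(n-1)%uint64(len(keys))], nil
}

func main() {
	var account schemas.Account // your Account implementation

	// Init shape assumed; adjust to the actual constructor.
	client, err := bifrost.Init(schemas.BifrostConfig{
		Account:     account,
		KeySelector: RoundRobinKeySelector,
	})
	if err != nil {
		fmt.Println("init failed:", err)
		return
	}
	_ = client // issue requests; the selector decides which key each one uses
}
```

Omitting KeySelector should fall back to the default WeightedRandomKeySelector.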
Breaking changes
Related issues
Enables more flexible key selection strategies for different use cases like round-robin, least-used, or priority-based selection.
Security considerations
The key selection strategy could potentially impact rate limiting and usage patterns across API keys, but doesn't introduce new security concerns.
Checklist