@Pratham-Mishra04 (Collaborator)

Summary

Implement a customizable key selection strategy for Bifrost by introducing a KeySelector function type that lets users define their own logic for selecting API keys.

Changes

  • Added a KeySelector function type in schemas to allow custom key selection strategies
  • Added a keySelector field to the Bifrost struct to store the custom selector
  • Extracted the default weighted random selection logic into a standalone WeightedRandomKeySelector function
  • Added the selected key ID to the request context for tracking and debugging purposes
  • Made the key selection configurable through the BifrostConfig

Type of change

  • Feature
  • Bug fix
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

# Core/Transports
go version
go test ./...

You can test by implementing a custom key selector and passing it to the BifrostConfig:

import (
    "context"

    core "github.com/maximhq/bifrost/core" // import paths assumed; adjust to your module layout
    schemas "github.com/maximhq/bifrost/core/schemas"
)

// The selector must match the KeySelector signature added in core/schemas.
customSelector := func(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
    // Your custom selection logic here
    return keys[0], nil
}

config := schemas.BifrostConfig{
    // Other config...
    KeySelector: customSelector,
}

bifrost, err := core.Init(context.Background(), config)

Breaking changes

  • Yes
  • No

Related issues

Enables more flexible key selection strategies for different use cases like round-robin, least-used, or priority-based selection.
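As an illustration, a round-robin selector takes only a few lines. This is a sketch, not part of this PR; the import path and the package-level counter are assumptions:

import (
    "context"
    "errors"
    "sync/atomic"

    schemas "github.com/maximhq/bifrost/core/schemas" // path assumed
)

// rrCounter rotates across calls; atomic keeps it safe under concurrent requests.
var rrCounter atomic.Uint64

func RoundRobinKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
    if len(keys) == 0 {
        return schemas.Key{}, errors.New("no keys available")
    }
    // Advance the shared counter and wrap around the eligible keys.
    idx := rrCounter.Add(1) % uint64(len(keys))
    return keys[idx], nil
}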

Security considerations

The key selection strategy could potentially impact rate limiting and usage patterns across API keys, but doesn't introduce new security concerns.

Checklist

  • I added/updated tests where appropriate
  • I verified builds succeed (Go and UI)

coderabbitai bot (Contributor) commented Sep 9, 2025

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added configurable key selection with a default weighted-random strategy when multiple keys are available.
    • Responses now include per-request latency in milliseconds across major providers (chat, text, embeddings, speech, transcription).
    • Streaming outputs report consolidated latency in milliseconds.
  • Bug Fixes

    • Standardized latency representation to integer milliseconds across responses and streams for consistent client handling.
  • Tests

    • Updated scenarios to support pointer-based message/content handling and improve robustness when assembling conversation history and extracting content.

Walkthrough

Adds a pluggable key selector (configurable, defaults to weighted random), stores the selected key in request context, migrates provider queues and ChannelMessage handling to pointer channels, and standardizes per-request latency propagation as int64 milliseconds across providers and streaming paths.
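For Go callers, the latency change surfaces as a plain integer on every non-streaming response. A minimal sketch of reading it, assuming resp is a *schemas.BifrostResponse shaped as described below:

// ExtraFields.Latency is int64 milliseconds per this PR's schema change.
if resp != nil {
    fmt.Printf("provider=%s latency=%dms\n", resp.ExtraFields.Provider, resp.ExtraFields.Latency)
}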

Changes

  • Core engine: key selection & queue pointer migration
    Files: core/bifrost.go
    Adds keySelector field and BifrostConfig.KeySelector; initializes default WeightedRandomKeySelector; integrates key selection into request flow and context (BifrostContextKeySelectedKey); converts provider queues and ChannelMessage handling to chan *ChannelMessage; updates workers, enqueue/dequeue, shutdown, and related signatures.
  • Core schemas & context keys
    Files: core/schemas/bifrost.go
    Adds KeySelector type and BifrostConfig.KeySelector; introduces BifrostContextKeySelectedKey; changes BifrostResponseExtraFields.Latency from *float64 to int64 (milliseconds).
  • Request transport utility: latency measurement
    Files: core/providers/utils.go
    makeRequestWithContext now returns (time.Duration, *schemas.BifrostError) and measures/returns latency for success, cancellation, timeouts, and error paths.
  • Providers: unified latency propagation (Anthropic, Azure, Bedrock, Cohere, Gemini, OpenAI)
    Files: core/providers/anthropic.go, core/providers/azure.go, core/providers/bedrock.go, core/providers/cohere.go, core/providers/gemini.go, core/providers/openai.go
    completeRequest / internal helpers updated to return latency (time.Duration, ...); all non-streaming call sites capture latency and set ExtraFields.Latency = latency.Milliseconds(); error paths updated to propagate latency where applicable; Azure defaults APIVersion when nil.
  • Provider-specific schema transform
    Files: core/schemas/providers/bedrock/chat.go
    Removed redundant local latency computation in the Bedrock-to-Bifrost transform; latency is now surfaced from provider flows.
  • Streaming framework: integer ms latency
    Files: framework/streaming/types.go, framework/streaming/chat.go, framework/streaming/audio.go, framework/streaming/transcription.go
    Switches streaming latency from float to int64 milliseconds, updating calculations and the ToBifrostResponse assignment to use the raw int64 latency.
  • Tests: pointer/value adjustments & nil-safety
    Files: tests/core-providers/*, tests/core-providers/scenarios/*
    Many tests updated to use pointer types for ChatMessage.Content and ChatMessage fields (or dereference when storing), plus added nil-guards and string assembly in helpers (tests/.../utils.go) to handle pointer-based content.
  • Transport init (minor)
    Files: transports/bifrost-http/main.go
    Minor whitespace/logging adjustment in init; no behavior change.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Bifrost
  participant KeySelector
  participant ProviderQueue
  participant Provider

  Note over Client,Bifrost: Incoming request (model, provider)
  Client->>Bifrost: handleRequest(ctx, req)
  Bifrost->>Bifrost: determine eligible keys (apply provider constraints)
  alt multiple eligible keys
    Bifrost->>KeySelector: KeySelector(ctx, keys, provider, model)
    KeySelector-->>Bifrost: selectedKey
  else single eligible key
    Bifrost-->>Bifrost: selectedKey = sole key
  end
  Bifrost->>Bifrost: ctx = context.WithValue(BifrostContextKeySelectedKey, selectedKey)
  Bifrost->>ProviderQueue: enqueue *ChannelMessage(ctx, req)
  ProviderQueue-->>Provider: dequeue *ChannelMessage
  Provider->>Provider: makeRequestWithContext(...)  -- measures latency -->
  Provider-->>Bifrost: response, latency
  Bifrost-->>Client: BifrostResponse{ ExtraFields.Latency = latency.Milliseconds() }

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Poem

I nibble keys by weighted chance, then tuck them in the thread,
Milliseconds counted clean and neat, a tidy int, not spread.
Pointers hop along the queues, context bears the seed,
I stream and stitch and marshal bytes — a small, precise good deed. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Description Check (⚠️ Warning): The description fully documents the customizable key selector feature but omits any mention of the extensive request latency tracking changes across providers and does not update "Affected areas" to include Providers/Integrations. It also lacks testing instructions for validating latency metrics, a significant part of the implemented functionality, so the description does not reflect the full scope of the pull request. Resolution: expand the "Changes" section to cover the latency-tracking modifications in provider implementations, update "Affected areas" to include Providers/Integrations, and add steps or examples for testing the latency-tracking functionality.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 65.00%, below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
  • Title Check (✅ Passed): The title succinctly captures both major features introduced by this pull request (custom key selection and request latency tracking) in clear, concise language without extraneous details. It accurately reflects the scope, adheres to the repository's style for feature titles, and gives teammates scanning the history sufficient context.

Pratham-Mishra04 marked this pull request as ready for review September 10, 2025 10:36
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/bifrost.go (1)

1535-1557: Fix potential panic and precision loss in WeightedRandomKeySelector

Current logic:

  • Truncates weights via int(key.Weight*100) causing precision loss and zeroing small weights.
  • Panics if totalWeight <= 0 (rand.Intn(0)).
  • Doesn’t handle negative/zero weights explicitly.

Refactor to float64 accumulation, ignore non-positive weights, and fallback to uniform if all weights are non-positive:

-func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
-	// Use a weighted random selection based on key weights
-	totalWeight := 0
-	for _, key := range keys {
-		totalWeight += int(key.Weight * 100) // Convert float to int for better performance
-	}
-
-	// Use a fast random number generator
-	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	randomValue := randomSource.Intn(totalWeight)
-
-	// Select key based on weight
-	currentWeight := 0
-	for _, key := range keys {
-		currentWeight += int(key.Weight * 100)
-		if randomValue < currentWeight {
-			return key, nil
-		}
-	}
-
-	// Fallback to first key if something goes wrong
-	return keys[0], nil
-}
+func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
+	// Accumulate only positive weights
+	var sum float64
+	weights := make([]float64, len(keys))
+	for i, k := range keys {
+		w := k.Weight
+		if w > 0 {
+			weights[i] = w
+			sum += w
+		}
+	}
+
+	rng := rand.New(rand.NewSource(time.Now().UnixNano()))
+
+	// If all weights are non-positive, choose uniformly
+	if sum <= 0 {
+		return keys[rng.Intn(len(keys))], nil
+	}
+
+	// Weighted draw in float space
+	target := rng.Float64() * sum
+	var acc float64
+	for i, w := range weights {
+		acc += w
+		if target < acc {
+			return keys[i], nil
+		}
+	}
+	// Numerical safety fallback
+	return keys[len(keys)-1], nil
+}
🧹 Nitpick comments (6)
core/schemas/bifrost.go (3)

15-16: Public KeySelector API: clarify ctx mutability intent

The pointer to context is consistent with existing interfaces here, but it’s atypical in Go. If the intent is to allow selectors to enrich ctx, keep it; otherwise, prefer context.Context (non-pointer). Add a brief doc comment stating the expectation.


24-28: Config hook looks good; add short doc/example

KeySelector on BifrostConfig is clear. Consider a brief example in README/docs to show custom selector usage and defaults.
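Such a doc example might look like the following sketch (the Account wiring is a placeholder):

// Omitting KeySelector: Bifrost falls back to WeightedRandomKeySelector.
config := schemas.BifrostConfig{
    Account: account, // your schemas.Account implementation
}

// Supplying one overrides selection for every request:
config.KeySelector = myCustomSelector // any func matching schemas.KeySelector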


105-107: Selected key context key is useful; verify visibility in non-stream flows

Setting BifrostContextKeySelectedKey enables downstream access. Note: in non-stream requests, PostHooks run in tryRequest with the original ctx, not the worker-updated one—so plugins may not see this value. See companion comment in core/bifrost.go for options.
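For reference, a downstream read would look roughly like this sketch (it assumes the stored value is the key ID string, per this PR):

if v := ctx.Value(schemas.BifrostContextKeySelectedKey); v != nil {
    if keyID, ok := v.(string); ok {
        log.Printf("selected key id: %s", keyID) // the key's ID only, never the secret
    }
}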

core/bifrost.go (3)

93-95: Sane defaulting

Falling back to WeightedRandomKeySelector when not supplied is good. Optional: log at Info once during init when defaulting, to aid config debugging.


1184-1185: Selected key ID is set on ctx inside worker; PostHooks for non-stream won’t see it

Non-stream PostHooks in tryRequest use the pre-worker ctx, so plugins can’t read bifrost-key-selected. Options (pick one):

  • Move key selection earlier (prior to enqueuing) and set the ctx there; or
  • Return the selected key ID alongside the response and attach it to ctx before PostHooks; or
  • Add SelectedKeyID *string to BifrostResponseExtraFields and populate it here, so PostHooks can read it from resp.

I can draft a small patch for option 3 if you prefer.
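For concreteness, option 3 would amount to something like this sketch (field name hypothetical):

type BifrostResponseExtraFields struct {
    // ...existing fields...
    SelectedKeyID *string `json:"selected_key_id,omitempty"` // set by the worker after key selection
}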


1526-1533: Delegation to pluggable selector: good; add guardrails

If a custom selector returns an empty/zero-value key or error, upstream handling is fine. Consider documenting that selectors must return a key from the provided keys slice and may not mutate it.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 78341ef and 07d817e.

📒 Files selected for processing (2)
  • core/bifrost.go (5 hunks)
  • core/schemas/bifrost.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
core/schemas/bifrost.go (2)
core/schemas/account.go (2)
  • Key (8-16)
  • Account (47-63)
core/schemas/plugin.go (1)
  • Plugin (43-63)
core/bifrost.go (2)
core/schemas/account.go (2)
  • Account (47-63)
  • Key (8-16)
core/schemas/bifrost.go (3)
  • KeySelector (15-15)
  • BifrostContextKeySelectedKey (106-106)
  • ModelProvider (42-42)
🔇 Additional comments (1)
core/bifrost.go (1)

35-49: Struct wiring LGTM

Adding keySelector to Bifrost is straightforward and keeps responsibilities local.

Pratham-Mishra04 changed the title from "feat: add custom key selector functionality with context tracking" to "feat: add custom key selector and request latency tracking" on Sep 16, 2025
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
core/schemas/bifrost.go (1)

729-737: Potential breaking change: Latency type/semantics changed

Latency changed from pointer float64 to int64 and from pointer semantics (nil-omit) to value (0-omit). This alters the public Go API and may break downstream code that imports schemas directly. The PR declares “No breaking changes,” so consider keeping the original field and adding a new ms field instead, or reverting.

Option A (non-breaking): keep old field and add a new one

 type BifrostResponseExtraFields struct {
   Provider    ModelProvider      `json:"provider"`
   Params      ModelParameters    `json:"model_params"`
-  Latency     int64              `json:"latency,omitempty"` // in milliseconds
+  Latency     *float64           `json:"latency,omitempty"`      // preserved for BC (ms)
+  LatencyMs   int64              `json:"latency_ms,omitempty"`   // preferred going forward
   ChatHistory *[]BifrostMessage  `json:"chat_history,omitempty"`
   BilledUsage *BilledLLMUsage    `json:"billed_usage,omitempty"`
   ChunkIndex  int                `json:"chunk_index"`
   RawResponse interface{}        `json:"raw_response,omitempty"`
   CacheDebug  *BifrostCacheDebug `json:"cache_debug,omitempty"`
 }

Option B (revert to preserve BC):

-  Latency     int64              `json:"latency,omitempty"` // in milliseconds
+  Latency     *float64           `json:"latency,omitempty"` // in milliseconds

If you pick A, please populate both fields for one release cycle; if B, update providers to assign a float64 pointer.

core/providers/utils.go (2)

138-178: Critical: data race/use‑after‑free on ctx cancellation in makeRequestWithContext

Early return on ctx.Done() while client.Do is still running in a goroutine can race with callers’ deferred fasthttp.ReleaseRequest/ReleaseResponse, leading to use‑after‑free and memory corruption. Call sites acquire resp/req and defer release before invoking this function; returning early frees the objects while the goroutine continues writing into them.

Fix by avoiding the background goroutine and using DoDeadline (respecting ctx.Deadline) so the call remains synchronous with well-defined object lifetime.

Apply this diff:

 func makeRequestWithContext(ctx context.Context, client *fasthttp.Client, req *fasthttp.Request, resp *fasthttp.Response) (time.Duration, *schemas.BifrostError) {
-	startTime := time.Now()
-	errChan := make(chan error, 1)
-
-	go func() {
-		// client.Do is a blocking call.
-		// It will send an error (or nil for success) to errChan when it completes.
-		errChan <- client.Do(req, resp)
-	}()
-
-	select {
-	case <-ctx.Done():
-		// Context was cancelled (e.g., deadline exceeded or manual cancellation).
-		// Calculate latency even for cancelled requests
-		latency := time.Since(startTime)
-		return latency, &schemas.BifrostError{
-			IsBifrostError: true,
-			Error: schemas.ErrorField{
-				Type:    Ptr(schemas.RequestCancelled),
-				Message: fmt.Sprintf("Request cancelled or timed out by context: %v", ctx.Err()),
-				Error:   ctx.Err(),
-			},
-		}
-	case err := <-errChan:
-		// The fasthttp.Do call completed.
-		// Calculate latency for both successful and failed requests
-		latency := time.Since(startTime)
-		if err != nil {
-			// The HTTP request itself failed (e.g., connection error, fasthttp timeout).
-			return latency, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: schemas.ErrorField{
-					Message: schemas.ErrProviderRequest,
-					Error:   err,
-				},
-			}
-		}
-		// HTTP request was successful from fasthttp's perspective (err is nil).
-		// The caller should check resp.StatusCode() for HTTP-level errors (4xx, 5xx).
-		return latency, nil
-	}
+	startTime := time.Now()
+	var err error
+	if deadline, ok := ctx.Deadline(); ok {
+		err = client.DoDeadline(req, resp, deadline)
+	} else {
+		err = client.Do(req, resp)
+	}
+	latency := time.Since(startTime)
+	if err != nil {
+		return latency, &schemas.BifrostError{
+			IsBifrostError: false,
+			Error: schemas.ErrorField{
+				Message: schemas.ErrProviderRequest,
+				Error:   err,
+			},
+		}
+	}
+	return latency, nil
 }

Optional alternative (if early-return on cancel is required): allocate/copy local fasthttp.Request/Response inside this function and never touch caller-owned req/resp from the goroutine, then switch to DoDeadline for safety. That change is larger; the above is the simplest safe fix.


138-178: Critical: don't release fasthttp Request/Response while client.Do may still be using them

makeRequestWithContext spawns client.Do(req, resp) in a goroutine and may return on ctx.Done while that goroutine is still running. Call sites AcquireRequest()/AcquireResponse() then defer ReleaseRequest()/ReleaseResponse() before calling it — if ctx cancels, the caller's deferred releases can free pooled objects still in use -> data race / use-after-free.

  • Fix options (choose one):
    • Make makeRequestWithContext wait for the client.Do goroutine to finish before returning on ctx.Done (read errChan even after ctx.Done) so call-site defers are safe.
    • OR transfer ownership: have makeRequestWithContext own and Release the req/resp after client.Do completes and remove call-site defers (update all providers).

Examples of the affected pattern: core/providers/mistral.go:112–115, core/providers/openai.go:128–131, core/providers/azure.go:175–178 (pattern appears across many providers) — do not leave Acquire/Release paired around an async call that can outlive the caller.

core/providers/azure.go (1)

164-172: Update default Azure OpenAI API version
File: core/providers/azure.go (lines 164–172) — default "2024-02-01" is outdated; change to "2024-10-21" or switch to the /openai/v1 GA endpoints (no api-version) to ensure compatibility.

core/bifrost.go (4)

127-138: Fix: for range <int> requires Go 1.22 or later and doesn’t compile on earlier toolchains.

Use a counted loop for pool prewarm and worker spin‑up.

Apply these diffs:

- for range config.InitialPoolSize {
+ for i := 0; i < config.InitialPoolSize; i++ {
- for range providerConfig.ConcurrencyAndBufferSize.Concurrency {
+ for i := 0; i < providerConfig.ConcurrencyAndBufferSize.Concurrency; i++ {
- for range providerConfig.ConcurrencyAndBufferSize.Concurrency {
+ for i := 0; i < providerConfig.ConcurrencyAndBufferSize.Concurrency; i++ {

Also applies to: 422-427, 733-738


1227-1231: Bug: non‑streaming path never retries.

Unconditional break prevents retries for retriable server/network errors.

Apply this diff to align with the streaming path:

-      result, bifrostError = handleProviderRequest(provider, req, key, req.Type)
-      if bifrostError != nil {
-        break // Don't retry client errors
-      }
+      result, bifrostError = handleProviderRequest(provider, req, key, req.Type)
+      if bifrostError != nil && !bifrostError.IsBifrostError {
+        break // Client error: don't retry
+      }

394-399: Risk: send on closed channel during concurrency update.

Closing oldQueue before swapping the map entry can panic senders that still hold oldQueue. Swap first, then close.

Apply this minimal, safer ordering:

- // Step 3: Close the old queue to signal workers to stop
- close(oldQueue)
-
- // Step 4: Atomically replace the queue
- bifrost.requestQueues.Store(providerKey, newQueue)
+ // Step 3: Atomically replace the queue so new requests use the new queue
+ bifrost.requestQueues.Store(providerKey, newQueue)
+
+ // Step 4: Close the old queue to signal workers to stop
+ close(oldQueue)

Follow‑up: consider counting/migrating late senders or gating sends via an indirection (e.g., a thin queue wrapper) to fully eliminate this class of race.


1537-1559: Harden weighted selection: zero/negative/very small weights can panic.

Intn(totalWeight) panics if totalWeight <= 0, and int(weight*100) can round small weights to 0.

Apply this robust, float‑based selector:

-func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
-  // Use a weighted random selection based on key weights
-  totalWeight := 0
-  for _, key := range keys {
-    totalWeight += int(key.Weight * 100) // Convert float to int for better performance
-  }
-  // Use a fast random number generator
-  randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-  randomValue := randomSource.Intn(totalWeight)
-  // Select key based on weight
-  currentWeight := 0
-  for _, key := range keys {
-    currentWeight += int(key.Weight * 100)
-    if randomValue < currentWeight {
-      return key, nil
-    }
-  }
-  // Fallback to first key if something goes wrong
-  return keys[0], nil
-}
+func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
+  // Treat non‑positive weights as 1 to avoid starvation and panics.
+  total := 0.0
+  for _, k := range keys {
+    w := k.Weight
+    if w <= 0 {
+      w = 1
+    }
+    total += w
+  }
+  rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
+  if total <= 0 {
+    return keys[rnd.Intn(len(keys))], nil // uniform fallback
+  }
+  r := rnd.Float64() * total
+  acc := 0.0
+  for _, k := range keys {
+    w := k.Weight
+    if w <= 0 {
+      w = 1
+    }
+    acc += w
+    if r < acc {
+      return k, nil
+    }
+  }
+  return keys[len(keys)-1], nil
+}
🧹 Nitpick comments (12)
core/schemas/bifrost.go (1)

15-16: Public KeySelector API: clarify contract (empty keys, determinism) and doc comment

Please add a brief Godoc on KeySelector covering:

  • Behavior when keys is empty (must return error).
  • Whether selection should be deterministic given same inputs (recommended for reproducibility).
    Also confirm we consistently pass a non-nil ctx pointer at call sites.
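A doc comment along those lines might read (sketch):

// KeySelector picks the API key to use when multiple keys support the
// requested model. Implementations must return an error if keys is empty,
// should return a key from the provided slice without mutating it, and may
// read or enrich ctx. Deterministic selection for identical inputs is
// recommended for reproducibility.
type KeySelector func(ctx *context.Context, keys []Key, providerKey ModelProvider, model string) (Key, error)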
core/providers/ollama.go (1)

159-162: Align assignment with BC if Latency remains a pointer

If you keep Latency as a pointer float64 (BC), update assignment:

- response.ExtraFields.Latency = latency.Milliseconds()
+ latMs := float64(latency.Milliseconds())
+ response.ExtraFields.Latency = &latMs
transports/bifrost-http/main.go (1)

416-416: Reduce log verbosity for plugin init

These are Info-level and will print on every start; consider Debug to avoid noisy logs in prod:

- logger.Info("initializing plugin: %s", plugin.Name)
+ logger.Debug("initializing plugin: %s", plugin.Name)

- logger.Info("successfully initialized maxim plugin")
+ logger.Debug("successfully initialized maxim plugin")

Also applies to: 432-433

core/providers/groq.go (2)

120-131: Guard Authorization header to avoid sending empty bearer

If key.Value can be empty, skip the header to avoid “Bearer ”:

- req.Header.Set("Authorization", "Bearer "+key.Value)
+ if key.Value != "" {
+   req.Header.Set("Authorization", "Bearer "+key.Value)
+ }

156-159: Align latency assignment with pointer-based field (if reverting for BC)

- response.ExtraFields.Latency = latency.Milliseconds()
+ latMs := float64(latency.Milliseconds())
+ response.ExtraFields.Latency = &latMs
core/providers/sgl.go (1)

165-167: Align latency assignment with pointer-based field (if reverting for BC)

- response.ExtraFields.Latency = latency.Milliseconds()
+ latMs := float64(latency.Milliseconds())
+ response.ExtraFields.Latency = &latMs
core/providers/utils.go (1)

133-139: Update comment to reflect new synchronous behavior

After removing the goroutine, this function no longer “stops waiting and returns early on ctx cancellation.” It should instead document that it respects ctx.Deadline via DoDeadline and does not preemptively return on cancel without a deadline.

core/providers/mistral.go (1)

138-141: Avoid shadowing bifrostErr in error paths.

The short variable declaration bifrostErr := ... shadows the outer bifrostErr, which is easy to misread.

-        bifrostErr := handleProviderAPIError(resp, &errorResp)
+        apiErr := handleProviderAPIError(resp, &errorResp)
-        bifrostErr.Error.Message = fmt.Sprintf("Mistral error: %v", errorResp)
-        return nil, bifrostErr
+        apiErr.Error.Message = fmt.Sprintf("Mistral error: %v", errorResp)
+        return nil, apiErr

Apply the same rename in the Embedding block.

Also applies to: 235-238

core/providers/bedrock.go (1)

1028-1034: Use url.PathEscape instead of url.QueryEscape for path segments.

QueryEscape is intended for query strings and encodes spaces as '+'. For path components (ARN/inference profile), PathEscape is a better fit and matches usage elsewhere (e.g., embeddings/stream).

-encodedModelIdentifier := url.QueryEscape(fmt.Sprintf("%s/%s", *key.BedrockKeyConfig.ARN, inferenceProfileId))
+encodedModelIdentifier := url.PathEscape(fmt.Sprintf("%s/%s", *key.BedrockKeyConfig.ARN, inferenceProfileId))
core/bifrost.go (3)

1450-1471: Avoid retaining request context between pooled messages.

releaseChannelMessage doesn’t clear msg.Context or the request payload, risking accidental retention.

Apply:

 func (bifrost *Bifrost) releaseChannelMessage(msg *ChannelMessage) {
   // Put channels back in pools
   bifrost.responseChannelPool.Put(msg.Response)
   bifrost.errorChannelPool.Put(msg.Err)
@@
   if msg.ResponseStream != nil {
@@
   }
 
   // Clear references and return to pool
+  msg.Context = nil
+  msg.BifrostRequest = schemas.BifrostRequest{}
   msg.Response = nil
   msg.ResponseStream = nil
   msg.Err = nil
   bifrost.channelMessagePool.Put(msg)
 }

1484-1523: Key selection + context propagation LGTM, with one ask.

Selector integration and storing BifrostContextKeySelectedKey in context look correct. Please document the context key contract in schemas for plugin/consumer use.

Also applies to: 1528-1535, 1186-1187


316-316: Logger interface supports varargs — optional cleanup recommended. Logger methods are defined as (msg string, args ...any), so wrapping arguments in fmt.Sprintf(...) is redundant (harmless); prefer passing the format string and args to the logger directly (e.g. logger.Info("format %v", args...)) for consistency (e.g. core/bifrost.go:316, 332-333, 391-392, 417-421, 442).

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 07d817e and 34c87f1.

📒 Files selected for processing (17)
  • core/bifrost.go (20 hunks)
  • core/providers/anthropic.go (7 hunks)
  • core/providers/azure.go (10 hunks)
  • core/providers/bedrock.go (16 hunks)
  • core/providers/cerebras.go (4 hunks)
  • core/providers/cohere.go (4 hunks)
  • core/providers/gemini.go (4 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral.go (4 hunks)
  • core/providers/ollama.go (2 hunks)
  • core/providers/openai.go (8 hunks)
  • core/providers/openrouter.go (4 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (2 hunks)
  • core/providers/utils.go (5 hunks)
  • core/schemas/bifrost.go (5 hunks)
  • transports/bifrost-http/main.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (15)
core/providers/ollama.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/parasail.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/utils.go (1)
core/schemas/bifrost.go (1)
  • BifrostError (773-782)
core/providers/gemini.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/mistral.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/openrouter.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/cerebras.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/azure.go (3)
core/schemas/account.go (2)
  • Key (8-16)
  • AzureKeyConfig (20-24)
core/schemas/bifrost.go (2)
  • BifrostError (773-782)
  • Azure (46-46)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (24-24)
core/providers/cohere.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/groq.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/sgl.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/schemas/bifrost.go (2)
core/schemas/account.go (2)
  • Key (8-16)
  • Account (47-63)
core/schemas/plugin.go (1)
  • Plugin (43-63)
core/bifrost.go (3)
core/schemas/account.go (2)
  • Account (47-63)
  • Key (8-16)
core/schemas/bifrost.go (3)
  • KeySelector (15-15)
  • ModelProvider (42-42)
  • BifrostContextKeySelectedKey (108-108)
core/schemas/provider.go (3)
  • ConcurrencyAndBufferSize (51-54)
  • Provider (202-221)
  • ProviderConfig (143-151)
core/providers/bedrock.go (2)
core/schemas/account.go (1)
  • BedrockKeyConfig (36-43)
core/schemas/bifrost.go (2)
  • BifrostError (773-782)
  • ErrorField (790-797)
core/providers/anthropic.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (773-782)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (24-24)
🔇 Additional comments (16)
core/schemas/bifrost.go (2)

24-28: Config wiring looks good

KeySelector on BifrostConfig is clear and aligns with the defaulting to WeightedRandomKeySelector. LGTM.


107-113: Selected-key context key: verify propagation across all paths

Storing only the Key.ID (not the secret) is good. Please verify this is set for both streaming and non‑streaming flows and is observable in hooks/telemetry as intended.

core/providers/parasail.go (1)

128-128: Latency capture and propagation — LGTM

Capturing the duration from makeRequestWithContext and surfacing it via ExtraFields.Latency is correct and consistent with the new schema.

Also applies to: 158-158

core/providers/cohere.go (1)

246-246: Latency propagation — LGTM

Both ChatCompletion and Embedding correctly capture latency and expose it as milliseconds in ExtraFields.

Also applies to: 336-336, 666-666, 717-717

core/providers/cerebras.go (1)

149-149: Latency propagation — LGTM

Both text and chat paths correctly capture latency and expose it as milliseconds in ExtraFields.

Also applies to: 212-213, 289-289

core/providers/openrouter.go (1)

145-145: Latency propagation — LGTM

Text and chat paths correctly capture and surface latency in ExtraFields.

Also applies to: 196-196, 244-244, 269-269

core/providers/openai.go (1)

145-145: Latency propagation — LGTM

ChatCompletion, Embedding, Speech, and Transcription correctly capture latency and set ExtraFields.Latency (ms). Consistent with schema changes.

Also applies to: 178-179, 308-308, 616-617, 830-830, 865-865

core/providers/mistral.go (1)

128-132: Latency propagation looks correct.

makeRequestWithContext now returns latency and it’s stored in ExtraFields.Latency (ms). Matches the pattern used across providers.

Also applies to: 157-158

core/providers/gemini.go (2)

171-175: Latency capture and exposure: LGTM.

latency from makeRequestWithContext is written to ExtraFields.Latency (ms). Implementation is consistent with other providers.

Also applies to: 209-211


1090-1094: Good: completeRequest returns latency and sets it on the base response.

Creates a single place to standardize latency for Gemini non-streaming calls. Looks solid.

Also applies to: 1112-1115

core/providers/bedrock.go (2)

262-263: Latency measurement integrated correctly.

completeRequest now returns (body, latency). Callers attach latency to responses. No correctness issues spotted.

Also applies to: 309-313, 358-359


839-851: Latency attached to responses: LGTM.

Non‑streaming Text/Chat and Titan/Cohere embedding paths consistently set ExtraFields.Latency (ms).

Also applies to: 1123-1125, 1296-1299, 1364-1367

core/providers/anthropic.go (1)

386-387: Latency propagation in Text/Chat: LGTM.

ExtraFields.Latency is set from the measured duration in both paths.

Also applies to: 441-442

core/providers/azure.go (1)

233-234: Latency wiring across Text/Chat/Embedding: LGTM.

latency is captured and ExtraFields.Latency is set consistently.

Also applies to: 322-324, 377-379

core/bifrost.go (2)

721-739: Provider queue initialization LGTM.

Channel typing to chan *ChannelMessage and worker startup wiring are sound (post counted‑loop fix).


1001-1019: Enqueue path LGTM.

Non‑blocking fast path, cancellation handling, and drop‑on‑full behavior are well balanced.

Also applies to: 1119-1137

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/bifrost.go (1)

1938-1960: Fix potential panic and improve random number generation.

The implementation has several issues:

  1. Critical: If all keys have weights that round to 0 (e.g., all weights < 0.01), totalWeight will be 0, causing rand.Intn(0) to panic on line 1947.
  2. Major: Creating a new rand.New() on each call (line 1946) is inefficient and can produce predictable patterns when multiple requests arrive within the same nanosecond.
  3. Minor: Converting key.Weight * 100 to int loses precision for small weights (e.g., 0.001 becomes 0).

Apply this diff to address these issues:

+var (
+	// Package-level random source for key selection
+	keySelectionRand = rand.New(rand.NewSource(time.Now().UnixNano()))
+	keySelectionMu   sync.Mutex
+)
+
 func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
+	// Defensive: handle empty keys
+	if len(keys) == 0 {
+		return schemas.Key{}, fmt.Errorf("no keys provided for selection")
+	}
+
 	// Use a weighted random selection based on key weights
-	totalWeight := 0
+	totalWeight := 0.0
 	for _, key := range keys {
-		totalWeight += int(key.Weight * 100) // Convert float to int for better performance
+		if key.Weight < 0 {
+			return schemas.Key{}, fmt.Errorf("negative weight not allowed: %f", key.Weight)
+		}
+		totalWeight += key.Weight
 	}
 
-	// Use a fast random number generator
-	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	randomValue := randomSource.Intn(totalWeight)
+	// If all weights are 0 or very small, use equal probability
+	if totalWeight < 0.0001 {
+		keySelectionMu.Lock()
+		selectedIndex := keySelectionRand.Intn(len(keys))
+		keySelectionMu.Unlock()
+		return keys[selectedIndex], nil
+	}
+
+	// Use package-level random source with mutex for thread safety
+	keySelectionMu.Lock()
+	randomValue := keySelectionRand.Float64() * totalWeight
+	keySelectionMu.Unlock()
 
 	// Select key based on weight
-	currentWeight := 0
+	currentWeight := 0.0
 	for _, key := range keys {
-		currentWeight += int(key.Weight * 100)
+		currentWeight += key.Weight
 		if randomValue < currentWeight {
 			return key, nil
 		}
 	}
 
 	// Fallback to first key if something goes wrong
 	return keys[0], nil
 }
♻️ Duplicate comments (2)
core/providers/anthropic.go (1)

170-174: Copy the response body before releasing fasthttp.Response.

Line 173 returns the slice from resp.Body(), but the deferred fasthttp.ReleaseResponse(resp) frees that buffer before the caller can use it, leading to use-after-free. Copy into a new slice before returning.

-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
core/providers/azure.go (1)

139-143: Avoid returning resp.Body() after the response is released.

Line 142 hands back the fasthttp buffer directly; once the deferred fasthttp.ReleaseResponse(resp) runs, callers see corrupted data. Copy the body before returning.

-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
🧹 Nitpick comments (1)
framework/streaming/audio.go (1)

70-70: Prefer Duration.Milliseconds() for readability

The built-in helper conveys intent and keeps precision handling in one place. Please mirror the same tweak in the transcription/chat accumulators for consistency.

Apply this diff:

-		data.Latency = accumulator.FinalTimestamp.Sub(accumulator.StartTimestamp).Nanoseconds() / 1e6
+		data.Latency = accumulator.FinalTimestamp.Sub(accumulator.StartTimestamp).Milliseconds()
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 34c87f1 and d80798a.

📒 Files selected for processing (14)
  • core/bifrost.go (20 hunks)
  • core/providers/anthropic.go (9 hunks)
  • core/providers/azure.go (10 hunks)
  • core/providers/bedrock.go (18 hunks)
  • core/providers/cohere.go (11 hunks)
  • core/providers/gemini.go (12 hunks)
  • core/providers/openai.go (12 hunks)
  • core/providers/utils.go (6 hunks)
  • core/schemas/bifrost.go (5 hunks)
  • core/schemas/providers/bedrock/chat.go (0 hunks)
  • framework/streaming/audio.go (1 hunks)
  • framework/streaming/chat.go (1 hunks)
  • framework/streaming/transcription.go (1 hunks)
  • framework/streaming/types.go (2 hunks)
💤 Files with no reviewable changes (1)
  • core/schemas/providers/bedrock/chat.go
🧰 Additional context used
🧬 Code graph analysis (11)
core/providers/anthropic.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (450-459)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (26-26)
core/schemas/bifrost.go (1)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/openai.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
framework/streaming/types.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
framework/streaming/audio.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
framework/streaming/transcription.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/azure.go (2)
core/schemas/account.go (2)
  • Key (8-17)
  • AzureKeyConfig (27-31)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (26-26)
core/bifrost.go (3)
core/schemas/bifrost.go (3)
  • KeySelector (15-15)
  • ModelProvider (46-46)
  • BifrostContextKeySelectedKey (118-118)
core/schemas/provider.go (1)
  • Provider (194-219)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/utils.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (450-459)
core/schemas/provider.go (1)
  • ErrProviderRequestTimedOut (22-22)
framework/streaming/chat.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/gemini.go (2)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/schemas/bifrost.go (1)
  • BifrostError (450-459)
🔇 Additional comments (5)
core/bifrost.go (5)

49-49: LGTM: KeySelector field integration.

The keySelector field is properly initialized with a sensible default (WeightedRandomKeySelector) when not provided in the config. This enables custom key selection strategies while maintaining backward compatibility.

Also applies to: 90-90, 95-97


1876-1936: LGTM: Pluggable key selector integration.

The key selection logic correctly delegates to the configurable keySelector when multiple keys support the model, enabling custom selection strategies (round-robin, least-used, etc.). Error handling is appropriate.


1539-1539: LGTM: Selected key tracking via context.

Storing the selected key ID in the request context enables downstream tracking and debugging of which API key was used for each request.


635-635: LGTM: Pointer-based queue refactor.

The refactor from chan ChannelMessage to chan *ChannelMessage is consistently applied across all queue operations. This reduces memory allocations and improves performance by avoiding value copies when sending messages through channels.

Also applies to: 640-640, 656-656, 1020-1020, 1047-1047, 1053-1053, 1066-1066, 1082-1082, 1345-1345, 1357-1357, 1461-1461, 1473-1473, 1504-1504, 1969-1969


1370-1370: LGTM: Context pointer passed to PostHooks.

Correctly passing &msg.Context to RunPostHooks as the function expects a context pointer, allowing plugins to potentially modify the context during post-processing.

Also applies to: 1379-1379

Pratham-Mishra04 force-pushed the 08-30-feat_core_extended branch from d80798a to e13e8a7 on October 7, 2025 14:56
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/bifrost.go (1)

1939-1960: Guard against zero/empty weights in WeightedRandomKeySelector.

rand.Intn(totalWeight) panics when totalWeight == 0. This happens if all keys have weight 0 (default) or weights round down after int(key.Weight * 100). Add a fallback (e.g., treat equal weights when totalWeight <= 0) before calling Intn.
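A minimal guard along those lines (sketch, keeping the existing integer-weight accumulation):

totalWeight := 0
for _, key := range keys {
    totalWeight += int(key.Weight * 100)
}
randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
if totalWeight <= 0 {
    // All weights zero (or rounded down to zero): fall back to a uniform
    // pick instead of letting rand.Intn(0) panic.
    return keys[randomSource.Intn(len(keys))], nil
}
randomValue := randomSource.Intn(totalWeight)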

♻️ Duplicate comments (1)
core/providers/azure.go (1)

139-143: Copy the fasthttp response body before releasing it.

resp.Body() aliases fasthttp's internal buffer, but defer fasthttp.ReleaseResponse(resp) frees that storage when completeRequest returns. Callers then read from memory that fasthttp may reuse, causing a use-after-free. Copy the bytes before returning.

-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between d80798a and e13e8a7.

📒 Files selected for processing (15)
  • core/bifrost.go (20 hunks)
  • core/providers/anthropic.go (9 hunks)
  • core/providers/azure.go (10 hunks)
  • core/providers/bedrock.go (18 hunks)
  • core/providers/cohere.go (11 hunks)
  • core/providers/gemini.go (12 hunks)
  • core/providers/openai.go (12 hunks)
  • core/providers/utils.go (6 hunks)
  • core/schemas/bifrost.go (5 hunks)
  • core/schemas/providers/bedrock/chat.go (0 hunks)
  • framework/streaming/audio.go (1 hunks)
  • framework/streaming/chat.go (1 hunks)
  • framework/streaming/transcription.go (1 hunks)
  • framework/streaming/types.go (2 hunks)
  • transports/bifrost-http/main.go (1 hunks)
💤 Files with no reviewable changes (1)
  • core/schemas/providers/bedrock/chat.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/schemas/bifrost.go
  • framework/streaming/chat.go
  • core/providers/openai.go
  • core/providers/cohere.go
🧰 Additional context used
🧬 Code graph analysis (8)
framework/streaming/audio.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/utils.go (1)
core/schemas/bifrost.go (1)
  • BifrostError (450-459)
core/providers/anthropic.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (450-459)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (26-26)
framework/streaming/types.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/bedrock.go (2)
core/schemas/account.go (1)
  • Key (8-17)
core/schemas/bifrost.go (2)
  • BifrostError (450-459)
  • ErrorField (467-474)
core/providers/azure.go (1)
core/schemas/account.go (2)
  • Key (8-17)
  • AzureKeyConfig (27-31)
framework/streaming/transcription.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/bifrost.go (2)
core/schemas/bifrost.go (3)
  • KeySelector (15-15)
  • ModelProvider (46-46)
  • BifrostContextKeySelectedKey (118-118)
core/schemas/account.go (1)
  • Key (8-17)

Pratham-Mishra04 force-pushed the 08-30-feat_core_extended branch from e13e8a7 to c511c26 on October 8, 2025 14:21
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
core/providers/azure.go (1)

139-143: Copy the response body before releasing fasthttp buffers.

resp.Body() aliases fasthttp’s internal buffer. Once the deferred fasthttp.ReleaseResponse(resp) runs, that backing storage is returned to the pool, so callers read freed memory. Copy the bytes before returning. Same bug as flagged previously.

-	// Read the response body
-	body := resp.Body()
-
-	return body, latency, nil
+	// Read the response body
+	body := append([]byte(nil), resp.Body()...)
+
+	return body, latency, nil
core/providers/gemini.go (1)

835-846: Clone the fasthttp body before releasing the response.

resp.Body() points to fasthttp's pooled buffer. Once the deferred fasthttp.ReleaseResponse(resp) at line 810 executes, the returned slice becomes invalid. Copy the bytes prior to using them in unmarshal calls and returning rawResponse.

Apply this diff to clone the body:

-	responseBody := resp.Body()
+	responseBody := append([]byte(nil), resp.Body()...)
🧹 Nitpick comments (1)
core/bifrost.go (1)

1944-1966: Consider reusing a rand source or using math/rand/v2 for better performance.

The WeightedRandomKeySelector creates a new rand.New source on every invocation. While this works correctly and avoids concurrency issues, it incurs unnecessary allocation overhead. Consider using a shared concurrent-safe random source (e.g., a sync.Pool of *rand.Rand instances) or migrating to math/rand/v2 which provides concurrent-safe global functions.

Example using math/rand/v2 (if available):

import (
	cryptorand "crypto/rand"
	"encoding/binary"
	"math/rand/v2"
)

func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
	totalWeight := 0
	for _, key := range keys {
		totalWeight += int(key.Weight * 100)
	}

	randomValue := rand.IntN(totalWeight)

	currentWeight := 0
	for _, key := range keys {
		currentWeight += int(key.Weight * 100)
		if randomValue < currentWeight {
			return key, nil
		}
	}

	return keys[0], nil
}

Alternatively, use a sync.Pool for rand sources:

var randPool = sync.Pool{
	New: func() interface{} {
		var seed int64
		binary.Read(cryptorand.Reader, binary.LittleEndian, &seed)
		return rand.New(rand.NewSource(seed))
	},
}

func WeightedRandomKeySelector(ctx *context.Context, keys []schemas.Key, providerKey schemas.ModelProvider, model string) (schemas.Key, error) {
	totalWeight := 0
	for _, key := range keys {
		totalWeight += int(key.Weight * 100)
	}

	r := randPool.Get().(*rand.Rand)
	randomValue := r.Intn(totalWeight)
	randPool.Put(r)

	currentWeight := 0
	for _, key := range keys {
		currentWeight += int(key.Weight * 100)
		if randomValue < currentWeight {
			return key, nil
		}
	}

	return keys[0], nil
}
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between e13e8a7 and c511c26.

📒 Files selected for processing (15)
  • core/bifrost.go (20 hunks)
  • core/providers/anthropic.go (9 hunks)
  • core/providers/azure.go (10 hunks)
  • core/providers/bedrock.go (18 hunks)
  • core/providers/cohere.go (11 hunks)
  • core/providers/gemini.go (12 hunks)
  • core/providers/openai.go (12 hunks)
  • core/providers/utils.go (6 hunks)
  • core/schemas/bifrost.go (5 hunks)
  • core/schemas/providers/bedrock/chat.go (0 hunks)
  • framework/streaming/audio.go (1 hunks)
  • framework/streaming/chat.go (1 hunks)
  • framework/streaming/transcription.go (1 hunks)
  • framework/streaming/types.go (2 hunks)
  • transports/bifrost-http/main.go (1 hunks)
💤 Files with no reviewable changes (1)
  • core/schemas/providers/bedrock/chat.go
✅ Files skipped from review due to trivial changes (1)
  • transports/bifrost-http/main.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/openai.go
  • framework/streaming/transcription.go
  • core/providers/anthropic.go
  • core/providers/cohere.go
🧰 Additional context used
🧬 Code graph analysis (9)
framework/streaming/audio.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/azure.go (3)
core/schemas/account.go (2)
  • Key (8-17)
  • AzureKeyConfig (27-31)
core/schemas/bifrost.go (2)
  • BifrostError (536-545)
  • Azure (50-50)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (26-26)
core/providers/bedrock.go (3)
core/schemas/account.go (1)
  • Key (8-17)
core/schemas/bifrost.go (2)
  • BifrostError (536-545)
  • ErrorField (553-560)
core/schemas/provider.go (1)
  • ErrProviderRequestTimedOut (22-22)
core/schemas/bifrost.go (2)
core/schemas/account.go (1)
  • Key (8-17)
ui/lib/types/config.ts (1)
  • ModelProvider (165-167)
core/providers/utils.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (536-545)
core/schemas/provider.go (1)
  • ErrProviderRequestTimedOut (22-22)
core/bifrost.go (2)
core/schemas/bifrost.go (3)
  • KeySelector (15-15)
  • ModelProvider (46-46)
  • BifrostContextKeySelectedKey (118-118)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/gemini.go (2)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/schemas/bifrost.go (1)
  • BifrostError (536-545)
framework/streaming/types.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
framework/streaming/chat.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
🔇 Additional comments (5)
core/providers/gemini.go (2)

122-122: LGTM!

The latency tracking integration is correct. The makeRequestWithContext call now returns latency, which is properly captured and used downstream.


160-160: LGTM!

Latency is correctly converted to milliseconds and stored in ExtraFields.Latency for non-streaming chat completion responses.

core/bifrost.go (3)

90-97: LGTM!

The keySelector field is properly initialized from the config with a sensible default fallback to WeightedRandomKeySelector when not provided. This provides a good balance between flexibility and usability.


641-646: LGTM!

The refactor from chan ChannelMessage to chan *ChannelMessage is applied consistently across queue types, function signatures, type assertions, and operations. This change reduces memory allocation overhead for message passing.

Also applies to: 1026-1026, 1053-1053, 1510-1510, 1975-1975


1545-1545: LGTM!

Storing the selected key ID in the request context enables tracking and debugging of which key was used for each request. This is a valuable addition for observability.

Pratham-Mishra04 force-pushed the 08-30-feat_core_extended branch from c511c26 to ebd668c on October 9, 2025 11:11
Pratham-Mishra04 force-pushed the 08-30-feat_core_extended branch from ebd668c to 555c4fc on October 9, 2025 11:21
akshaydeo (Contributor) commented Oct 9, 2025

Merge activity

  • Oct 9, 11:24 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Oct 9, 11:25 AM UTC: @akshaydeo merged this pull request with Graphite.

akshaydeo merged commit 18785e9 into main on Oct 9, 2025 (2 of 3 checks passed)
akshaydeo deleted the 08-30-feat_core_extended branch on October 9, 2025 11:25
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/openai.go (1)

988-1005: Copy speech audio before releasing fasthttp.Response.

fasthttp.Response.Body() points to an internal buffer that is recycled once fasthttp.ReleaseResponse(resp) runs (triggered by the deferred release at function exit). Returning that slice leads to use-after-free/corrupted audio. Copy the bytes before releasing the response.

-	audioData := resp.Body()
+	audioData := append([]byte(nil), resp.Body()...)
🧹 Nitpick comments (1)
core/bifrost.go (1)

1944-1966: Inefficient random number generation on every call.

Creating a new rand.New(rand.NewSource(time.Now().UnixNano())) on each call is inefficient and can produce poor randomness when called rapidly (same nanosecond). Consider using a package-level random generator or the global rand functions.

Apply this diff to use the global random generator:

-	// Use a fast random number generator
-	randomSource := rand.New(rand.NewSource(time.Now().UnixNano()))
-	randomValue := randomSource.Intn(totalWeight)
+	// Use the global random number generator
+	randomValue := rand.Intn(totalWeight)

Note: The global rand package is already seeded by default in Go 1.20+. For earlier versions, you can seed it once during package initialization.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between ebd668c and 555c4fc.

📒 Files selected for processing (23)
  • core/bifrost.go (20 hunks)
  • core/providers/anthropic.go (9 hunks)
  • core/providers/azure.go (10 hunks)
  • core/providers/bedrock.go (18 hunks)
  • core/providers/cohere.go (11 hunks)
  • core/providers/gemini.go (12 hunks)
  • core/providers/openai.go (12 hunks)
  • core/providers/utils.go (6 hunks)
  • core/schemas/bifrost.go (5 hunks)
  • core/schemas/providers/bedrock/chat.go (0 hunks)
  • framework/streaming/audio.go (1 hunks)
  • framework/streaming/chat.go (1 hunks)
  • framework/streaming/transcription.go (1 hunks)
  • framework/streaming/types.go (2 hunks)
  • tests/core-providers/custom_test.go (1 hunks)
  • tests/core-providers/scenarios/chat_completion_stream.go (1 hunks)
  • tests/core-providers/scenarios/complete_end_to_end.go (1 hunks)
  • tests/core-providers/scenarios/cross_provider_scenarios.go (1 hunks)
  • tests/core-providers/scenarios/end_to_end_tool_calling.go (1 hunks)
  • tests/core-providers/scenarios/multi_turn_conversation.go (1 hunks)
  • tests/core-providers/scenarios/multiple_images.go (1 hunks)
  • tests/core-providers/scenarios/utils.go (5 hunks)
  • transports/bifrost-http/main.go (1 hunks)
💤 Files with no reviewable changes (1)
  • core/schemas/providers/bedrock/chat.go
✅ Files skipped from review due to trivial changes (1)
  • transports/bifrost-http/main.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • framework/streaming/transcription.go
  • core/providers/cohere.go
  • core/schemas/bifrost.go
  • framework/streaming/chat.go
  • framework/streaming/audio.go
🧰 Additional context used
🧬 Code graph analysis (9)
framework/streaming/types.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
tests/core-providers/scenarios/utils.go (1)
ui/lib/types/logs.ts (1)
  • ChatMessageContent (111-111)
core/providers/bedrock.go (2)
core/schemas/bifrost.go (2)
  • BifrostError (536-545)
  • ErrorField (553-560)
core/schemas/provider.go (1)
  • ErrProviderRequestTimedOut (22-22)
core/providers/azure.go (3)
core/schemas/account.go (2)
  • Key (8-17)
  • AzureKeyConfig (27-31)
core/schemas/bifrost.go (2)
  • BifrostError (536-545)
  • Azure (50-50)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (26-26)
core/providers/anthropic.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (536-545)
core/schemas/provider.go (1)
  • ErrProviderJSONMarshaling (26-26)
core/providers/openai.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/providers/utils.go (2)
core/schemas/bifrost.go (1)
  • BifrostError (536-545)
core/schemas/provider.go (1)
  • ErrProviderRequestTimedOut (22-22)
core/providers/gemini.go (1)
plugins/mocker/main.go (1)
  • Latency (125-129)
core/bifrost.go (3)
core/schemas/bifrost.go (3)
  • KeySelector (15-15)
  • ModelProvider (46-46)
  • BifrostContextKeySelectedKey (118-118)
core/schemas/provider.go (1)
  • Provider (194-219)
core/schemas/account.go (1)
  • Key (8-17)
🔇 Additional comments (19)
tests/core-providers/scenarios/utils.go (4)

221-224: LGTM!

The migration to pointer-based Content is correct. Creating &schemas.ChatMessageContent{...} ensures the Content field is properly initialized as a pointer.


240-245: LGTM!

The pointer-based Content initialization is correct for image messages with ContentBlocks.


269-271: LGTM!

The pointer-based Content initialization is correct for tool messages.


306-344: LGTM!

The nil-safety checks are comprehensive and correctly handle the pointer-based Content field. The logic properly checks for nil before accessing ContentStr and ContentBlocks, and assembles text from blocks when needed.
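A sketch of that guard pattern with simplified stand-in types — the field names ContentStr/ContentBlocks follow this review, everything else is assumed for illustration:

package main

import (
	"fmt"
	"strings"
)

// Stand-ins for the real schema types.
type contentBlock struct{ Text *string }

type messageContent struct {
	ContentStr    *string
	ContentBlocks []contentBlock
}

// contentText mirrors the nil-guard pattern: check the pointer chain
// first, then fall back to assembling text from blocks.
func contentText(content *messageContent) string {
	if content == nil {
		return ""
	}
	if content.ContentStr != nil {
		return *content.ContentStr
	}
	var sb strings.Builder
	for _, b := range content.ContentBlocks {
		if b.Text != nil {
			sb.WriteString(*b.Text)
		}
	}
	return sb.String()
}

func main() {
	hello := "hello"
	fmt.Println(contentText(&messageContent{ContentBlocks: []contentBlock{{Text: &hello}}}))
	fmt.Println(contentText(nil) == "") // true: nil-safe
}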

tests/core-providers/scenarios/multiple_images.go (1)

31-31: LGTM!

The pointer-based Content initialization aligns with the broader migration to pointer types for message content.

tests/core-providers/scenarios/chat_completion_stream.go (1)

150-155: LGTM!

The pointer-based construction of Message and Content in the consolidated response is correct and safe. This aligns with the pointer-based content model used throughout the test suite.

tests/core-providers/custom_test.go (1)

108-110: LGTM!

The pointer-based Content initialization is consistent with the broader test suite migration to pointer types.

core/providers/azure.go (4)

65-143: LGTM! Latency tracking and use-after-free fix correctly implemented.

The function signature correctly returns latency alongside the response body, all error paths return appropriate latency values (0 for early errors, measured latency for later errors), and the response body is copied before releasing the fasthttp response to prevent use-after-free.
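The approved pattern — wall-clock timing surfaced as int64 milliseconds — reduces to something like this sketch, where the struct is a stand-in for the real ExtraFields:

package main

import (
	"fmt"
	"time"
)

// extraFields stands in for the response metadata struct; only the
// int64-millisecond Latency field reflects this review.
type extraFields struct {
	Latency int64 // milliseconds
}

func main() {
	start := time.Now()
	time.Sleep(42 * time.Millisecond) // stands in for the provider call

	var ef extraFields
	ef.Latency = time.Since(start).Milliseconds() // int64 ms on success paths
	fmt.Println(ef.Latency, "ms")                 // early-error paths would leave 0
}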


156-171: LGTM! Latency correctly captured and propagated.

The latency from completeRequest is correctly captured and assigned to response.ExtraFields.Latency in milliseconds.


246-263: LGTM! Latency correctly captured and propagated.

The latency from completeRequest is correctly captured and assigned to response.ExtraFields.Latency in milliseconds.


302-316: LGTM! Latency correctly captured and propagated.

The latency from completeRequest is correctly captured and assigned to response.ExtraFields.Latency in milliseconds.

core/providers/gemini.go (4)

122-160: LGTM! Latency correctly captured and propagated.

The latency from makeRequestWithContext is correctly captured and assigned to response.ExtraFields.Latency in milliseconds.


256-267: LGTM! Latency correctly captured and propagated.

The latency from completeRequest is correctly captured and assigned to bifrostResponse.ExtraFields.Latency in milliseconds.


517-528: LGTM! Latency correctly captured and propagated.

The latency from completeRequest is correctly captured and assigned to bifrostResponse.ExtraFields.Latency in milliseconds.


804-850: LGTM! Latency tracking and use-after-free fix correctly implemented.

The function correctly captures latency from makeRequestWithContext, copies the response body before releasing the fasthttp response to prevent use-after-free, and propagates latency through all return paths.

core/bifrost.go (4)

49-50: LGTM! Custom key selector correctly initialized.

The keySelector field is correctly added to the Bifrost struct and initialized from config with a sensible default to WeightedRandomKeySelector.

Also applies to: 90-97


641-641: LGTM! Queue type migration to pointers is consistent.

All queue type changes from chan ChannelMessage to chan *ChannelMessage are applied consistently across queue creation, transfers, send operations, and shutdown.

Also applies to: 646-646, 662-662, 1026-1026, 1053-1053, 1059-1059, 1072-1072, 1088-1088, 1351-1351, 1363-1363, 1467-1467, 1479-1479, 1975-1975


1545-1545: LGTM! Selected key ID correctly stored in context.

The selected key ID is correctly stored in the request context for tracking and debugging.
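Downstream code could read the stored ID back out along these lines — a sketch with a stand-in key constant; the real key is BifrostContextKeySelectedKey, and the stored value's type (string here) is an assumption:

package main

import (
	"context"
	"fmt"
)

// ctxKey and selectedKey stand in for the real context key.
type ctxKey string

const selectedKey ctxKey = "bifrost-selected-key"

func main() {
	// Upstream: the engine stores the chosen key's ID on the context.
	ctx := context.WithValue(context.Background(), selectedKey, "key-123")

	// Downstream (e.g., a plugin or log hook): read it back, type-safely.
	if id, ok := ctx.Value(selectedKey).(string); ok {
		fmt.Println("request served with key", id)
	}
}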


1882-1942: LGTM! Key selector delegation correctly implemented.

The function correctly delegates to the pluggable keySelector after filtering keys by model and deployment support.
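As one concrete strategy the pluggable hook enables, here is a hedged round-robin selector in the same shape as the PR's KeySelector — the types are minimal stand-ins and the atomic counter is illustrative:

package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
)

// Minimal stand-ins for schemas.Key and schemas.ModelProvider so the
// sketch compiles on its own.
type Key struct{ ID string }
type ModelProvider string

var rrCounter atomic.Uint64

// roundRobinSelector follows the KeySelector shape from the PR and cycles
// through the already-filtered keys in order.
func roundRobinSelector(ctx *context.Context, keys []Key, provider ModelProvider, model string) (Key, error) {
	if len(keys) == 0 {
		return Key{}, errors.New("no keys available")
	}
	n := rrCounter.Add(1) - 1
	return keys[n%uint64(len(keys))], nil
}

func main() {
	keys := []Key{{"k1"}, {"k2"}, {"k3"}}
	ctx := context.Background()
	for i := 0; i < 4; i++ {
		k, _ := roundRobinSelector(&ctx, keys, "openai", "gpt-4o")
		fmt.Println(k.ID) // k1 k2 k3 k1
	}
}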

Comment on lines 120 to 124

 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
-		chatConversationHistory = append(chatConversationHistory, choice.Message)
+		chatConversationHistory = append(chatConversationHistory, *choice.Message)
 	}
 }

🛠️ Refactor suggestion | 🟠 Major

Add nil check before dereferencing pointer.

Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.

Apply this diff to add nil safety:

 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
+		if choice.Message != nil {
 			chatConversationHistory = append(chatConversationHistory, *choice.Message)
+		}
 	}
 }
🤖 Prompt for AI Agents
In tests/core-providers/scenarios/complete_end_to_end.go around lines 120 to
124, the loop dereferences choice.Message without checking for nil which can
panic; update the loop to check that choice.Message is not nil before appending
(i.e., if choice.Message != nil { chatConversationHistory =
append(chatConversationHistory, *choice.Message) }) so only non-nil messages are
dereferenced and added.

Comment on lines 697 to 701

 	// Use Chat API choices
 	for _, choice := range response.Choices {
-		conversationHistory = append(conversationHistory, choice.Message)
+		conversationHistory = append(conversationHistory, *choice.Message)
 	}
 }

🛠️ Refactor suggestion | 🟠 Major

Add nil check before dereferencing pointer.

Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.

Apply this diff to add nil safety:

 } else {
 	// Use Chat API choices
 	for _, choice := range response.Choices {
+		if choice.Message != nil {
 			conversationHistory = append(conversationHistory, *choice.Message)
+		}
 	}
 }
🤖 Prompt for AI Agents
In tests/core-providers/scenarios/cross_provider_scenarios.go around lines 697
to 701, the loop dereferences choice.Message without checking for nil which can
cause a panic; update the loop to check if choice.Message != nil before
appending to conversationHistory and skip (or handle) cases where Message is nil
so you never dereference a nil pointer.

Comment on lines 132 to 136

 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
-		chatConversationMessages = append(chatConversationMessages, choice.Message)
+		chatConversationMessages = append(chatConversationMessages, *choice.Message)
 	}
 }

🛠️ Refactor suggestion | 🟠 Major

Add nil check before dereferencing pointer.

Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.

Apply this diff to add nil safety:

 if result1.ChatCompletionsResponse.Choices != nil {
 	for _, choice := range result1.ChatCompletionsResponse.Choices {
+		if choice.Message != nil {
 			chatConversationMessages = append(chatConversationMessages, *choice.Message)
+		}
 	}
 }
🤖 Prompt for AI Agents
In tests/core-providers/scenarios/end_to_end_tool_calling.go around lines 132 to
136, the loop dereferences choice.Message directly which can panic if Message is
nil; add a nil check (if choice.Message != nil) before appending to
chatConversationMessages so only non-nil Message pointers are dereferenced and
appended.

Comment on lines 75 to 79

 if response1.Choices != nil {
 	for _, choice := range response1.Choices {
-		messages2 = append(messages2, choice.Message)
+		messages2 = append(messages2, *choice.Message)
 	}
 }

🛠️ Refactor suggestion | 🟠 Major

Add nil check before dereferencing pointer.

Dereferencing choice.Message without a nil check could cause a panic if the Message field is nil. Add a safety check before appending.

Apply this diff to add nil safety:

 if response1.Choices != nil {
 	for _, choice := range response1.Choices {
+		if choice.Message != nil {
 			messages2 = append(messages2, *choice.Message)
+		}
 	}
 }
🤖 Prompt for AI Agents
In tests/core-providers/scenarios/multi_turn_conversation.go around lines 75 to
79, the loop dereferences choice.Message without checking for nil which can
panic; update the loop to skip nil Message pointers by adding a guard (if
choice.Message == nil { continue }) before appending, so only non-nil messages
are dereferenced and appended to messages2.
