Skip to content

Conversation

@akshaydeo
Copy link
Contributor

@akshaydeo akshaydeo commented Oct 27, 2025

Summary

Refactored streaming implementations across providers to use fasthttp instead of the standard http client, eliminating the need for separate stream clients.

Solves #269

Changes

  • Replaced all http.Client instances with fasthttp.Client for streaming operations
  • Removed redundant streamClient fields from provider structs
  • Updated error handling to use fasthttp-specific error types
  • Refactored request/response handling to use fasthttp's request/response pooling
  • Fixed a typo in the schemas module name

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Verify that streaming operations work correctly with all providers:

# Core/Transports
go version
go test ./...

Additionally, test streaming endpoints with various providers to ensure they function properly.

Breaking changes

  • Yes
  • No

Related issues

N/A

Security considerations

No security implications as this is an internal refactoring of HTTP client usage.

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 27, 2025

Warning

Rate limit exceeded

@akshaydeo has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 2 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between a21c098 and ddcfbfa.

⛔ Files ignored due to path filters (5)
  • framework/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
  • transports/go.sum is excluded by !**/*.sum
📒 Files selected for processing (89)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/bifrost.go (2 hunks)
  • core/go.mod (1 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/azure/types.go (1 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/bedrock/signer_test.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (8 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (26 hunks)
  • core/providers/openrouter.go (7 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (14 hunks)
  • core/utils.go (4 hunks)
  • framework/configstore/clientconfig.go (2 hunks)
  • framework/configstore/migrations.go (4 hunks)
  • framework/configstore/rdb.go (6 hunks)
  • framework/configstore/store.go (2 hunks)
  • framework/configstore/tables/config.go (1 hunks)
  • framework/configstore/tables/sessions.go (1 hunks)
  • framework/encrypt/encrypt.go (2 hunks)
  • framework/go.mod (1 hunks)
  • framework/modelcatalog/main.go (1 hunks)
  • framework/modelcatalog/sync.go (2 hunks)
  • plugins/logging/go.mod (2 hunks)
  • plugins/otel/go.mod (1 hunks)
  • plugins/telemetry/go.mod (2 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
  • transports/bifrost-http/handlers/config.go (5 hunks)
  • transports/bifrost-http/handlers/middlewares.go (3 hunks)
  • transports/bifrost-http/handlers/session.go (1 hunks)
  • transports/bifrost-http/lib/config.go (6 hunks)
  • transports/bifrost-http/main.go (3 hunks)
  • transports/bifrost-http/server.go (11 hunks)
  • transports/bifrost-http/server_test.go (1 hunks)
  • transports/go.mod (0 hunks)
  • ui/app/_fallbacks/enterprise/components/api-keys/APIKeysView.tsx (1 hunks)
  • ui/app/_fallbacks/enterprise/components/login/loginView.tsx (1 hunks)
  • ui/app/clientLayout.tsx (1 hunks)
  • ui/app/guardrails/configuration/page.tsx (0 hunks)
  • ui/app/guardrails/providers/page.tsx (0 hunks)
  • ui/app/layout.tsx (1 hunks)
  • ui/app/login/layout.tsx (1 hunks)
  • ui/app/login/page.tsx (1 hunks)
  • ui/app/page.tsx (1 hunks)
  • ui/app/providers/views/utils.ts (0 hunks)
  • ui/app/workspace/adaptive-routing/page.tsx (1 hunks)
  • ui/app/workspace/alert-channels/page.tsx (1 hunks)
  • ui/app/workspace/audit-logs/page.tsx (1 hunks)
  • ui/app/workspace/config/page.tsx (3 hunks)
  • ui/app/workspace/config/views/featureTogglesView.tsx (1 hunks)
  • ui/app/workspace/config/views/securityView.tsx (6 hunks)
  • ui/app/workspace/guardrails/configuration/page.tsx (1 hunks)
  • ui/app/workspace/guardrails/page.tsx (1 hunks)
  • ui/app/workspace/guardrails/providers/page.tsx (1 hunks)
  • ui/app/workspace/layout.tsx (1 hunks)
  • ui/app/workspace/logs/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx (1 hunks)
  • ui/app/workspace/page.tsx (1 hunks)
  • ui/app/workspace/plugins/fragments/pluginFormFragments.tsx (2 hunks)
  • ui/app/workspace/plugins/views/pluginsView.tsx (2 hunks)
  • ui/app/workspace/providers/page.tsx (2 hunks)
  • ui/app/workspace/providers/views/modelProviderConfig.tsx (1 hunks)
  • ui/app/workspace/providers/views/utils.ts (1 hunks)
  • ui/app/workspace/scim/page.tsx (1 hunks)
  • ui/app/workspace/user-groups/page.tsx (1 hunks)
  • ui/components/sidebar.tsx (8 hunks)
  • ui/components/ui/promoCardStack.tsx (2 hunks)
  • ui/lib/store/apis/baseApi.ts (2 hunks)
  • ui/lib/store/apis/index.ts (1 hunks)
  • ui/lib/store/apis/sessionApi.ts (1 hunks)
  • ui/lib/store/slices/index.ts (1 hunks)
  • ui/lib/store/store.ts (1 hunks)
  • ui/lib/types/config.ts (1 hunks)
  • ui/next.config.ts (1 hunks)
  • ui/tsconfig.json (1 hunks)
📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Restructured application UI navigation, with all main routes now organized under the /workspace path for improved organization.
  • Improvements

    • Optimized HTTP request handling infrastructure across all AI provider integrations for better performance and reliability.
    • Enhanced testing infrastructure with automated test reporting and HTML report generation capabilities.
    • Added AWS Signature Version 4 support for improved cloud provider integration.
  • Bug Fixes

    • Corrected internal module path from scheams to schemas.
  • Chores

    • Updated dependencies across multiple modules for compatibility and security.

Walkthrough

This PR migrates provider streaming implementations from net/http to fasthttp across 14+ provider implementations, consolidates redundant streaming HTTP clients, introduces AWS Signature Version 4 signing for fasthttp requests, corrects module naming, updates dependencies, and reorganizes frontend routing under a workspace path hierarchy.

Changes

Cohort / File(s) Summary
Provider fasthttp migrations
core/providers/anthropic/anthropic.go, core/providers/openai/openai.go, core/providers/cohere/cohere.go, core/providers/gemini/gemini.go
Replaced net/http streaming client with fasthttp; removed streamClient field; migrated request construction, response handling, and error parsing to fasthttp API (AcquireRequest/Response, SetRequestURI, SetBody, StatusCode(), BodyStream()); updated error handling to use fasthttp timeout and status semantics.
Provider streaming consolidation
core/providers/azure/azure.go, core/providers/cerebras.go, core/providers/groq.go, core/providers/mistral/mistral.go, core/providers/ollama.go, core/providers/openrouter.go, core/providers/parasail.go, core/providers/sgl.go
Removed streamClient field from provider structs; consolidated streaming to use shared fasthttp provider.client; updated client configuration with increased connection limits (MaxConnsPerHost: 10000, MaxIdleConnDuration: 60s, MaxConnWaitTimeout: 10s).
Bedrock provider overhaul
core/providers/bedrock/signer.go, core/providers/bedrock/chat.go
Added public function signAWSRequestFastHTTP implementing AWS Signature Version 4 for fasthttp requests with credential handling, HMAC-SHA256 utilities, and signing key caching; removed anthropic package import from chat.go.
Vertex provider refactor
core/providers/vertex/vertex.go
Added client *fasthttp.Client field; replaced getAuthClient with getAuthTokenSource returning oauth2.TokenSource; migrated all request/response handling from net/http to fasthttp; consolidated token acquisition to occur prior to requests.
OpenAI shared streaming handlers
core/providers/openai/openai.go
Updated function signatures: ParseOpenAIError, HandleOpenAITextCompletionStreaming, HandleOpenAIChatCompletionStreaming, HandleOpenAIResponsesStreaming, HandleOpenAIEmbeddingRequest, HandleOpenAISpeechStreaming, HandleOpenAITranscriptionStreaming and others to accept *fasthttp.Client instead of *http.Client; migrated all SSE parsing and streaming logic to fasthttp.
Anthropic/Azure shared streaming handlers
core/providers/anthropic/anthropic.go, core/providers/azure/azure.go
Updated function signature: parseStreamAnthropicError(resp *fasthttp.Response, ...) and HandleAnthropicChatCompletionStreaming(ctx, ..., client *fasthttp.Client, ...) to operate with fasthttp client and response types.
Gemini streaming error handling
core/providers/gemini/gemini.go
Updated parseStreamGeminiError and parseGeminiError to accept *fasthttp.Response; migrated SpeechStream and TranscriptionStream from net/http to fasthttp request/response lifecycle.
Test and config updates
tests/core-providers/config/account.go
Added package constant Concurrency = 4; replaced hard-coded concurrency literals (10, 2, 1) with Concurrency constant across all provider configurations.
Module and dependency corrections
scheams/go.mod, core/go.mod, framework/go.mod, plugins/logging/go.mod, plugins/otel/go.mod, plugins/telemetry/go.mod, transports/go.mod
Corrected module path: scheamsschemas; promoted github.com/aws/smithy-go v1.23.1 to direct dependency in core/go.mod; updated transitive dependencies (redis/go-redis v9.12.1 → v9.14.0, weaviate 1.31.5 → 1.33.1, grpc v1.75.0 → v1.76.0, AWS SDK patches).
Frontend layout refactor
ui/app/layout.tsx, ui/app/clientLayout.tsx, ui/app/page.tsx
Extracted AppContent and providers (WebSocketProvider, SidebarProvider, ReduxProvider, etc.) into new ClientLayout component; simplified root layout.tsx to wrap ClientLayout; changed redirect from /logs to workspace/logs.
Frontend routing migration
ui/app/workspace/logs/page.tsx, ui/app/workspace/mcp-clients/page.tsx, ui/app/workspace/mcp-clients/views/*, ui/app/workspace/plugins/*, ui/app/workspace/providers/page.tsx, ui/components/sidebar.tsx
Updated import paths and navigation URLs from top-level routes (/logs, /providers, /mcp-clients, etc.) to workspace-prefixed paths (/workspace/logs, /workspace/providers, /workspace/mcp-clients, etc.); updated tsconfig.json and next.config.ts path aliases.
Build system enhancements
Makefile
Added targets: install-gotestsum, install-junit-viewer, clean-test-reports, generate-html-reports; updated test targets to depend on gotestsum; added XML and optional HTML test report generation; improved environment variable documentation.
Utility documentation
core/providers/utils/utils.go
Added descriptive comment block above MakeRequestWithContext detailing non-cancellation semantics and latency calculation; no functional changes.
Build configuration
.gitignore
Added test-reports/ directory to ignore list with comment prefix.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

  • core/providers/openai/openai.go: Largest refactor; comprehensive migration from net/http to fasthttp across 9+ handler functions; verify all streaming paths (SSE parsing, body stream consumption, error handling) correctly use fasthttp lifecycle.
  • core/providers/vertex/vertex.go: Introduces token source pattern replacing HTTP client pooling; verify oauth2.TokenSource integration with fasthttp request flow and Bearer token injection.
  • core/providers/bedrock/signer.go: New AWS Signature V4 implementation; review cryptographic correctness, header canonicalization, key derivation caching, and compliance with AWS signing spec.
  • ui/components/sidebar.tsx and routing updates: Verify all workspace-prefixed routes align with backend routing and NextJS configuration; check for missed import path updates.
  • Dependency upgrades: Cross-reference AWS SDK v2 components (smithy-go promotion) and grpc version changes for compatibility impacts across the codebase.

Suggested reviewers

  • danpiths

Poem

🐰 From http to fasthttp we hop with glee,
Streaming clients consolidated—one client free!
AWS signatures signed with cryptographic flair,
Workspace paths leading everywhere,
Buildtools enhanced for tests so fair! ✨

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title "move all providers to use fasthttp even for streaming" directly and clearly summarizes the main refactoring objective evident in the changeset. The title is specific and concise, avoiding vague terminology. The raw summary confirms that the core changes involve replacing http.Client with fasthttp.Client across all provider streaming implementations and removing redundant streamClient fields, which aligns precisely with the title's stated goal.
Description Check ✅ Passed The PR description follows the required template structure comprehensively. It includes all major sections: a clear summary explaining the purpose and the solved issue (#269), detailed changes listing the key modifications (replacing http.Client with fasthttp.Client, removing streamClient fields, updating error handling, refactoring request/response handling, and fixing the schemas module typo), proper type categorization (Refactor), affected areas (Core and Providers/Integrations), testing instructions, a clear statement on breaking changes (No), and a largely completed checklist with appropriate explanations. While documentation updates are not checked, this is reasonably justified for an internal refactoring PR and is marked transparently in the checklist.

Comment @coderabbitai help to get the list of available commands and usage tips.

This was referenced Oct 27, 2025
Copy link
Contributor Author

akshaydeo commented Oct 27, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@akshaydeo akshaydeo mentioned this pull request Oct 27, 2025
18 tasks
@akshaydeo akshaydeo marked this pull request as ready for review October 27, 2025 22:55
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from ba3fe92 to 582f0a6 Compare October 27, 2025 23:11
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
core/providers/gemini/gemini.go (3)

23-27: Remove streamClient; use the single fasthttp client everywhere.

Keep only provider.client. Drop streamClient field/init and replace usages with provider.client.

 type GeminiProvider struct {
   logger               schemas.Logger                // Logger for provider operations
   client               *fasthttp.Client              // HTTP client for API requests
-  streamClient         *fasthttp.Client              // HTTP client for streaming requests
   networkConfig        schemas.NetworkConfig         // Network configuration including extra headers
   sendBackRawResponse  bool                          // Whether to include raw response in BifrostResponse
   customProviderConfig *schemas.CustomProviderConfig // Custom provider config
 }
- // Initialize streaming HTTP client
- streamClient := &fasthttp.Client{
-   ReadTimeout:     time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-   WriteTimeout:    time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-   MaxConnsPerHost: config.ConcurrencyAndBufferSize.Concurrency,
- }
- client = providerUtils.ConfigureProxy(client, config.ProxyConfig, logger)
- streamClient = providerUtils.ConfigureProxy(streamClient, config.ProxyConfig, logger)
+ client = providerUtils.ConfigureProxy(client, config.ProxyConfig, logger)
   return &GeminiProvider{
     logger:               logger,
     client:               client,
-    streamClient:         streamClient,
     networkConfig:        config.NetworkConfig,
     customProviderConfig: config.CustomProviderConfig,
     sendBackRawResponse:  config.SendBackRawResponse,
   }
   return openai.HandleOpenAIChatCompletionStreaming(
     ctx,
-    provider.streamClient,
+    provider.client,
     provider.networkConfig.BaseURL+"/openai/chat/completions",
     request,
     map[string]string{"Authorization": "Bearer " + key.Value},
     provider.networkConfig.ExtraHeaders,
     provider.sendBackRawResponse,
     provider.GetProviderKey(),
     postHookRunner,
     provider.logger,
   )
- _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.streamClient, req, resp)
+ _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.streamClient, req, resp)
+ _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)

Also applies to: 41-51, 58-66, 242-255, 399-406, 673-676


399-416: Compile error: using undefined variable err after MakeRequestWithContext.

Replace references to err with returning bifrostErr directly.

- _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) {
-     return nil, &schemas.BifrostError{ ... }
-   }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, providerName)
-   }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
- }
+ _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

Apply the same change in TranscriptionStream.

Also applies to: 673-690


939-963: Fix resp.StatusCode usage in parseStreamGeminiError.

Use StatusCode() method; also keep consistent int conversion.

- StatusCode:     schemas.Ptr(int(resp.StatusCode())),
+ StatusCode:     schemas.Ptr(int(resp.StatusCode())),
...
- return providerUtils.NewBifrostOperationError(fmt.Sprintf("Gemini streaming error (HTTP %d): %v", resp.StatusCode, rawResponse), fmt.Errorf("HTTP %d", resp.StatusCode), providerName)
+ return providerUtils.NewBifrostOperationError(
+   fmt.Sprintf("Gemini streaming error (HTTP %d): %v", resp.StatusCode(), rawResponse),
+   fmt.Errorf("HTTP %d", resp.StatusCode()),
+   providerName,
+ )
core/providers/vertex/vertex.go (2)

159-176: Compile error: using undefined err after MakeRequestWithContext.

Return bifrostErr directly instead of checking errors.Is(err,...).

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

Apply in all three locations.

Also applies to: 335-351, 632-647


508-531: Use OAuth token for non‑Claude streaming auth (match non‑streaming path).

Rely on getAuthTokenSource(key) and set Authorization: Bearer <access_token>.

- authHeader := map[string]string{}
- if key.Value != "" {
-   authHeader["Authorization"] = "Bearer " + key.Value
- }
+ tokenSource, err := getAuthTokenSource(key)
+ if err != nil {
+   return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
+ }
+ token, err := tokenSource.Token()
+ if err != nil {
+   return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
+ }
+ authHeader := map[string]string{"Authorization": "Bearer " + token.AccessToken}
core/providers/openai/openai.go (3)

321-338: Incorrect error variable and unreachable branches after MakeRequestWithContext.

You check errors.Is(err, …) but err is undefined (or always nil in TranscriptionStream). Also, MakeRequestWithContext already maps cancellations/timeouts. This is a compile-time/logic bug.

Apply this pattern across all streaming handlers:

- _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, client, req, resp)
- if bifrostErr != nil {
-     if errors.Is(err, context.Canceled) {
-         return nil, &schemas.BifrostError{...}
-     }
-     if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-         return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, providerName)
-     }
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
- }
+ _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, client, req, resp)
+ if bifrostErr != nil {
+     return nil, bifrostErr
+ }

This removes dead code and fixes the compile error.

Also applies to: 669-685, 1019-1035, 1369-1385, 1634-1651


184-188: Use-after-release: releasing req/resp before the streaming goroutine finishes.

You defer fasthttp.ReleaseRequest/Response in the parent function, but read resp.BodyStream() inside a goroutine. The defers run when the parent returns, causing the goroutine to use released objects.

Move the Release calls into the streaming goroutine so they run after streaming finishes:

- req := fasthttp.AcquireRequest()
- resp := fasthttp.AcquireResponse()
- defer fasthttp.ReleaseRequest(req)
- defer fasthttp.ReleaseResponse(resp)
+ req := fasthttp.AcquireRequest()
+ resp := fasthttp.AcquireResponse()
  ...
- // Start streaming in a goroutine
- go func() {
+ // Start streaming in a goroutine
+ go func() {
+     defer fasthttp.ReleaseRequest(req)
+     defer fasthttp.ReleaseResponse(resp)
      defer close(responseChan)
      ...
  }()

Apply to all streaming handlers.

Also applies to: 528-532, 878-882, 1340-1344, 1604-1608


301-339: Replace fasthttp.Client.Do with net/http for streaming endpoints to enable true incremental SSE processing.

The review comment is correct. fasthttp's client APIs (Client/HostClient.Do / Get / etc.) aggregate the response into memory and do not provide a general incremental streaming API like net/http's Response.Body. The current code path blocks until the entire response is buffered, then attempts to stream from the buffer—this eliminates the benefit of SSE (receiving tokens progressively) and causes unnecessary latency and memory overhead.

At lines 301-339 (and 648-686, 998-1035, 1369-1385, 1634-1651):

  • MakeRequestWithContext blocks on client.Do(req, resp) until the full response is received
  • resp.BodyStream() then reads from an already-buffered response, not a live stream
  • The first token is unavailable until the entire response has arrived

The maintainer guidance is to use net/http for body streaming. Switch to net/http's streaming-capable client for these endpoints, or implement a custom fasthttp streaming wrapper that doesn't buffer the entire response.

🧹 Nitpick comments (10)
core/providers/ollama.go (1)

31-35: Use Concurrency for MaxConnsPerHost (not BufferSize).

Azure uses Concurrency; keep consistency to avoid over/under-provisioning connections.

-    MaxConnsPerHost: config.ConcurrencyAndBufferSize.BufferSize,
+    MaxConnsPerHost: config.ConcurrencyAndBufferSize.Concurrency,
core/providers/parasail.go (2)

18-23: Remove streamClient and reuse client for streaming.

This file still maintains a separate streamClient, diverging from the PR goal and adding duplicate config/proxy logic. Simplify to a single client.

 type ParasailProvider struct {
   logger              schemas.Logger
   client              *fasthttp.Client
-  streamClient        *fasthttp.Client
   networkConfig       schemas.NetworkConfig
   sendBackRawResponse bool
 }
@@
- // Initialize streaming HTTP client
- streamClient := &fasthttp.Client{
-   ReadTimeout:     time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-   WriteTimeout:    time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-   MaxConnsPerHost: config.ConcurrencyAndBufferSize.BufferSize,
- }
@@
- streamClient = providerUtils.ConfigureProxy(streamClient, config.ProxyConfig, logger)
@@
   return &ParasailProvider{
     logger:              logger,
     client:              client,
-    streamClient:        streamClient,
     networkConfig:       config.NetworkConfig,
     sendBackRawResponse: config.SendBackRawResponse,
   }, nil
@@
-  provider.streamClient,
+  provider.client,

Also applies to: 37-48, 116-121


31-35: Use Concurrency for MaxConnsPerHost (not BufferSize).

Align with other providers to reflect intended parallelism controls.

-    MaxConnsPerHost: config.ConcurrencyAndBufferSize.BufferSize,
+    MaxConnsPerHost: config.ConcurrencyAndBufferSize.Concurrency,
core/providers/cerebras.go (1)

30-34: Use Concurrency for MaxConnsPerHost (not BufferSize).

Keep consistent with Azure and intended connection scaling.

-    MaxConnsPerHost: config.ConcurrencyAndBufferSize.BufferSize,
+    MaxConnsPerHost: config.ConcurrencyAndBufferSize.Concurrency,
core/providers/sgl.go (1)

52-57: Consistency and tiny cleanup: trim whitespace, use GetProviderKey consistently.

  • Line 54 has stray whitespace.
  • In ChatCompletionStream, pass provider.GetProviderKey() (as in other methods) instead of schemas.SGL for consistency/custom provider naming.

Apply:

   return &SGLProvider{
     logger:              logger,
-    client:              client,		
+    client:              client,
     networkConfig:       config.NetworkConfig,
     sendBackRawResponse: config.SendBackRawResponse,
   }, nil
   return openai.HandleOpenAIChatCompletionStreaming(
     ctx,
-    provider.client,
+    provider.client,
     provider.networkConfig.BaseURL+"/v1/chat/completions",
     request,
     nil,
     provider.networkConfig.ExtraHeaders,
     provider.sendBackRawResponse,
-    schemas.SGL,
+    provider.GetProviderKey(),
     postHookRunner,
     provider.logger,
   )

Also applies to: 133-146

core/providers/vertex/vertex.go (1)

33-55: Dead code: vertexClientPool/removeVertexClient are now no‑ops.

If client caching is gone, delete the pool and removeVertexClient usages.

core/providers/bedrock/signer.go (2)

174-176: Payload hash source may diverge from actual request body.

You hash the body parameter. If callers mutate req body after signing or pass a mismatched slice, the signature will be invalid.

  • If body == nil, read from req.Body() to compute the hash.
  • Document and enforce calling order: set body before signing; signing must be the last step before Do().
    I can add a small guard and unit tests if helpful.

Also applies to: 130-137


177-183: Signing helper should avoid setting non-signing headers.

Setting Content-Type/Accept here couples concerns and can surprise callers.

  • Remove these header mutations from the signer.
  • Let callers set Content-Type/Accept. Keep only X-Amz-Date/X-Amz-Security-Token/Authorization here.
core/providers/openai/openai.go (2)

301-320: Minor: align header order and avoid mixed responsibility.

Set method/URI first, then apply provider extra headers (once), then auth/SSE headers, then body. This reduces accidental overwrites.

If you want, I can generate a small helper to DRY this across handlers.

Also applies to: 648-667, 998-1017, 1345-1367, 1609-1633


1834-1860: Duplicate error parsing helpers.

parseStreamOpenAIError duplicates ParseOpenAIError logic.

Unify by calling ParseOpenAIError(resp, …) for both streaming and non‑streaming paths, or have a single internal helper.

Also applies to: 1803-1832

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d7517f and 582f0a6.

📒 Files selected for processing (15)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (4 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (4 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (1 hunks)
  • core/providers/mistral/mistral.go (1 hunks)
  • core/providers/ollama.go (2 hunks)
  • core/providers/openai/openai.go (22 hunks)
  • core/providers/openrouter.go (5 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (4 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • scheams/go.mod (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (9)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (327-338)
core/providers/parasail.go (2)
core/schemas/provider.go (4)
  • NetworkConfig (34-42)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ConcurrencyAndBufferSize (53-56)
  • ProxyConfig (79-84)
core/providers/utils/utils.go (1)
  • ConfigureProxy (95-143)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (232-276)
  • NewUnsupportedOperationError (280-291)
core/providers/sgl.go (1)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (280-291)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (95-143)
  • SetExtraHeaders (151-172)
  • NewBifrostOperationError (327-338)
  • MakeRequestWithContext (35-90)
  • HandleProviderResponse (232-276)
  • NewProviderAPIError (342-357)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/anthropic/anthropic.go (1)
  • HandleAnthropicChatCompletionStreaming (384-566)
core/providers/gemini/gemini.go (3)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (4)
  • ConfigureProxy (95-143)
  • SetExtraHeaders (151-172)
  • MakeRequestWithContext (35-90)
  • NewBifrostOperationError (327-338)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/anthropic/anthropic.go (1)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (151-172)
  • MakeRequestWithContext (35-90)
  • NewBifrostOperationError (327-338)
core/providers/bedrock/bedrock.go (2)
core/schemas/bifrost.go (1)
  • RequestCancelled (282-282)
core/schemas/provider.go (1)
  • ErrRequestCancelled (23-23)
core/providers/openai/openai.go (2)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (151-172)
  • MakeRequestWithContext (35-90)
  • NewBifrostOperationError (327-338)
  • HandleProviderAPIError (206-226)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (11)
core/providers/mistral/mistral.go (2)

16-56: LGTM: Clean consolidation to a single fasthttp client.

The removal of the streamClient field and the use of a single fasthttp.Client for both regular and streaming operations simplifies the provider structure. The client configuration with appropriate timeouts and concurrency limits is suitable for both use cases.


144-162: Verify streaming functionality with fasthttp client.

The change correctly uses provider.client for streaming operations, consolidating all HTTP operations to use the same fasthttp client. This is consistent with the non-streaming ChatCompletion method and aligns with the PR objectives.

Please verify that the Mistral streaming endpoint works correctly with the fasthttp client by testing chat completion streaming operations as outlined in the PR testing instructions.

core/providers/openrouter.go (2)

216-216: LGTM: Error messages now use dynamic provider key.

Replacing the hardcoded "openrouter" string with string(provider.GetProviderKey()) improves consistency and maintainability across all unsupported operations.

Also applies to: 221-221, 226-226, 231-231, 236-236


135-135: LGTM: Streaming operations successfully migrated to unified fasthttp client.

The consolidation from provider.streamClient to provider.client correctly passes *fasthttp.Client to all streaming handlers. Verification confirms that HandleOpenAITextCompletionStreaming, HandleOpenAIChatCompletionStreaming, and HandleOpenAIResponsesStreaming in the openai package all accept *fasthttp.Client as the client parameter, making this migration fully compatible.

core/providers/groq.go (1)

177-177: LGTM! Change correctly consolidates streaming to use shared fasthttp client.

The migration from a separate streamClient to the unified provider.client aligns with the PR objective. The change is syntactically correct and consistent with the broader refactoring pattern across all providers.

core/providers/azure/azure.go (3)

112-116: Good defensive copy before releasing fasthttp.Response.

Copying resp.Body() avoids UAF with pooled buffers. Keep this pattern consistently elsewhere.


436-439: Apply bodyCopy pattern to line 436 for consistency with existing safe pattern in the same file.

Line 436 in azure.go passes resp.Body() directly to HandleProviderResponse while resp is pooled via defer fasthttp.ReleaseResponse(resp) at line 405. This is inconsistent with the pattern already established at line 176 of the same file, which safely copies the response body before passing to HandleProviderResponse:

-	rawResponse, bifrostErr := providerUtils.HandleProviderResponse(resp.Body(), response, provider.sendBackRawResponse)
+	responseBody := append([]byte(nil), resp.Body()...)
+	rawResponse, bifrostErr := providerUtils.HandleProviderResponse(responseBody, response, provider.sendBackRawResponse)

This ensures the handler doesn't hold a reference to fasthttp's pooled buffer after resp is released.


35-39: Verify timeout configuration for streaming responses.

The Azure provider supports Server-Sent Events streaming via ChatCompletionStream, TextCompletionStream, and ResponsesStream methods. All streaming operations use the shared provider.client created at lines 35-39, which applies DefaultRequestTimeoutInSeconds (default 30 seconds) to both ReadTimeout and WriteTimeout.

In fasthttp, these timeouts apply to the entire request-response cycle. For long-running SSE streams, a 30-second timeout may prematurely terminate the connection. Ensure DefaultRequestTimeoutInSeconds is configured appropriately for your expected stream durations, or consider implementing a streaming-specific client with a higher or zero timeout (e.g., ReadTimeout: 0 for indefinite reads).

core/providers/ollama.go (1)

92-104: LGTM on streaming migration.

Streaming now reuses provider.client with fasthttp and shared helpers. Matches PR objective.

Also applies to: 126-139

core/providers/cerebras.go (1)

84-96: LGTM on streaming migration.

Unified on provider.client; no separate stream client. Matches PR objective.

Also applies to: 117-131

core/providers/bedrock/bedrock.go (1)

254-265: Formatting-only change; no behavior impact.

The reflowed error-handling block reads cleaner; logic unchanged.

@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from 582f0a6 to 99d48da Compare October 28, 2025 06:53
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
core/providers/vertex/vertex.go (4)

162-178: Wrong error variable; just return bifrostErr from MakeRequestWithContext.

This block checks err from a prior token call, not the HTTP error. It masks real causes.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) {
-     return nil, &schemas.BifrostError{ ... }
-   }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, providerName)
-   }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

337-353: Same issue: misusing err after MakeRequestWithContext.

Return bifrostErr directly to preserve accurate context/timeout typing.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

516-524: Non-Claude streaming uses wrong Authorization; must use OAuth access token.

Vertex OpenAI endpoints require Google OAuth Bearer, not key.Value. Acquire via token source like other calls.

- authHeader := map[string]string{}
- if key.Value != "" {
-   authHeader["Authorization"] = "Bearer " + key.Value
- }
+ tokenSource, err := getAuthTokenSource(key)
+ if err != nil {
+   return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
+ }
+ token, err := tokenSource.Token()
+ if err != nil {
+   return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
+ }
+ authHeader := map[string]string{
+   "Authorization": "Bearer " + token.AccessToken,
+ }

634-649: Same issue in Embedding: return bifrostErr instead of re-mapping with stale err.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }
♻️ Duplicate comments (13)
core/providers/bedrock/signer.go (3)

193-197: Content-Length signing can drift from actual header.

You include "content-length" in signed headers from len(body), but the actual header is set by fasthttp when the body is attached (or may be absent for GET). If the request body changes after signing, signatures can break.

Apply this change:

  • Only include content-length if it's already present on req.Header (use req.Header.ContentLength() >= 0).
  • Otherwise, don't add it to signed headers.

Example:

- if len(body) > 0 {
-     headerNames = append(headerNames, "content-length")
-     headerMap["content-length"] = []string{strconv.Itoa(len(body))}
- }
+ if cl := req.Header.ContentLength(); cl >= 0 {
+     headerNames = append(headerNames, "content-length")
+     headerMap["content-length"] = []string{strconv.Itoa(cl)}
+ }

200-217: fasthttp header iteration: req.Header.All() doesn't exist (compile-time). Use VisitAll.

Replace the iteration with fasthttp's RequestHeader.VisitAll. This is a correctness blocker.

Apply this diff:

-    // Collect other headers
-    for key, value := range req.Header.All() {
-        keyStr := strings.ToLower(string(key))
-
-        // Skip ignored headers
-        if _, ignore := ignoredHeaders[keyStr]; ignore {
-            continue
-        }
-
-        // Skip if already handled
-        if keyStr == "host" || keyStr == "content-length" {
-            continue
-        }
-
-        if _, exists := headerMap[keyStr]; !exists {
-            headerNames = append(headerNames, keyStr)
-        }
-        headerMap[keyStr] = append(headerMap[keyStr], string(value))
-    }
+    // Collect other headers
+    req.Header.VisitAll(func(k, v []byte) {
+        keyStr := strings.ToLower(string(k))
+        // Skip ignored headers
+        if _, ignore := ignoredHeaders[keyStr]; ignore {
+            return
+        }
+        // Skip if already handled
+        if keyStr == "host" || keyStr == "content-length" {
+            return
+        }
+        if _, exists := headerMap[keyStr]; !exists {
+            headerNames = append(headerNames, keyStr)
+        }
+        headerMap[keyStr] = append(headerMap[keyStr], string(v))
+    })

242-259: Canonical query construction violates AWS SigV4 RFC 3986 encoding requirements.

The current implementation conflates x-www-form-urlencoded (which encodes space as '+') with RFC 3986 (which encodes space as '%20'). url.ParseQuery() decodes form-encoded data where '+' represents space; values.Encode() re-encodes to form format (space as '+'); ReplaceAll("+", "%20") is a fragile post-hoc patch.

This breaks on edge cases: a literal '+' in a parameter value gets decoded to space, re-encoded as '+', then replaced with '%20'—corrupting the data. AWS SigV4 requires RFC 3986 percent-encoding where each query parameter name and value are encoded separately (space as %20, not '+'), then sorted lexicographically by encoded name.

Replace lines 242–259 with manual RFC 3986 encoding: split the query string by '&', percent-encode each name/value pair per RFC 3986 (preserve unreserved chars A–Z a–z 0–9 - _ . ~, encode everything else as %HH), sort by encoded key, join with '&'. Add tests against AWS SigV4 canonicalization test vectors to verify compliance.

core/providers/azure/azure.go (1)

38-40: Same MaxConnsPerHost concern as Groq.

Verify this high connection limit is appropriate for Azure's rate limits and infrastructure.

core/providers/sgl.go (1)

34-36: Same MaxConnsPerHost verification needed.

Consistent with other providers but worth verifying.

core/providers/anthropic/anthropic.go (1)

676-693: Fix undefined err variable references.

Line 676 assigns bifrostErr from MakeRequestWithContext, but lines 679, 689, and 692 reference undefined variable err. Since MakeRequestWithContext returns (time.Duration, *schemas.BifrostError) rather than a raw error, simply return bifrostErr directly when non-nil:

  _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
  if bifrostErr != nil {
    defer fasthttp.ReleaseResponse(resp)
-   if errors.Is(err, context.Canceled) {
-     return nil, &schemas.BifrostError{
-       IsBifrostError: false,
-       Error: &schemas.ErrorField{
-         Type:    schemas.Ptr(schemas.RequestCancelled),
-         Message: schemas.ErrRequestCancelled,
-         Error:   err,
-       },
-     }
-   }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-   }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+   return nil, bifrostErr
  }
core/providers/vertex/vertex.go (1)

145-151: ListModels must use GET (not POST) and no JSON Content-Type.

POST likely 405/400s here. Switch to GET and drop Content-Type.

- req.Header.SetMethod(http.MethodPost)
+ req.Header.SetMethod(http.MethodGet)
  req.SetRequestURI(requestURL)
- req.Header.SetContentType("application/json")
+ // No body; Content-Type not needed for GET
+ req.Header.Del("Content-Type")

Also applies to: 147-148

core/providers/openai/openai.go (6)

355-356: Raise bufio.Scanner limit to avoid truncated SSE lines.

Default 64KB token limit can break long chunks. Increase buffer.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

713-714: Same: increase Scanner buffer for ChatCompletion streaming.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

1073-1074: Same: increase Scanner buffer for Responses streaming.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

1433-1434: Same: increase Scanner buffer for Speech streaming.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

1703-1704: Same: increase Scanner buffer for Transcription streaming.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

1652-1661: TranscriptionStream: preserve multipart Content-Type; remove duplicate extra headers.

You overwrite writer.FormDataContentType() with application/json and set extra headers twice.

- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
- req.Header.SetMethod(http.MethodPost)
- req.SetRequestURI(url)
- req.Header.SetContentType("application/json")
-
- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+ // Method/URI first
+ req.Header.SetMethod(http.MethodPost)
+ req.SetRequestURI(url)
+ // Apply provider extra headers once
+ providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)

Headers map already carries writer.FormDataContentType(); keep it.

Also applies to: 1655-1658

🧹 Nitpick comments (8)
Makefile (5)

19-22: Unused variable: TEST_REPORT_FORMAT is defined but never referenced.

The variable TEST_REPORT_FORMAT is declared at line 22 but never used in any target or command. Either remove it or integrate it into the test reporting workflow.


63-83: Refactor CI environment detection into a reusable helper function.

The conditional checking for CI environments (CI, GITHUB_ACTIONS, GITLAB_CI, CIRCLECI, JENKINS_HOME) is repeated identically across multiple targets (lines 68, 181, 226, etc.). Extract this pattern into a Makefile function to reduce duplication and improve maintainability.

# Near the top of the file, after the color definitions
define is_ci_environment
[ -z "$$CI" ] && [ -z "$$GITHUB_ACTIONS" ] && [ -z "$$GITLAB_CI" ] && [ -z "$$CIRCLECI" ] && [ -z "$$JENKINS_HOME" ]
endef

Then replace repeated patterns with:

if $(is_ci_environment); then \
  # Run local HTML generation logic \
fi

174-200: Target body exceeds recommended complexity; consider extracting helper logic.

The test target has grown to 26 lines of body, which exceeds Makefile best practices (checkmake flags 5+ line bodies). While the expanded functionality is valuable for reporting, consider extracting the junit-viewer HTML generation into a shared helper target or Makefile function to reduce duplication across test, test-core, test-plugins, and test-all.

This will also address the repeated CI environment checks and HTML generation logic that appears in lines 181–200, 226–261, 273–288, 301–313, and 321–338.


181-200: Silent error suppression may hide junit-viewer failures; consider logging or retry logic.

Multiple targets suppress junit-viewer errors with 2>/dev/null || true (lines 184, 227, 250, 274, 304, 323). While graceful degradation is good, silent failures can obscure misconfigurations or transient issues. Consider:

  • Logging the error message before silencing it (e.g., 2>&1 | grep -v "error message" || true)
  • Adding a debug flag (e.g., DEBUG=1) to enable error visibility
  • Failing the build if junit-viewer is available but errors (distinguish install failure from runtime failure)

This is a minor UX concern for local development, as XML reports remain available.

Also applies to: 226-261, 273-288, 301-313, 321-338


172-172: Potential fragility in sed path substitution; consider using printf for safety.

Line 172 uses sed to transform paths in a loop:

ls -1 $(TEST_REPORTS_DIR)/*.html 2>/dev/null | sed 's|$(TEST_REPORTS_DIR)/|  open $(TEST_REPORTS_DIR)/|' || true

If $(TEST_REPORTS_DIR) contains special regex characters or spaces, the sed substitution may fail or misbehave. For robustness, consider using printf or awk instead:

ls -1 $(TEST_REPORTS_DIR)/*.html 2>/dev/null | while read f; do echo "  open $$f"; done || true

This is a low-risk edge case, but worth addressing if test-reports paths are ever user-configurable.

core/providers/ollama.go (1)

39-42: Consider removing commented-out code.

The commented-out response pool pre-warming code appears unrelated to the current refactor. Consider removing it or tracking it in a separate issue if it's intended for future implementation.

core/providers/vertex/vertex.go (1)

628-631: Stray duplicate comment lines; delete for clarity.

-	
-// Set any extra headers from network config
-	
-
core/providers/openai/openai.go (1)

244-260: Apply Scanner buffer size fix consistently across all streaming handlers.

Current state: Only Gemini provider implements scanner.Buffer() to handle large chunks (10MB max buffer at lines 428 and 703). All other streaming handlers lack this optimization:

  • OpenAI: 5 locations (lines 355, 713, 1073, 1433, 1703)
  • Anthropic: 2 locations (lines 463, 719)
  • Cohere: 2 locations (lines 348, 614)

Apply Gemini's buffer pattern across all providers to handle large token chunks consistently:

buf := make([]byte, 0, 1024*1024) // 1MB initial buffer
scanner.Buffer(buf, 10*1024*1024)  // Allow up to 10MB tokens
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 582f0a6 and 99d48da.

⛔ Files ignored due to path filters (1)
  • tests/core-providers/go.sum is excluded by !**/*.sum
📒 Files selected for processing (21)
  • .gitignore (1 hunks)
  • Makefile (4 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
  • tests/core-providers/go.mod (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • core/providers/utils/utils.go
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/bedrock/bedrock.go
  • scheams/go.mod
  • core/providers/cerebras.go
  • core/providers/mistral/mistral.go
  • core/providers/parasail.go
🧰 Additional context used
🧬 Code graph analysis (8)
core/providers/sgl.go (1)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (281-292)
core/providers/openrouter.go (2)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
core/schemas/bifrost.go (1)
  • BifrostError (322-331)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/vertex/types.go (2)
  • VertexListModelsResponse (61-64)
  • VertexEmbeddingResponse (44-46)
core/providers/anthropic/anthropic.go (3)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/anthropic/types.go (1)
  • AnthropicError (303-309)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • ProcessAndSendBifrostError (442-472)
  • MakeRequestWithContext (36-91)
core/providers/openai/openai.go (2)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • HandleProviderAPIError (207-227)
core/providers/gemini/gemini.go (2)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (328-339)
core/providers/cohere/cohere.go (2)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
🪛 checkmake (0.2.2)
Makefile

[warning] 174-174: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (18)
Makefile (1)

24-24: Verify .PHONY targets are comprehensive and accurate.

The .PHONY declaration includes all new targets (install-gotestsum, install-junit-viewer, clean-test-reports, generate-html-reports), which is correct. However, given the extensive refactoring across test targets, verify that no intermediate helper functions are accidentally treated as file targets if you extract shared logic.

core/providers/ollama.go (1)

96-96: LGTM! Streaming consolidation looks correct.

The migration from a separate streamClient to the shared provider.client for streaming operations is correctly implemented and aligns with the PR objective of consolidating all HTTP operations to fasthttp.

Also applies to: 131-131

core/providers/openrouter.go (3)

92-92: LGTM! Good use of utility function.

The refactor to use HandleProviderResponse improves consistency and leverages concurrent unmarshalling.


137-137: LGTM! Streaming migration to fasthttp is correct.

All streaming methods now correctly use provider.client instead of the removed streamClient. This successfully consolidates to a single fasthttp client for both regular and streaming operations.

Also applies to: 172-172, 203-203


218-218: LGTM! Dynamic provider key improves maintainability.

Replacing hardcoded "openrouter" with string(provider.GetProviderKey()) in error messages is more maintainable and reduces the risk of copy-paste errors across providers.

Also applies to: 223-223, 228-228, 233-233, 238-238

tests/core-providers/config/account.go (1)

16-16: LGTM: Concurrency standardization improves maintainability.

The constant centralizes concurrency configuration across all test providers, replacing various hard-coded values. A value of 4 is reasonable for test scenarios.

core/providers/groq.go (2)

179-179: LGTM: Streaming now uses shared fasthttp client.

The migration from a dedicated streamClient to the shared provider.client is correct and aligns with the PR objective.


33-36: MaxConnsPerHost=10000 is consistent across all providers—no action needed.

All 13 providers in the codebase (vertex, openai, anthropic, azure, cohere, gemini, mistral, groq, cerebras, ollama, openrouter, parasail, sgl) use the identical MaxConnsPerHost setting of 10000. This confirms it is an intentional, standardized architectural decision rather than a Groq-specific concern.

core/providers/azure/azure.go (1)

277-277: LGTM: Azure streaming migrated to fasthttp.

All streaming operations (TextCompletionStream, ChatCompletionStream, ResponsesStream) correctly use provider.client. The migration removes the streamClient dependency.

Also applies to: 362-362, 487-487

tests/core-providers/go.mod (1)

3-3: LGTM: Dependencies updated to support fasthttp migration.

Key updates include:

  • Go toolchain 1.24.0
  • testify v1.11.1
  • sonic v1.14.1
  • fasthttp v1.67.0

These align with the fasthttp migration and maintain compatibility.

Also applies to: 11-11, 34-34, 51-51

core/providers/sgl.go (1)

192-192: LGTM: Dynamic provider names improve maintainability.

Using string(provider.GetProviderKey()) instead of hard-coded "sgl" makes error messages more maintainable, especially for custom provider configurations.

Also applies to: 197-197, 202-202, 207-207

core/providers/cohere/cohere.go (2)

289-346: LGTM: ChatCompletionStream correctly manages fasthttp lifecycle.

The implementation properly:

  • Acquires request/response
  • Defers ReleaseRequest immediately
  • Defers ReleaseResponse inside the streaming goroutine (line 346)
  • Uses client.Do() with proper error handling

This avoids use-after-free by keeping the response valid while the goroutine consumes resp.BodyStream().


560-612: LGTM: ResponsesStream follows the same correct pattern.

Consistent lifecycle management with ChatCompletionStream.

core/providers/gemini/gemini.go (2)

368-423: LGTM: SpeechStream fasthttp lifecycle appears correct.

The response release is deferred inside the goroutine (line 423), ensuring resp remains valid while BodyStream() is consumed (line 425). This addresses the UAF concern from previous reviews.


642-698: LGTM: TranscriptionStream follows the same correct pattern.

Consistent with SpeechStream - ReleaseResponse deferred at line 698 inside the goroutine.

core/providers/anthropic/anthropic.go (1)

402-451: LGTM: HandleAnthropicChatCompletionStreaming correctly manages lifecycle.

This function uses client.Do(req, resp) which returns error (not *schemas.BifrostError), so the err variable checks at lines 420-437 are correct. ReleaseResponse is properly deferred inside the goroutine at line 451.

core/providers/openai/openai.go (1)

38-43: LGTM on fasthttp client timeouts/concurrency.

core/providers/vertex/vertex.go (1)

33-37: The utilities are actively used; the review claim is inaccurate.

The original review's assertion that vertexClientPool, getClientKey, and removeVertexClient are "not used" is incorrect. All three are actively utilized in the codebase:

  • removeVertexClient is called 3 times (lines 182, 358, 655) to clean up invalid clients on auth errors
  • getClientKey is used within removeVertexClient to generate pool keys
  • vertexClientPool is actively used with .Delete() operations for client eviction

However, the underlying issue has merit: the caching implementation is incomplete. The pool only performs deletions; there are no Store() or Load() operations for actual client caching. Either implement the missing caching logic or clarify that this pool is solely for cleanup management.

Likely an incorrect or invalid review comment.

@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch from 6d7517f to df11c92 Compare October 28, 2025 20:46
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from 99d48da to af44c0a Compare October 28, 2025 20:46
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
core/providers/cohere/cohere.go (1)

313-330: Replace direct client.Do() with MakeRequestWithContext() to properly handle context cancellation.

The streaming request at line 313 uses provider.client.Do(req, resp) directly, unlike non-streaming methods that use providerUtils.MakeRequestWithContext(ctx, ...). This means context cancellation is only detected after the request completes, potentially wasting resources if the client disconnects or the context times out.

Replace:

err = provider.client.Do(req, resp)

With:

_, err := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)

This pattern is already used in Anthropic's ResponsesStream method and all non-streaming Cohere methods, and ensures consistent context handling across the codebase.

core/providers/vertex/vertex.go (3)

162-178: Wrong error variable; return bifrostErr directly from MakeRequestWithContext.

Checks use err (token error) instead of the HTTP error. This masks real failures.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

336-353: Same bug: misuse of err after MakeRequestWithContext.

Return bifrostErr as-is.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

633-649: Same bug in Embedding: misuse of err after MakeRequestWithContext.

Return bifrostErr directly.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }
core/providers/anthropic/anthropic.go (1)

676-693: Wrong error variable after MakeRequestWithContext; return bifrostErr.

err refers to JSON marshal, not the HTTP request; this misreports failures.

- _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   defer fasthttp.ReleaseResponse(resp)
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   defer fasthttp.ReleaseResponse(resp)
+   return nil, bifrostErr
+ }
core/providers/openai/openai.go (1)

1331-1347: UAF: returning resp.Body() without copying frees memory after defer.

fasthttp reuses buffers; copy before returning.

- // Get the binary audio data from the response body
- audioData := resp.Body()
+ // Copy the binary audio data before releasing the response
+ audioData := append([]byte(nil), resp.Body()...)
♻️ Duplicate comments (8)
core/providers/openrouter.go (1)

35-37: MaxConnsPerHost value already flagged in previous review.

This concern about the excessive MaxConnsPerHost: 10000 value was already raised in the past review comments. Please address the previous feedback regarding reducing this value or documenting the justification.

core/providers/bedrock/signer.go (3)

200-217: Critical: req.Header.All() doesn't exist in fasthttp.

This code will not compile. fasthttp's RequestHeader doesn't have an All() method. Use VisitAll(func(k, v []byte)) instead.


193-197: Content-Length header can drift from actual request.

Including content-length in signed headers based on len(body) can cause signature mismatches if fasthttp sets a different value when transmitting the request.


242-259: Canonical query string uses form encoding instead of RFC 3986.

url.ParseQuery() and values.Encode() use application/x-www-form-urlencoded (space as '+'), and the ReplaceAll("+", "%20") workaround is fragile. AWS SigV4 requires RFC 3986 encoding where spaces are %20 and literal '+' is %2B. This breaks on edge cases like literal '+' in parameter values.

core/providers/cohere/cohere.go (1)

560-598: Same issues as ChatCompletionStream: redundant header setting and missing context.

This method has the same issues identified in ChatCompletionStream:

  1. Duplicate Content-Type header: Lines 568 and 572 both set Content-Type to application/json.
  2. Empty/spacing line: Line 570 should be cleaned up.
  3. Missing context in Do call: Line 581 uses provider.client.Do(req, resp) without context propagation (same concern as raised for ChatCompletionStream at line 313).

Apply similar fixes as suggested for ChatCompletionStream:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
-	req.Header.SetContentType("application/json")
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-		
+
 	// Set headers
 	req.Header.Set("Content-Type", "application/json")
core/providers/openai/openai.go (2)

355-356: Increase bufio.Scanner token limit for SSE across all OpenAI streaming handlers.

Prevents “token too long” on large JSON lines.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

(Apply in TextCompletionStream, ChatCompletionStream, ResponsesStream, SpeechStream, TranscriptionStream.)

Also applies to: 713-714, 1073-1074, 1433-1434, 1703-1704


1652-1667: TranscriptionStream: wrong Content-Type and duplicated header/extra-header setup.

You overwrite multipart with application/json and set headers twice.

- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
- req.Header.SetMethod(http.MethodPost)
- req.SetRequestURI(url)
- req.Header.SetContentType("application/json")
-
- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+ // Set method/URI first
+ req.Header.SetMethod(http.MethodPost)
+ req.SetRequestURI(url)
+ // Apply provider extra headers once
+ providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 
 // Set headers
 for key, value := range headers {
   req.Header.Set(key, value)
 }

This preserves writer.FormDataContentType() and removes duplicates.

core/providers/vertex/vertex.go (1)

145-149: ListModels should be GET, not POST.

Using POST for a read-only list can 405/400 and is inconsistent with API semantics. Remove Content-Type too.

- req.Header.SetMethod(http.MethodPost)
- req.SetRequestURI(requestURL)
- req.Header.SetContentType("application/json")
+ req.Header.SetMethod(http.MethodGet)
+ req.SetRequestURI(requestURL)
+ req.Header.Del("Content-Type")
🧹 Nitpick comments (11)
core/providers/openrouter.go (1)

218-218: LGTM! Good refactor to use dynamic provider keys.

Replacing hardcoded "openrouter" with string(provider.GetProviderKey()) across all unsupported operation errors improves maintainability and ensures consistency. This change follows the DRY principle and makes the code more resilient to future changes in provider identification.

Also applies to: 223-223, 228-228, 233-233, 238-238

Makefile (1)

174-338: CI detection logic is duplicated across test targets.

The same CI environment check [ -z "$$CI" ] && [ -z "$$GITHUB_ACTIONS" ] && ... is repeated in test, test-core, test-plugins, and test-all targets.

Consider extracting this to a shell function or Make variable to reduce duplication and make future updates easier.

core/providers/cerebras.go (1)

19-19: Consider updating the comment to reflect dual usage.

The comment states "HTTP client for API requests" but this client now handles both regular API requests and streaming operations after removing the separate streamClient.

Apply this diff to clarify the comment:

-	client              *fasthttp.Client      // HTTP client for API requests	
+	client              *fasthttp.Client      // HTTP client for API requests and streaming
core/providers/cohere/cohere.go (2)

47-49: Verify connection pool limits are intentional.

MaxConnsPerHost = 10000 is quite high and could lead to resource exhaustion under heavy load. While the PR indicates this is consistent with other providers, please confirm this value is appropriate for your production workload.


296-310: Remove redundant header setting and clean up empty comment.

Two issues found:

  1. Content-Type is set twice: line 298 uses SetContentType() and line 305 uses Set("Content-Type", ...). The second call overwrites the first, making line 298 redundant.
  2. Line 303 contains an empty comment line that should be removed.

Apply this diff:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
-	req.Header.SetContentType("application/json")
 
 	// Set any extra headers from network config
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
-	
+
 	// Set headers
 	req.Header.Set("Content-Type", "application/json")
core/providers/azure/azure.go (2)

35-41: Connection pool defaults are very aggressive; consider making them config-driven.

MaxConnsPerHost=10000, MaxIdleConnDuration=60s, MaxConnWaitTimeout=10s can exhaust FDs or create bursty load. Expose these via ProviderConfig (or a shared factory) so tests/prod can tune per environment.


151-154: Unnecessary Content-Type on GET.

For GET /models no body is sent; Content-Type can be omitted.

- req.Header.SetContentType("application/json")
+ // no Content-Type needed for GET without body
tests/core-providers/config/account.go (1)

16-16: Make test concurrency overridable via env.

Hard-coding 4 is fine for local, but CI/load may need tuning.

-const Concurrency = 4
+// Allows override via TEST_CONCURRENCY; falls back to 4
+var Concurrency = func() int {
+  if v := os.Getenv("TEST_CONCURRENCY"); v != "" {
+    if n, err := strconv.Atoi(v); err == nil && n > 0 {
+      return n
+    }
+  }
+  return 4
+}()

(Add: import strconv)

core/providers/vertex/vertex.go (1)

33-55: Remove stale vertexClientPool and related helpers (no longer populated).

Pool delete calls remain but nothing stores clients; rename to token cache or remove to avoid confusion.

core/providers/gemini/gemini.go (1)

377-383: Minor: duplicate Content-Type setting.

Header is set via SetContentType and again via Set("Content-Type"); keep one.

- req.Header.SetContentType("application/json")
- // Set headers for streaming
- req.Header.Set("Content-Type", "application/json")
+ // Set headers for streaming
+ req.Header.SetContentType("application/json")
core/providers/anthropic/anthropic.go (1)

463-466: Increase bufio.Scanner token limit for SSE.

Default 64KB can truncate long JSON lines; set a larger buffer (e.g., 10MB).

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

(Apply in both ChatCompletionStream helper and ResponsesStream.)

Also applies to: 719-724

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99d48da and af44c0a.

📒 Files selected for processing (20)
  • .gitignore (1 hunks)
  • Makefile (4 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/utils/utils.go
  • core/providers/parasail.go
  • scheams/go.mod
  • core/providers/bedrock/bedrock.go
🧰 Additional context used
🧬 Code graph analysis (8)
core/providers/sgl.go (1)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (281-292)
core/providers/gemini/gemini.go (2)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (328-339)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
core/providers/vertex/vertex.go (3)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/openai/openai.go (1)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • HandleProviderAPIError (207-227)
core/providers/anthropic/anthropic.go (1)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • ProcessAndSendBifrostError (442-472)
  • MakeRequestWithContext (36-91)
core/providers/cohere/cohere.go (2)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
🪛 checkmake (0.2.2)
Makefile

[warning] 174-174: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (17)
core/providers/openrouter.go (2)

92-92: LGTM!

The HandleProviderResponse call correctly passes resp.Body() as []byte, matching the expected signature.


137-137: LGTM! Streaming operations correctly unified to use fasthttp client.

All streaming methods (TextCompletionStream, ChatCompletionStream, ResponsesStream) now consistently use provider.client instead of a separate streamClient. This aligns with the PR's objective to consolidate HTTP operations under fasthttp and simplifies the provider architecture.

Also applies to: 172-172, 203-203

.gitignore (1)

38-39: LGTM!

The test-reports/ ignore entry correctly aligns with the new test reporting infrastructure introduced in the Makefile.

Makefile (2)

19-22: LGTM!

Test reporting configuration variables are well-chosen with sensible defaults.


63-83: LGTM!

The install targets properly handle CI environments and gracefully degrade when tools can't be installed, ensuring builds don't break.

core/providers/cerebras.go (3)

47-52: LGTM!

The constructor properly initializes the unified fasthttp client, successfully consolidating what were previously separate API and streaming clients.


88-88: LGTM!

The streaming methods correctly reference the unified provider.client instead of the removed streamClient, maintaining the same functionality while consolidating HTTP client usage.

Also applies to: 123-123


33-33: The MaxConnsPerHost value of 10,000 is consistent across all provider implementations. This was verified across all 13 provider files (Cerebras, OpenRouter, Mistral, OpenAI, Gemini, SGL, Parasail, Vertex, Anthropic, Cohere, Azure, Ollama, and Groq), confirming this is an intentional standardized configuration rather than an isolated issue.

However, the value of 10,000 concurrent connections per host is still exceptionally high and warrants careful consideration during architectural review to ensure the system can safely handle this capacity without resource exhaustion. If this was approved as part of the refactoring requirements, it should be documented and the resource implications understood.

core/providers/groq.go (2)

179-179: LGTM! Successful migration to unified fasthttp client.

The change to use provider.client instead of a separate streaming client successfully consolidates all HTTP operations to use fasthttp, aligning with the PR's objective to unify client usage across streaming and non-streaming operations.


33-35: The MaxConnsPerHost value is consistent across all providers and reflects an intentional design choice.

Verification shows that MaxConnsPerHost: 10000 (along with MaxIdleConnDuration: 60s and MaxConnWaitTimeout: 10s) is applied uniformly across all 13 provider files in the codebase, including groq.go, sgl.go, openai.go, anthropic.go, and others. This consistency indicates this is a deliberate, system-wide configuration for the fasthttp client implementation—not an anomaly. If there are concerns about connection resource management, those should be addressed at the system or configuration level rather than at the individual provider level.

core/providers/mistral/mistral.go (2)

33-33: No action required—this is a codebase-wide configuration decision, not a mistral-specific issue.

The MaxConnsPerHost value of 10000 is consistently applied across all 13 provider implementations as part of the coordinated refactor to a unified fasthttp client. This is an intentional architectural choice for the entire codebase, not a mistral-specific concern. While the value is higher than typical defaults, it represents a deliberate, uniform configuration strategy applied consistently across all providers.

Likely an incorrect or invalid review comment.


154-154: Verified: Streaming client change is correct and type-safe.

The change from streamClient to provider.client at line 154 is verified to be correct. MistralProvider's client field is typed as *fasthttp.Client, which matches the exact parameter type expected by HandleOpenAIChatCompletionStreaming. This change aligns with the PR's refactoring goal and is consistent with all other providers in the codebase (vertex, gemini, azure, groq, etc.), which already use this unified pattern.

core/providers/azure/azure.go (1)

275-287: LGTM: unified streaming via shared fasthttp client.

Consolidating to provider.client simplifies lifecycle and avoids dual clients.

core/providers/ollama.go (1)

94-106: LGTM: streaming now uses shared fasthttp client.

Removes redundant stream client and keeps a single code path.

Please run end-to-end streaming against Ollama to confirm SSE parsing with large chunks.

core/providers/sgl.go (2)

190-208: LGTM: unsupported error now uses provider key dynamically.

Improves consistency across providers.


101-113: LGTM: streaming via shared fasthttp client.

Consistent with the refactor goals.

core/providers/gemini/gemini.go (1)

420-429: LGTM: fixed response lifecycle for streaming.

Deferring ReleaseResponse in the goroutine avoids UAF with BodyStream.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
core/providers/vertex/vertex.go (4)

163-178: Incorrect error handling: using stale err instead of bifrostErr

On request failure you test the wrong variable; this will misclassify/collapse errors. Just return bifrostErr.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-     if errors.Is(err, context.Canceled) {
-         return nil, &schemas.BifrostError{ ... }
-     }
-     if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-         return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, providerName)
-     }
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+     return nil, bifrostErr
+ }

337-353: Same bug pattern: wrong variable in error handling

Use bifrostErr from MakeRequestWithContext; don't inspect earlier err.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-     if errors.Is(err, context.Canceled) { ... }
-     if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+     return nil, bifrostErr
+ }

516-519: Streaming auth for non-Claude should use OAuth token, not key.Value

Align with non-streaming path (TokenSource). Use Bearer token from getAuthTokenSource.

- authHeader := map[string]string{}
- if key.Value != "" {
-     authHeader["Authorization"] = "Bearer " + key.Value
- }
+ authHeader := map[string]string{}
+ tokenSource, err := getAuthTokenSource(key)
+ if err != nil {
+     return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
+ }
+ token, err := tokenSource.Token()
+ if err != nil {
+     return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
+ }
+ authHeader["Authorization"] = "Bearer " + token.AccessToken

633-649: Same error-handling bug in embeddings

Return bifrostErr directly; current code tests unrelated err.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-     if errors.Is(err, context.Canceled) { ... }
-     if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+     return nil, bifrostErr
+ }
core/providers/anthropic/anthropic.go (1)

676-693: Critical: Fix undefined/wrong error variable in ResponsesStream.

The error handling at lines 679, 689, and 692 references err, but this variable holds the stale error from sonic.Marshal at line 648, not the request error bifrostErr returned by MakeRequestWithContext at line 676. Additionally, MakeRequestWithContext already handles context cancellation and timeout errors internally (see core/providers/utils/utils.go lines 35-90), making these redundant checks unnecessary.

The past review comment flagged this same issue and was marked as addressed in commit 99d48da, but the fix was not applied to ResponsesStream.

Apply this diff to fix the error handling:

  _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
  if bifrostErr != nil {
    defer fasthttp.ReleaseResponse(resp)
-   if errors.Is(err, context.Canceled) {
-     return nil, &schemas.BifrostError{
-       IsBifrostError: false,
-       Error: &schemas.ErrorField{
-         Type:    schemas.Ptr(schemas.RequestCancelled),
-         Message: schemas.ErrRequestCancelled,
-         Error:   err,
-       },
-     }
-   }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-     return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-   }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+   return nil, bifrostErr
  }
♻️ Duplicate comments (11)
core/providers/openai/openai.go (7)

355-356: Increase bufio.Scanner max token size for SSE

Default 64KB truncates long JSON lines. Bump buffer.

-		scanner := bufio.NewScanner(resp.BodyStream())
+		scanner := bufio.NewScanner(resp.BodyStream())
+		buf := make([]byte, 0, 64*1024)
+		scanner.Buffer(buf, 10*1024*1024)

713-714: Increase Scanner buffer (chat streaming)

Prevent “token too long” on large chunks.

-		scanner := bufio.NewScanner(resp.BodyStream())
+		scanner := bufio.NewScanner(resp.BodyStream())
+		buf := make([]byte, 0, 64*1024)
+		scanner.Buffer(buf, 10*1024*1024)

1073-1074: Increase Scanner buffer (responses streaming)

Same fix as other streaming handlers.

-		scanner := bufio.NewScanner(resp.BodyStream())
+		scanner := bufio.NewScanner(resp.BodyStream())
+		buf := make([]byte, 0, 64*1024)
+		scanner.Buffer(buf, 10*1024*1024)

1433-1434: Increase Scanner buffer (speech streaming)

Same adjustment.

-		scanner := bufio.NewScanner(resp.BodyStream())
+		scanner := bufio.NewScanner(resp.BodyStream())
+		buf := make([]byte, 0, 64*1024)
+		scanner.Buffer(buf, 10*1024*1024)

1703-1704: Increase Scanner buffer (transcription streaming)

Same adjustment.

-		scanner := bufio.NewScanner(resp.BodyStream())
+		scanner := bufio.NewScanner(resp.BodyStream())
+		buf := make([]byte, 0, 64*1024)
+		scanner.Buffer(buf, 10*1024*1024)

1653-1665: Fix duplicate headers and wrong content-type in TranscriptionStream

Avoid setting JSON content-type for multipart; remove duplicate SetExtraHeaders.

-	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
-	req.Header.SetMethod(http.MethodPost)
-	req.SetRequestURI(url)
-	req.Header.SetContentType("application/json")
-
-	// Set any extra headers from network config
-	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+	req.Header.SetMethod(http.MethodPost)
+	req.SetRequestURI(url)

40-43: Lower MaxConnsPerHost from 10000

Adopt ~1024–2048 or config-driven; same rationale as other providers.

core/providers/parasail.go (1)

33-36: MaxConnsPerHost=10000 is excessive; lower or make configurable

This can exhaust FDs/memory and even degrade perf above a few thousand connex. Prefer 1000–2048 or a config-driven value with doc on tradeoffs. Apply consistently across providers.

core/providers/vertex/vertex.go (2)

72-75: MaxConnsPerHost too high

Reduce to ~1024–2048 or make configurable; matches concerns raised in other providers.


145-148: ListModels should use GET, not POST

Models listing is read-only; use GET and drop JSON content-type.

- req.Header.SetMethod(http.MethodPost)
- req.SetRequestURI(requestURL)
- req.Header.SetContentType("application/json")
+ req.Header.SetMethod(http.MethodGet)
+ req.SetRequestURI(requestURL)
core/providers/azure/azure.go (1)

39-41: Reduce MaxConnsPerHost from 10000

Adopt a safer default or make configurable; consistent with other providers.

🧹 Nitpick comments (7)
core/providers/cohere/cohere.go (1)

47-47: Verify the MaxConnsPerHost value.

The value of 10,000 concurrent connections per host is unusually high. Typical production configurations use 100-1,000. While this might be intentional for high-throughput scenarios, such a high limit could lead to resource exhaustion or performance issues under certain conditions.

Consider whether a lower value (e.g., 1000) would be more appropriate, or document why such a high limit is necessary for this use case.

tests/core-providers/config/account.go (1)

16-16: Make test concurrency adjustable via env for flexibility

Hard-coding 4 is fine, but allowing TEST_CONCURRENCY to override helps local stress/CI tuning without code edits. Example: read os.Getenv("TEST_CONCURRENCY") with strconv.Atoi fallback to 4.

Makefile (4)

19-23: TEST_REPORT_FORMAT is unused

Either wire it to control HTML generation (vs XML-only) or drop it to avoid confusion.


63-66: Use portable command detection

Prefer command -v over which for portability and fewer subshells.

-install-gotestsum: ## Install gotestsum for test reporting (if not already installed)
-	@which gotestsum > /dev/null || (echo "$(YELLOW)Installing gotestsum for test reporting...$(NC)" && go install gotest.tools/gotestsum@latest)
+install-gotestsum: ## Install gotestsum for test reporting (if not already installed)
+	@command -v gotestsum >/dev/null 2>&1 || (echo "$(YELLOW)Installing gotestsum for test reporting...$(NC)" && go install gotest.tools/gotestsum@latest)

68-83: Prefer command -v and add npx fallback for junit-viewer

Improves portability and avoids global npm install requirement.

-install-junit-viewer: ## Install junit-viewer for HTML report generation (if not already installed)
-	@if [ -z "$$CI" ] && [ -z "$$GITHUB_ACTIONS" ] && [ -z "$$GITLAB_CI" ] && [ -z "$$CIRCLECI" ] && [ -z "$$JENKINS_HOME" ]; then \
-		if which junit-viewer > /dev/null 2>&1; then \
+install-junit-viewer: ## Install junit-viewer for HTML report generation (if not already installed)
+	@if [ -z "$$CI" ] && [ -z "$$GITHUB_ACTIONS" ] && [ -z "$$GITLAB_CI" ] && [ -z "$$CIRCLECI" ] && [ -z "$$JENKINS_HOME" ]; then \
+		if command -v junit-viewer >/dev/null 2>&1; then \
 			echo "$(GREEN)junit-viewer is already installed$(NC)"; \
 		else \
 			echo "$(YELLOW)Installing junit-viewer for HTML reports...$(NC)"; \
-			if npm install -g junit-viewer 2>&1; then \
+			if npm install -g junit-viewer 2>&1 || npx --yes junit-viewer --version >/dev/null 2>&1; then \
 				echo "$(GREEN)junit-viewer installed successfully$(NC)"; \
 			else \
 				echo "$(RED)Failed to install junit-viewer. HTML reports will be skipped.$(NC)"; \
-				echo "$(YELLOW)You can install it manually: npm install -g junit-viewer$(NC)"; \
+				echo "$(YELLOW)You can install it manually: npm install -g junit-viewer (or use npx junit-viewer)$(NC)"; \
 				exit 0; \
 			fi; \
 		fi \
 	else \
 		echo "$(YELLOW)CI environment detected, skipping junit-viewer installation$(NC)"; \
 	fi

174-201: Target too long; centralize HTML conversion to a helper

checkmake flags test body length (26). Move HTML generation into generate-html-reports and call it here based on TEST_REPORT_FORMAT and non-CI. Also puts TEST_REPORT_FORMAT to use.

core/providers/vertex/vertex.go (1)

33-54: Remove stale client pool or repurpose

vertexClientPool/removeVertexClient are vestiges of http.Client caching and are now unused with TokenSource + single fasthttp.Client. Consider removing to reduce confusion.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99d48da and af44c0a.

📒 Files selected for processing (20)
  • .gitignore (1 hunks)
  • Makefile (4 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
✅ Files skipped from review due to trivial changes (1)
  • core/providers/bedrock/bedrock.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • scheams/go.mod
  • core/providers/cerebras.go
  • core/providers/bedrock/signer.go
  • core/providers/groq.go
  • core/providers/utils/utils.go
  • .gitignore
🧰 Additional context used
🧬 Code graph analysis (7)
core/providers/cohere/cohere.go (2)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/anthropic/anthropic.go (1)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
core/providers/gemini/gemini.go (2)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
core/providers/sgl.go (3)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (281-292)
core/schemas/provider.go (1)
  • PostHookRunner (194-194)
core/schemas/bifrost.go (1)
  • BifrostError (322-331)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/vertex/types.go (2)
  • VertexListModelsResponse (61-64)
  • VertexEmbeddingResponse (44-46)
core/providers/openai/openai.go (1)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • HandleProviderAPIError (207-227)
🪛 checkmake (0.2.2)
Makefile

[warning] 174-174: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (13)
core/providers/sgl.go (3)

56-56: LGTM! Unified client approach.

The removal of the separate streamClient field and use of a single fasthttp.Client for all operations aligns with the PR objectives and simplifies the provider architecture.


103-103: LGTM! Consistent streaming client usage.

Both streaming methods (TextCompletionStream and ChatCompletionStream) now correctly use provider.client instead of a separate stream client, which is consistent with the unified fasthttp approach.

Also applies to: 138-138


192-192: LGTM! Improved error message consistency.

Using string(provider.GetProviderKey()) instead of the hardcoded "sgl" string improves maintainability and ensures error messages remain consistent with the provider's actual key.

Also applies to: 197-197, 202-202, 207-207

core/providers/ollama.go (3)

93-106: LGTM! Unified client for streaming.

The change correctly uses the unified provider.client for streaming operations, eliminating the separate streamClient. The refactoring aligns with the PR objective of consolidating to fasthttp for all operations.


127-141: LGTM! Consistent streaming refactoring.

The streaming implementation correctly uses the unified provider.client, maintaining consistency with TextCompletionStream and the broader PR migration pattern.


31-37: Standardized configuration across all providers—verify architectural intent.

The configuration values in core/providers/ollama.go are not unique to this file. The script results show that MaxConnsPerHost: 10000 and the timeout configuration using DefaultRequestTimeoutInSeconds (30 seconds) are standardized across all 13 provider files in this PR.

This suggests an intentional system-wide refactoring rather than isolated changes. The fasthttp default is 512, so the 10-fold increase warrants clarification:

  1. MaxConnsPerHost: 10000 is substantially higher than typical production configurations shown in fasthttp guidance. Verify this standardization is intentional and appropriate for your deployment environment's resource constraints (file descriptors, memory).

  2. Streaming timeout (30s) affects all providers equally. Confirm this timeout is suitable for long-lived streaming operations across all providers using this unified client.

core/providers/cohere/cohere.go (3)

30-36: LGTM: Successful consolidation to single fasthttp client.

The removal of the separate streamClient field in favor of a single client *fasthttp.Client aligns with the PR objective and simplifies the provider's client management.


313-346: LGTM: Correct fasthttp streaming implementation.

The streaming implementation properly handles:

  • fasthttp-specific error types (fasthttp.ErrTimeout)
  • Resource management with appropriate ReleaseResponse calls in error paths and the streaming goroutine
  • Response body streaming with resp.BodyStream() and resp.StreamBody = true

581-614: LGTM: Consistent streaming implementation.

The ResponsesStream implementation correctly mirrors the ChatCompletionStream pattern with proper fasthttp error handling and resource management.

core/providers/mistral/mistral.go (1)

154-154: LGTM - Unified client with proper streaming timeout handling.

Verification confirms HandleOpenAIChatCompletionStreaming correctly handles streaming: ReadTimeout/WriteTimeout apply to the initial handshake, not body streaming, and context cancellation is explicitly checked before processing each chunk in the scanning loop. The implementation is sound.

core/providers/parasail.go (1)

111-111: Unified streaming client LGTM

Switching to provider.client for streaming aligns with the refactor.

core/providers/gemini/gemini.go (1)

367-373: Streaming lifecycle fix LGTM

Releasing fasthttp.Response inside the goroutine avoids UAF on BodyStream.

core/providers/anthropic/anthropic.go (1)

78-83: Verify the MaxConnsPerHost value of 10000.

The value of 10,000 is unusually high for MaxConnsPerHost. Typical values range from 100 to 1,000. While this may be intentional for high-throughput scenarios, please confirm that this value is appropriate for your expected load and won't cause resource exhaustion.

@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch from df11c92 to 22dbf88 Compare October 31, 2025 00:44
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from af44c0a to bb3b48d Compare October 31, 2025 00:44
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/vertex/vertex.go (1)

162-178: Fix error handling: we’re inspecting the wrong variable

When providerUtils.MakeRequestWithContext returns a bifrostErr, the code checks errors.Is(err, …) where err is still bound to the earlier token-source call. That’s always nil, so we drop the real failure (context cancel, timeout, etc.) and end up fabricating a generic ErrProviderRequest with no root cause. Please return bifrostErr (or interrogate bifrostErr.Error) instead of the stale err. Same issue repeats in the chat and embedding paths below—please fix there as well.

♻️ Duplicate comments (11)
core/providers/bedrock/signer.go (3)

193-197: Content-Length signing can drift from actual header.

This issue was previously flagged: You include "content-length" in signed headers from len(body), but the actual header is set by fasthttp when the body is attached (or may be absent for GET). If the request body changes after signing, signatures can break.

Only include content-length if it's already present on req.Header. Check req.Header.ContentLength() >= 0 and use the actual value from the request header.

Apply this diff:

-	// Include content-length if body is present
-	if len(body) > 0 {
-		headerNames = append(headerNames, "content-length")
-		headerMap["content-length"] = []string{strconv.Itoa(len(body))}
-	}
+	// Include content-length if already set on request
+	if cl := req.Header.ContentLength(); cl >= 0 {
+		headerNames = append(headerNames, "content-length")
+		headerMap["content-length"] = []string{strconv.Itoa(cl)}
+	}

200-217: fasthttp header iteration: req.Header.All() doesn't exist (compile-time error).

This issue was previously flagged: req.Header.All() is not a valid fasthttp method and will cause a compilation failure. Use req.Header.VisitAll instead.

Apply this diff:

-	// Collect other headers
-	for key, value := range req.Header.All() {
-		keyStr := strings.ToLower(string(key))
-
-		// Skip ignored headers
-		if _, ignore := ignoredHeaders[keyStr]; ignore {
-			continue
-		}
-
-		// Skip if already handled
-		if keyStr == "host" || keyStr == "content-length" {
-			continue
-		}
-
-		if _, exists := headerMap[keyStr]; !exists {
-			headerNames = append(headerNames, keyStr)
-		}
-		headerMap[keyStr] = append(headerMap[keyStr], string(value))
-	}
+	// Collect other headers
+	req.Header.VisitAll(func(k, v []byte) {
+		keyStr := strings.ToLower(string(k))
+		// Skip ignored headers
+		if _, ignore := ignoredHeaders[keyStr]; ignore {
+			return
+		}
+		// Skip if already handled
+		if keyStr == "host" || keyStr == "content-length" {
+			return
+		}
+		if _, exists := headerMap[keyStr]; !exists {
+			headerNames = append(headerNames, keyStr)
+		}
+		headerMap[keyStr] = append(headerMap[keyStr], string(v))
+	})

242-259: Canonical query string violates AWS SigV4 RFC 3986 encoding.

This issue was previously flagged: The current implementation uses url.ParseQuery() and values.Encode(), which apply x-www-form-urlencoded encoding (space as '+'), then attempts to patch with ReplaceAll("+", "%20"). This is fragile and breaks on edge cases—a literal '+' in a parameter value gets decoded to space, re-encoded as '+', then corrupted to '%20'.

AWS SigV4 requires RFC 3986 percent-encoding: space as %20, literal '+' as %2B, unreserved chars (A–Z a–z 0–9 - _ . ~) unencoded.

Rewrite with manual RFC 3986 encoding: split the query string on '&', percent-encode each name/value pair per RFC 3986, sort by encoded key (and by encoded value for duplicate names), then join with '&'. Add tests using AWS SigV4 canonicalization test vectors to validate.

core/providers/openrouter.go (1)

35-37: MaxConnsPerHost=10000 remains excessively high.

Past reviews identified this value as a resource exhaustion risk. The setting is duplicated across 13 providers. Consider reducing to 1000-2048 or making it configurable via ProviderConfig.

core/providers/sgl.go (1)

34-36: MaxConnsPerHost=10000 remains excessively high.

This value was specifically flagged in past reviews for this file. Consider reducing to 500-1000 or making it configurable.

core/providers/anthropic/anthropic.go (1)

78-82: MaxConnsPerHost=10000 is excessively high.

This value appears uniformly across all providers and was flagged in past reviews. Consider reducing to 1000-2048 or making it configurable.

core/providers/ollama.go (1)

34-36: MaxConnsPerHost=10000 is excessively high.

This value was flagged across all providers in past reviews. Consider reducing to 1000-2048 or making it configurable.

core/providers/mistral/mistral.go (1)

33-35: MaxConnsPerHost=10000 is excessively high.

Past reviews specifically flagged this file with detailed analysis, noting that fasthttp guidance suggests values of 1000 or more only for peak load with verified resources. This uniform 10000 setting across all 13 providers risks memory exhaustion and performance degradation. Consider reducing to 1000 or making it configurable via ProviderConfig.

core/providers/vertex/vertex.go (1)

517-519: Streaming auth still bypasses OAuth

Non-Claude streaming requests still send Authorization: Bearer using key.Value, but the Vertex OpenAI-compatible endpoints only accept short-lived OAuth access tokens (for example, gcloud auth print-access-token).(docs.cloud.google.com)

core/providers/openai/openai.go (2)

355-355: bufio.Scanner 64KB token limit risks truncated/failed SSE processing.

Long SSE lines (JSON chunks) can exceed 64KB, causing "token too long" errors. This applies to all streaming handlers in this file.

Apply a larger buffer after creating the scanner at lines 355, 713, 1073, 1433, and 1703:

  scanner := bufio.NewScanner(resp.BodyStream())
+ // Raise max token size to 10MB to accommodate long SSE lines
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

Alternatively, replace scanner with bufio.Reader and ReadBytes('\n') for more control.


1653-1667: TranscriptionStream: Duplicate SetExtraHeaders and incorrect Content-Type overwrite.

Line 1657 sets Content-Type to "application/json", overwriting the multipart content type. Line 1660 duplicates the SetExtraHeaders call from line 1653. Though the final header loop restores the correct multipart type, the intermediate overwrite is incorrect and confusing.

Apply this diff to fix:

+ // Set method/URI first
+ req.Header.SetMethod(http.MethodPost)
+ req.SetRequestURI(url)
+ 
  // Set any extra headers from network config
  providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
- 
- req.Header.SetMethod(http.MethodPost)
- req.SetRequestURI(url)
- req.Header.SetContentType("application/json")
- 
- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
  
  // Set headers
  for key, value := range headers {
      req.Header.Set(key, value)
  }

This preserves the multipart Content-Type from headers and eliminates duplication.

🧹 Nitpick comments (5)
tests/core-providers/config/account.go (1)

16-16: Excellent refactor - centralized test concurrency control.

Introducing the Concurrency constant eliminates duplication and provides a single tuning point for test parallelism. The value of 4 is reasonable for test stability while maintaining adequate concurrency coverage.

core/providers/openrouter.go (1)

218-218: Improved error messages with dynamic provider key.

Replacing hardcoded "openrouter" with string(provider.GetProviderKey()) ensures error messages reflect the actual provider identity, including custom provider configurations.

Also applies to: 223-223, 228-228, 233-233, 238-238

core/providers/sgl.go (1)

192-192: Good improvement - dynamic provider key in error messages.

Error messages now use string(provider.GetProviderKey()) instead of the hardcoded "sgl" string.

Also applies to: 197-197, 202-202, 207-207

Makefile (2)

62-172: Excellent test infrastructure additions.

The new test reporting targets provide:

  • Automated tool installation (gotestsum, junit-viewer)
  • Conditional HTML report generation (local dev only)
  • Clear error messages and usage guidance
  • Proper CI environment detection

173-337: Test targets significantly enhanced with filtering and reporting.

The updated test targets provide:

  • Per-provider and per-testcase test execution (PROVIDER=anthropic TESTCASE=SimpleChat)
  • Automatic JUnit XML report generation
  • HTML reports for local development
  • Comprehensive test-all summary with index

The static analysis warning about body length is a false positive—the complexity is justified for comprehensive test orchestration.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af44c0a and bb3b48d.

📒 Files selected for processing (20)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/utils/utils.go
  • core/providers/groq.go
  • scheams/go.mod
  • core/providers/cerebras.go
🧰 Additional context used
🧬 Code graph analysis (9)
core/providers/bedrock/bedrock.go (2)
core/schemas/bifrost.go (1)
  • RequestCancelled (282-282)
core/schemas/provider.go (1)
  • ErrRequestCancelled (23-23)
core/providers/gemini/gemini.go (2)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (328-339)
core/providers/cohere/cohere.go (2)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/vertex/vertex.go (3)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/providers/vertex/types.go (2)
  • VertexListModelsResponse (61-64)
  • VertexEmbeddingResponse (44-46)
core/providers/openai/openai.go (1)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • HandleProviderAPIError (207-227)
core/providers/anthropic/anthropic.go (3)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/anthropic/types.go (1)
  • AnthropicError (315-321)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • ProcessAndSendBifrostError (442-472)
  • MakeRequestWithContext (36-91)
core/providers/sgl.go (1)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (281-292)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
🪛 checkmake (0.2.2)
Makefile

[warning] 173-173: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (17)
core/providers/bedrock/signer.go (4)

1-54: LGTM: Clean setup for AWS SigV4 signing.

The package structure, imports, constants, and caching mechanism are well-designed. The signing key cache with double-check locking is a good optimization for repeated requests.


56-125: LGTM: Helper functions correctly implement SigV4 spec.

The HMAC, key derivation, and space-stripping utilities follow AWS Signature Version 4 requirements. The caching logic with double-check locking prevents race conditions while optimizing performance.


127-183: LGTM: Credential loading and request setup are correct.

The AWS credential resolution (with fallback to default config), time formatting, path escaping via httpbinding.EscapePath, payload hashing, and required header setup all follow AWS SigV4 requirements correctly.


261-304: LGTM: Signature calculation and authorization header are correct.

The canonical request construction, credential scope, string-to-sign derivation, and final HMAC-SHA256 signature calculation all follow the AWS Signature Version 4 specification correctly. The Authorization header format matches AWS requirements.

core/providers/bedrock/bedrock.go (1)

257-266: LGTM: Formatting adjustments with no functional changes.

The indentation and spacing changes in the error handling path are purely cosmetic and maintain the existing error propagation logic for cancelled requests.

.gitignore (1)

38-40: LGTM - proper test artifact exclusion.

The test-reports/ directory exclusion aligns with the new test reporting infrastructure added in the Makefile.

core/providers/openrouter.go (1)

137-137: Streaming correctly migrated to shared fasthttp client.

The streaming methods now use provider.client instead of a dedicated stream client, successfully unifying HTTP operations under fasthttp.

Also applies to: 172-172, 203-203

core/providers/sgl.go (1)

103-103: LGTM - streaming unified under fasthttp client.

Streaming methods correctly use provider.client instead of a separate stream client.

Also applies to: 138-138

core/providers/anthropic/anthropic.go (3)

116-123: LGTM - error parser correctly migrated to fasthttp.

The parseStreamAnthropicError function properly handles fasthttp.Response with appropriate status code and body access patterns.


386-578: Streaming implementation correctly migrated to fasthttp.

The chat completion streaming properly:

  • Acquires/releases fasthttp resources
  • Defers response release inside the streaming goroutine (avoids use-after-free)
  • Uses resp.BodyStream() for SSE parsing
  • Handles errors with fasthttp-specific types

Past critical issues were addressed in commit 99d48da.


627-827: ResponsesStream properly implements fasthttp streaming.

The responses streaming follows the same correct patterns as chat streaming, with appropriate resource management and error handling.

core/providers/ollama.go (1)

96-96: LGTM - streaming unified under fasthttp client.

Both streaming methods correctly use provider.client following the migration pattern.

Also applies to: 131-131

core/providers/mistral/mistral.go (1)

154-154: LGTM - streaming correctly migrated to fasthttp client.

The streaming method properly uses provider.client following the unified client pattern.

core/providers/openai/openai.go (4)

38-43: LGTM: Appropriate fasthttp client configuration for streaming.

The timeout and connection pool settings are well-suited for high-concurrency streaming operations.


366-370: LGTM: Context cancellation checks improve streaming responsiveness.

The context checks at lines 366-370, 725-729, 1080-1084, 1441-1445, and 1711-1715 allow streaming handlers to exit promptly when the client cancels the request.


336-336: LGTM: Consistent fasthttp error handling across all streaming handlers.

The error handling at lines 323-340, 680-697, 1040-1057, 1400-1417, and 1670-1687 correctly checks for fasthttp.ErrTimeout and other error conditions with appropriate error wrapping.


303-306: LGTM: Proper fasthttp request/response lifecycle management.

All streaming handlers correctly acquire/release requests and responses, with proper deferred cleanup in both success and error paths (lines 303-306, 353, 660-663, 711, 1020-1023, 1071, 1370-1374, 1431, 1645-1648, 1701).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
core/providers/anthropic/anthropic.go (1)

676-693: Undefined variable err referenced after MakeRequestWithContext.

Lines 679 and 689 check errors.Is(err, ...) but MakeRequestWithContext at line 676 assigns to bifrostErr, not err. This will fail to compile.

Apply this diff:

 	_, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
 		defer fasthttp.ReleaseResponse(resp)
-		if errors.Is(err, context.Canceled) {
+		if errors.Is(bifrostErr.Error.Error, context.Canceled) {
 			return nil, &schemas.BifrostError{
 				IsBifrostError: false,
 				Error: &schemas.ErrorField{
 					Type:    schemas.Ptr(schemas.RequestCancelled),
 					Message: schemas.ErrRequestCancelled,
-					Error:   err,
+					Error:   bifrostErr.Error.Error,
 				},
 			}
 		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
+		return nil, bifrostErr
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
 	}

Actually, the simplest fix is to return bifrostErr directly since MakeRequestWithContext already wraps timeout/cancel errors appropriately:

 	_, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
 		defer fasthttp.ReleaseResponse(resp)
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+		return nil, bifrostErr
 	}
core/providers/vertex/vertex.go (3)

162-178: Undefined variable err referenced after MakeRequestWithContext.

Lines 164, 174 check errors.Is(err, ...) but line 162 assigns to bifrostErr, not err. This causes a compile error.

Apply this diff:

 	latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, providerName)
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
+		return nil, bifrostErr
 	}

MakeRequestWithContext already handles timeout and cancellation errors, so simply return bifrostErr.


337-353: Undefined variable err in ChatCompletion error handling.

Lines 339, 349 reference undefined err after line 337 assigns to bifrostErr.

Apply this diff:

 	latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+		return nil, bifrostErr
 	}

633-649: Undefined variable err in handleVertexEmbedding error handling.

Lines 635, 645 reference undefined err after line 633 assigns to bifrostErr.

Apply this diff:

 	latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+		return nil, bifrostErr
 	}
-
-	
♻️ Duplicate comments (13)
core/providers/openrouter.go (1)

35-37: MaxConnsPerHost value already flagged in previous reviews.

This concern about the excessive MaxConnsPerHost: 10000 value has already been raised in prior review comments, which recommend reducing it to a safer default (1024-2048) or making it configurable across all providers.

core/providers/bedrock/signer.go (3)

200-217: Critical: fasthttp.RequestHeader.All() doesn't exist - compilation will fail.

This code references req.Header.All() which is not a valid method on fasthttp.RequestHeader. This is a compile-time error that will prevent the code from building.

Replace with fasthttp.RequestHeader.VisitAll per the previous review comment:

-    // Collect other headers
-    for key, value := range req.Header.All() {
-        keyStr := strings.ToLower(string(key))
-
-        // Skip ignored headers
-        if _, ignore := ignoredHeaders[keyStr]; ignore {
-            continue
-        }
-
-        // Skip if already handled
-        if keyStr == "host" || keyStr == "content-length" {
-            continue
-        }
-
-        if _, exists := headerMap[keyStr]; !exists {
-            headerNames = append(headerNames, keyStr)
-        }
-        headerMap[keyStr] = append(headerMap[keyStr], string(value))
-    }
+    // Collect other headers
+    req.Header.VisitAll(func(k, v []byte) {
+        keyStr := strings.ToLower(string(k))
+        // Skip ignored headers
+        if _, ignore := ignoredHeaders[keyStr]; ignore {
+            return
+        }
+        // Skip if already handled
+        if keyStr == "host" || keyStr == "content-length" {
+            return
+        }
+        if _, exists := headerMap[keyStr]; !exists {
+            headerNames = append(headerNames, keyStr)
+        }
+        headerMap[keyStr] = append(headerMap[keyStr], string(v))
+    })

193-197: Content-Length header may drift from signed value, breaking signatures.

The code signs content-length using len(body), but fasthttp may set the actual Content-Length header to a different value when the request is sent (or omit it for certain request types). This mismatch between signed and actual header values will cause AWS to reject the request with a signature error.

Only include content-length in signed headers if fasthttp has already set it:

-    // Include content-length if body is present
-    if len(body) > 0 {
-        headerNames = append(headerNames, "content-length")
-        headerMap["content-length"] = []string{strconv.Itoa(len(body))}
-    }
+    // Include content-length if already set by fasthttp
+    if cl := req.Header.ContentLength(); cl >= 0 {
+        headerNames = append(headerNames, "content-length")
+        headerMap["content-length"] = []string{strconv.Itoa(cl)}
+    }

242-259: Canonical query string encoding violates AWS SigV4 spec and corrupts data.

The current implementation conflates x-www-form-urlencoded encoding (space as '+') with RFC 3986 encoding (space as '%20'). url.ParseQuery() decodes '+' as space; values.Encode() re-encodes space as '+'; ReplaceAll("+", "%20") attempts to patch this but corrupts literal '+' characters in parameter values. For example, a parameter with value "a+b" gets decoded to "a b", re-encoded as "a+b", then incorrectly transformed to "a%20b".

AWS SigV4 requires RFC 3986 percent-encoding where each query parameter name and value are encoded separately (space as %20, unreserved chars A–Z a–z 0–9 - _ . ~ preserved, everything else as %HH with uppercase hex), then sorted lexicographically by encoded name.

Replace with manual RFC 3986 encoding: split the query string on '&', for each pair split on first '=' to get name/value, percent-encode both per RFC 3986, sort by encoded name (and by encoded value for duplicate names), join with '&'. Add unit tests using AWS SigV4 canonicalization test vectors to validate correct handling of spaces, literal '+', and other special characters.

core/providers/cohere/cohere.go (2)

296-310: Remove duplicate Content-Type header setting.

The Content-Type header is set twice: once at line 298 using SetContentType and again at line 305 using Set. The second call is redundant.

Apply this diff to remove the duplicate:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
 	req.Header.SetContentType("application/json")
 
 	// Set any extra headers from network config
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 
 	
 	// Set headers
-	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key.Value)
 	req.Header.Set("Accept", "text/event-stream")
 	req.Header.Set("Cache-Control", "no-cache")

566-577: Remove duplicate Content-Type header setting.

Similar to the issue in ChatCompletionStream, the Content-Type header is set twice: once at line 568 using SetContentType and again at line 572 using Set. The second call is redundant.

Apply this diff to remove the duplicate:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
 	req.Header.SetContentType("application/json")
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 		
 	// Set headers
-	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key.Value)
 	req.Header.Set("Accept", "text/event-stream")
 	req.Header.Set("Cache-Control", "no-cache")
core/providers/sgl.go (1)

34-36: MaxConnsPerHost=10000 is excessively high; reduce to 500-1000.

Setting MaxConnsPerHost to 10,000 risks resource exhaustion. Production values typically range from 100-500, rarely exceeding 1,000.

Apply this diff:

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,

Make this configurable via config.NetworkConfig so operators can tune it per deployment.

core/providers/gemini/gemini.go (1)

37-39: MaxConnsPerHost=10000 is excessively high; reduce to 500-1000.

A limit of 10,000 connections per host can overwhelm both client and server. Production systems typically use 100-500, with high-throughput scenarios rarely exceeding 1,000.

Apply this diff:

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,

Make this tunable via config.NetworkConfig for per-deployment optimization.

core/providers/anthropic/anthropic.go (1)

78-82: MaxConnsPerHost=10000 is excessively high; reduce to 500-1000.

A limit of 10,000 connections per host can cause resource exhaustion. Typical production ranges are 100-500, with high-load systems rarely exceeding 1,000.

Apply this diff:

-		MaxConnsPerHost:     10000,		
+		MaxConnsPerHost:     500,

Consider exposing this as a tunable parameter in config.NetworkConfig.

core/providers/vertex/vertex.go (2)

69-75: MaxConnsPerHost=10000 is excessively high; reduce to 500-1000.

Setting 10,000 connections per host can exhaust resources. Production deployments typically use 100-500, rarely exceeding 1,000.

Apply this diff:

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,

Make this configurable via config.NetworkConfig for per-deployment tuning.


145-148: ListModels should use GET, not POST.

Line 145 sets the HTTP method to POST for a models list endpoint, which is a read operation. This will likely result in 405 Method Not Allowed or 400 Bad Request.

Apply this diff:

-	req.Header.SetMethod(http.MethodPost)
+	req.Header.SetMethod(http.MethodGet)
 	req.SetRequestURI(requestURL)
-	req.Header.SetContentType("application/json")

Remove the JSON Content-Type header since there's no request body for a GET request.

core/providers/openai/openai.go (2)

38-42: MaxConnsPerHost=10000 is excessively high; reduce to 500-1000.

A value of 10,000 connections per host can overwhelm resources on both client and server. Production systems typically use 100-500, with high-throughput scenarios rarely exceeding 1,000.

Apply this diff:

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,

Consider making this configurable via config.NetworkConfig for deployment-specific tuning.


1653-1667: Duplicate SetExtraHeaders and overwriting multipart Content-Type.

Lines 1653, 1660 call SetExtraHeaders twice, and line 1657 sets Content-Type to application/json, overwriting the multipart content type from writer.FormDataContentType() set in the headers map at line 1638.

Apply this diff:

-	// Set any extra headers from network config
-	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
-	req.Header.SetContentType("application/json")
-
-	// Set any extra headers from network config
+
+	// Set any extra headers from network config
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 
 	// Set headers
 	for key, value := range headers {
 		req.Header.Set(key, value)
 	}

This removes the duplicate SetExtraHeaders call and the incorrect SetContentType that was overwriting the multipart boundary.

🧹 Nitpick comments (7)
Makefile (1)

201-287: Rich test-core functionality, but complexity warrants attention.

The refactored test-core target adds comprehensive per-provider and per-testcase reporting capabilities with HTML generation. However, the nested conditionals and repeated logic patterns (especially the CI environment checks and junit-viewer invocations) make this target difficult to maintain and test. Once the path issues above are fixed, consider extracting repeated shell logic into helper functions or dedicated shell scripts to improve clarity.

After fixing the path issues, test the HTML generation locally in both non-CI and CI environments to confirm correct behavior.

core/providers/ollama.go (1)

39-42: Remove commented-out code.

The commented-out response pool pre-warming code should either be uncommented if needed or removed entirely. Dead code reduces readability and can cause confusion. If this functionality is planned for future implementation, consider adding a TODO comment explaining why it's deferred.

-	// // Pre-warm response pools
-	// for range config.ConcurrencyAndBufferSize.Concurrency {
-	// 	ollamaResponsePool.Put(&schemas.BifrostResponse{})
-	// }
-
core/providers/bedrock/signer.go (1)

244-244: Consider handling url.ParseQuery errors for debugging.

The error from url.ParseQuery is silently ignored. While malformed query strings are rare, logging or propagating this error would aid debugging signature mismatches.

-        values, _ := url.ParseQuery(queryString)
+        values, err := url.ParseQuery(queryString)
+        if err != nil {
+            return providerUtils.NewBifrostOperationError("failed to parse query string", err, providerName)
+        }

Note: This suggestion will become moot once the canonical query string logic is rewritten to use RFC 3986 encoding as recommended in the previous comment.

core/providers/parasail.go (1)

38-38: Remove extraneous empty line.

Line 38 is an empty line with trailing whitespace or unnecessary spacing.

Apply this diff:

-	
+
core/providers/sgl.go (1)

56-56: Remove trailing whitespace.

Line 56 has trailing tabs or spaces after the client assignment.

Apply this diff:

-		client:              client,		
+		client:              client,
core/providers/vertex/vertex.go (2)

628-631: Remove duplicate comment and empty lines.

Lines 628-631 contain duplicate "Set any extra headers" comments and extraneous blank lines.

Apply this diff:

 	req.SetBody(jsonBody)
-
-	
-	// Set any extra headers from network config
-	
-

651-651: Remove extraneous empty line.

Apply this diff:

-	
 	if resp.StatusCode() != fasthttp.StatusOK {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af44c0a and bb3b48d.

📒 Files selected for processing (20)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
✅ Files skipped from review due to trivial changes (2)
  • core/providers/utils/utils.go
  • core/providers/bedrock/bedrock.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • core/providers/groq.go
  • core/providers/azure/azure.go
  • core/providers/mistral/mistral.go
  • scheams/go.mod
  • tests/core-providers/config/account.go
  • core/providers/cerebras.go
🧰 Additional context used
🧬 Code graph analysis (8)
core/providers/sgl.go (1)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (281-292)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (328-339)
core/providers/gemini/gemini.go (2)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/cohere/cohere.go (2)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/anthropic/anthropic.go (2)
core/providers/anthropic/types.go (1)
  • AnthropicError (315-321)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • ProcessAndSendBifrostError (442-472)
  • MakeRequestWithContext (36-91)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/anthropic/anthropic.go (1)
  • HandleAnthropicChatCompletionStreaming (385-578)
core/providers/openai/openai.go (2)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • HandleProviderAPIError (207-227)
🪛 checkmake (0.2.2)
Makefile

[warning] 173-173: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (10)
.gitignore (1)

38-40: Ignore entry for test reporting is well-placed.

The addition of test-reports/ aligns with the test reporting infrastructure introduced in the Makefile changes.

Makefile (3)

37-43: Help text properly documents test reporting configuration.

The new documentation for TEST_REPORTS_DIR and GOTESTSUM_FORMAT is clear and helpful for users discovering the test reporting capabilities.


62-65: Installation targets have appropriate fallback logic.

The install-gotestsum target is straightforward. The install-junit-viewer target properly detects CI environments and gracefully skips/continues on installation failures, allowing the workflow to proceed without HTML generation when the tool isn't available.

Also applies to: 66-82


173-199: Test reporting workflow properly handles CI vs. local environments.

The test target correctly uses gotestsum for structured reporting and conditionally generates HTML reports only outside CI environments. The fallback to JUnit XML when junit-viewer is unavailable is user-friendly.

core/providers/openrouter.go (3)

92-92: LGTM: Minor formatting cleanup.

The extra space removal improves code consistency.


137-137: LGTM: Streaming methods successfully migrated to unified fasthttp client.

The migration from streamClient to provider.client across all three streaming methods (TextCompletionStream, ChatCompletionStream, and ResponsesStream) is correct and aligns with the PR objective to consolidate on a single fasthttp client for both regular and streaming operations.

Also applies to: 172-172, 203-203


218-218: LGTM: Dynamic provider key improves maintainability.

Replacing the hardcoded "openrouter" string with string(provider.GetProviderKey()) across all unsupported operation errors eliminates duplication and ensures consistency. This refactor aligns with similar changes made across other providers.

Also applies to: 223-223, 228-228, 233-233, 238-238

core/providers/ollama.go (2)

96-96: Verify streaming functionality with unified fasthttp client.

The migration from separate streamClient to the unified provider.client (fasthttp) for streaming operations looks correct. However, ensure that:

  1. The ReadTimeout and WriteTimeout configured on the client (lines 32-33) are appropriate for long-lived streaming connections.
  2. Streaming responses are not prematurely terminated due to timeout settings.
  3. The fasthttp client properly handles chunked transfer encoding and Server-Sent Events (SSE) for streaming.

Consider adding integration tests that verify streaming behavior, particularly for long-running streams that exceed typical request durations.

Also applies to: 131-131


34-36: The review comment should be disregarded.

The flagged MaxConnsPerHost: 10000 configuration is not unique to the ollama.go change—it is a codebase-wide standard applied consistently across all 13 provider implementations (Azure, Vertex, OpenAI, Anthropic, Mistral, Gemini, Cohere, Cerebras, Groq, OpenRouter, Parasail, SGL, and Ollama). This indicates an intentional, uniform design choice for the codebase, not a novel or problematic pattern introduced by this change. The original review comment's framing assumes this configuration is specific to or problematic within ollama.go, which is incorrect given the broader context.

Likely an incorrect or invalid review comment.

core/providers/cohere/cohere.go (1)

47-49: Confirm the intentionality of MaxConnsPerHost: 10000 system-wide.

This value of 10000 is consistently applied across all 13 provider implementations (vertex, openai, sgl, parasail, openrouter, ollama, groq, mistral, gemini, cerebras, azure, cohere, anthropic), suggesting a deliberate architectural choice. However, no documentation explains the rationale for this high connection limit.

Please confirm:

  • Whether 10000 is intentionally high for expected throughput/load scenarios
  • Whether this warrants adding configuration options to tune per provider or environment

@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from bb3b48d to a469bc6 Compare October 31, 2025 02:26
@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch from 22dbf88 to e585c06 Compare October 31, 2025 02:26
@coderabbitai coderabbitai bot requested a review from danpiths October 31, 2025 02:42
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (8)
Makefile (1)

228-228: Fix incorrect relative paths in junit-viewer invocations.

The junit-viewer commands use incorrect relative paths (../../ or ../). After each cd && command; completes, the shell returns to the root directory, making these relative paths incorrect. Compare with line 322 in test-all, which correctly uses $(TEST_REPORTS_DIR) directly.

Apply these fixes:

# Line 228
-					junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \

# Line 249
-				junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \
+				junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \

# Line 303
-					junit-viewer --results=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \

Also applies to: 249-249, 303-303

core/providers/parasail.go (1)

33-35: 10 000 connections per host is still too high—please dial it back.

Previous feedback noted that this cap should be closer to 500–1000 (or configurable) to avoid exhausting resources; the literal 10 000 is still here. Please adjust and propagate across the shared fasthttp client setup.

core/providers/openrouter.go (1)

35-37: Revisit the 10 000 per-host connection limit.

We already highlighted that 10 000 is an unsafe default for fasthttp; please reduce it to a reasonable ceiling (≈512–1024) or make it configurable, and align the rest of the providers in this PR.

core/providers/anthropic/anthropic.go (1)

676-692: Undefined variable 'err' causes compile failure.

Line 676 assigns bifrostErr from MakeRequestWithContext, but lines 679 and 689 reference undefined variable err. Replace all err references with bifrostErr, or simply return bifrostErr directly as suggested in the previous review.

Apply this diff:

 _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 if bifrostErr != nil {
   defer fasthttp.ReleaseResponse(resp)
-  if errors.Is(err, context.Canceled) {
-    return nil, &schemas.BifrostError{
-      IsBifrostError: false,
-      Error: &schemas.ErrorField{
-        Type:    schemas.Ptr(schemas.RequestCancelled),
-        Message: schemas.ErrRequestCancelled,
-        Error:   err,
-      },
-    }
-  }
-  if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-    return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-  }
-  return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+  return nil, bifrostErr
 }
core/providers/cohere/cohere.go (2)

296-310: Duplicate Content-Type header setting still present.

Line 298 calls SetContentType("application/json") and line 305 calls Set("Content-Type", "application/json"). Remove the duplicate at line 305.


566-577: Duplicate Content-Type header setting still present.

Line 568 calls SetContentType("application/json") and line 572 calls Set("Content-Type", "application/json"). Remove the duplicate at line 572.

core/providers/vertex/vertex.go (1)

145-148: ListModels should use GET, not POST.

The Vertex AI models.list endpoint requires HTTP GET with no body. Using POST will likely result in 405 Method Not Allowed or 400 Bad Request.

Apply this diff:

-req.Header.SetMethod(http.MethodPost)
+req.Header.SetMethod(http.MethodGet)
 req.SetRequestURI(requestURL)
-req.Header.SetContentType("application/json")
+// No Content-Type needed for GET request
core/providers/openai/openai.go (1)

1653-1667: Duplicate header setting and wrong Content-Type overwrite multipart boundary.

This issue was flagged in a past review but still appears in the code:

  1. SetExtraHeaders is called twice (lines 1653 and 1660)
  2. SetContentType("application/json") at line 1657 overwrites the multipart Content-Type with boundary from writer.FormDataContentType()
  3. Headers loop is duplicated

Apply this diff to fix:

-	// Set any extra headers from network config
-	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
-	req.Header.SetMethod(http.MethodPost)
-	req.SetRequestURI(url)
-	req.Header.SetContentType("application/json")
-
-	// Set any extra headers from network config
-	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
-	// Set headers
-	for key, value := range headers {
-		req.Header.Set(key, value)
-	}
+	// Set method and URI first
+	req.Header.SetMethod(http.MethodPost)
+	req.SetRequestURI(url)
+	
+	// Apply provider extra headers once
+	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+	
+	// Apply multipart/event-stream/auth headers (includes correct Content-Type with boundary)
+	for key, value := range headers {
+		req.Header.Set(key, value)
+	}

This preserves the multipart Content-Type from writer.FormDataContentType() and removes duplicate calls.

🧹 Nitpick comments (4)
Makefile (1)

173-199: Consider refactoring the test target for brevity.

The static analysis tool (checkmake) flags that the test target body exceeds the recommended maximum of 5 lines (currently 26). While the implementation is correct, consider extracting the HTML report generation logic into a helper function or separate target to improve readability and reduce complexity.

This is a style preference and not blocking, but extracting something like:

generate-test-html: ## Generate HTML report for bifrost-http tests
	@if [ -z "$$CI" ] && which junit-viewer > /dev/null 2>&1; then \
		junit-viewer --results=$(TEST_REPORTS_DIR)/bifrost-http.xml --save=$(TEST_REPORTS_DIR)/bifrost-http.html 2>/dev/null && \
		echo "$(CYAN)HTML report: $(TEST_REPORTS_DIR)/bifrost-http.html$(NC)"; \
	fi

would reduce duplication and make the test target more maintainable.

core/providers/gemini/gemini.go (1)

425-428: Consider larger scanner buffer for large audio chunks.

The default bufio.Scanner uses a 64KB token limit. For audio streaming with large chunks (buffer already increased to 10MB max), this is likely adequate. However, if "token too long" errors occur in production, consider using bufio.Reader with ReadBytes('\n') instead.

core/providers/vertex/vertex.go (1)

628-631: Remove redundant empty lines and comments.

Lines 628-631 contain only whitespace and a redundant comment without actual code. This appears to be leftover from refactoring.

Apply this diff:

 req.SetBody(jsonBody)
 
-	
-	// Set any extra headers from network config
-	
-
 	// Make the request
core/providers/openai/openai.go (1)

355-355: Consider increasing scanner buffer for large SSE chunks.

Per past review, the default bufio.Scanner 64KB token limit can truncate long SSE JSON lines. For text completion streaming, this is less critical but worth monitoring.

If "token too long" errors occur, apply this pattern:

 scanner := bufio.NewScanner(resp.BodyStream())
+buf := make([]byte, 0, 64*1024)
+scanner.Buffer(buf, 10*1024*1024)

Based on learnings

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bb3b48d and a469bc6.

📒 Files selected for processing (20)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/sgl.go
  • core/providers/utils/utils.go
  • tests/core-providers/config/account.go
  • core/providers/mistral/mistral.go
🧰 Additional context used
🧬 Code graph analysis (8)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (328-339)
core/providers/bedrock/bedrock.go (2)
core/schemas/bifrost.go (1)
  • RequestCancelled (282-282)
core/schemas/provider.go (1)
  • ErrRequestCancelled (23-23)
core/providers/cohere/cohere.go (2)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/vertex/types.go (2)
  • VertexListModelsResponse (61-64)
  • VertexEmbeddingResponse (44-46)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
core/providers/anthropic/anthropic.go (2)
core/providers/anthropic/types.go (1)
  • AnthropicError (315-321)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
core/providers/gemini/gemini.go (3)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/openai/openai.go (2)
core/schemas/provider.go (1)
  • DefaultRequestTimeoutInSeconds (14-14)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • HandleProviderAPIError (207-227)
🪛 checkmake (0.2.2)
Makefile

[warning] 173-173: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (23)
.gitignore (1)

38-40: LGTM!

The addition of the test-reports directory to .gitignore is appropriate and aligns with the test reporting infrastructure being introduced.

Makefile (2)

37-43: LGTM!

The documentation for test reporting configuration variables is clear and well-integrated into the help output.


62-82: LGTM!

Both installation targets follow good patterns: dependency checks, CI detection to skip unnecessary work, and helpful error messages. The graceful failure handling in install-junit-viewer is appropriate.

core/providers/azure/azure.go (2)

24-51: LGTM! Clean consolidation to single fasthttp client.

The removal of streamClient and the connection pool tuning (10K max conns, 60s idle timeout, 10s wait timeout) are appropriate for the unified fasthttp-based streaming approach.


277-277: Streaming now correctly uses the unified client.

Passing provider.client to the OpenAI streaming handler maintains consistency with the removal of the separate streamClient.

core/providers/cerebras.go (3)

85-98: LGTM - Streaming correctly delegates to fasthttp-based handler.

The TextCompletionStream method correctly passes provider.client (fasthttp) to the OpenAI-compatible streaming handler. Resource lifecycle is properly managed by the delegated function.


119-133: LGTM - ChatCompletionStream correctly migrated to fasthttp.

The ChatCompletionStream method correctly passes provider.client (fasthttp) to the shared OpenAI streaming handler. No resource management concerns since the handler manages the fasthttp lifecycle.


33-35: MaxConnsPerHost=10000 is a codebase-wide standard, not a Cerebras-specific issue.

The script confirmed that MaxConnsPerHost: 10000 is configured identically across all 13 provider files in the codebase (vertex, anthropic, openai, gemini, azure, cohere, groq, mistral, ollama, openrouter, parasail, sgl, and cerebras). This is a uniform project-wide design decision, not a Cerebras-specific configuration. The review comment incorrectly singles out the Cerebras provider as if it were a localized concern.

If the value of 10000 is a production concern, it should be flagged codebase-wide with proper context (e.g., production load expectations, OS limits, Little's Law calculations), not attributed to a single provider file.

Likely an incorrect or invalid review comment.

core/providers/gemini/gemini.go (4)

228-247: LGTM - ChatCompletionStream correctly migrated to fasthttp.

The method correctly passes provider.client (fasthttp) to the shared OpenAI streaming handler. Resource lifecycle is properly managed by the delegated function.


367-423: LGTM - Streaming lifecycle correctly fixed per past review.

The response lifecycle issue flagged in past reviews has been properly addressed:

  • Request released immediately (not used by goroutine)
  • Response released only inside the goroutine after streaming completes
  • Error paths also release response before returning

642-698: LGTM - TranscriptionStream response lifecycle correctly implemented.

The fasthttp response lifecycle is properly managed:

  • Request released immediately after acquisition
  • Response released only inside the goroutine after streaming completes
  • Error paths correctly release response before returning

935-959: LGTM - parseStreamGeminiError correctly migrated to fasthttp.

The function signature and implementation correctly use *fasthttp.Response. Response body and status code are accessed safely without holding references after return.

core/providers/vertex/vertex.go (3)

87-111: LGTM - getAuthTokenSource correctly refactored to return TokenSource.

The function properly obtains an OAuth2 TokenSource from either default credentials or JWT config. This simplifies authentication by removing client pooling logic while maintaining proper credential handling.


311-353: LGTM - ChatCompletion correctly uses fasthttp with OAuth token.

The method properly:

  • Manages fasthttp request/response lifecycle
  • Obtains OAuth token from TokenSource
  • Sets Authorization header with token.AccessToken
  • Handles all error paths correctly

485-508: LGTM - Streaming auth correctly fixed to use OAuth token.

Per past review, the streaming authentication now correctly uses the OAuth access token from getAuthTokenSource instead of raw key.Value. This ensures consistent authentication across all Vertex AI operations.

core/providers/openai/openai.go (8)

303-353: LGTM - Streaming response lifecycle correctly managed.

The fasthttp response lifecycle is properly implemented:

  • Request released immediately
  • Response released only in goroutine after streaming
  • Error paths correctly release response
  • Context cancellation properly checked

660-711: LGTM - ChatCompletionStream response lifecycle correct.

Response acquisition, streaming, and release are properly managed. Context cancellation is checked in the streaming loop.


1020-1084: LGTM - ResponsesStream properly implements fasthttp lifecycle.

Response management and context cancellation are correctly implemented.


1370-1445: LGTM - SpeechStream fasthttp lifecycle correctly implemented.

Response acquisition, streaming, and release are properly managed with context cancellation checks.


1670-1715: LGTM - TranscriptionStream response lifecycle correctly managed.

Despite the header issue noted separately, the fasthttp response lifecycle and context cancellation checks are properly implemented.


1849-1877: LGTM - ParseOpenAIError correctly uses fasthttp.Response.

Function signature and implementation properly migrated to use *fasthttp.Response.


1880-1905: LGTM - parseStreamOpenAIError correctly migrated to fasthttp.

Consistent with ParseOpenAIError, properly uses *fasthttp.Response.


366-370: Excellent - Context cancellation checks added to all streaming loops.

The consistent pattern of checking ctx.Done() before processing each chunk allows for graceful cancellation of streaming operations. This is a good improvement for resource cleanup and responsiveness.

Also applies to: 724-729, 1079-1084, 1440-1445, 1710-1715

@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch from e585c06 to b99f41c Compare October 31, 2025 23:40
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from a469bc6 to 66b74e9 Compare October 31, 2025 23:40
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/providers/vertex/vertex.go (1)

516-519: Non-Claude streaming should use OAuth token, not key.Value.

For consistency and correctness, non-Claude streaming endpoints should also use the OAuth token flow via getAuthTokenSource rather than key.Value.

Apply this diff:

 		authHeader := map[string]string{}
-		if key.Value != "" {
-			authHeader["Authorization"] = "Bearer " + key.Value
+		tokenSource, err := getAuthTokenSource(key)
+		if err != nil {
+			return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
+		}
+		token, err := tokenSource.Token()
+		if err != nil {
+			return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
 		}
+		authHeader["Authorization"] = "Bearer " + token.AccessToken
core/providers/gemini/gemini.go (1)

676-693: Propagate the real streaming error instead of fabricating a generic one.

MakeRequestWithContext already folds context cancellation, deadlines, and transport failures into bifrostErr. By switching to a fresh NewBifrostOperationError and inspecting the unrelated err from the earlier marshal, we lose the cancellation classification and overwrite the provider-specific payload. Downstream code will now see every cancelled stream as ErrProviderRequest, which is a regression.

Please short-circuit with the original error and release the response before returning:

-	_, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
-	if bifrostErr != nil {
-		defer fasthttp.ReleaseResponse(resp)
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, providerName)
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
-	}
+	_, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+	if bifrostErr != nil {
+		fasthttp.ReleaseResponse(resp)
+		return nil, bifrostErr
+	}

This keeps the original error semantics intact while still freeing the pooled response.

core/providers/anthropic/anthropic.go (1)

676-693: Return the MakeRequestWithContext error directly.

As in the Gemini provider, the new code replaces the structured bifrostErr with a generic fallback and inspects the unrelated err from JSON marshaling. That drops the RequestCancelled/timeout metadata and masks transport errors.

Please release the response and propagate the original error:

-	_, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
-	if bifrostErr != nil {
-		defer fasthttp.ReleaseResponse(resp)
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
-	}
+	_, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+	if bifrostErr != nil {
+		fasthttp.ReleaseResponse(resp)
+		return nil, bifrostErr
+	}

That way the streamed caller still gets the precise reason (cancelled, timed out, etc.) instead of a generic provider error.

♻️ Duplicate comments (16)
core/providers/groq.go (1)

33-36: MaxConnsPerHost value already flagged.

This concern about setting MaxConnsPerHost to 10000 has already been raised in a previous review. Please address the earlier feedback to use a configurable, safer default.

Makefile (1)

228-228: Fix incorrect relative paths in junit-viewer invocations.

The junit-viewer commands on lines 228, 249, and 303 use relative paths (../../$(TEST_REPORTS_DIR) or ../$(TEST_REPORTS_DIR)), but the working directory is already at root after the preceding cd && command chains complete. This causes junit-viewer to search in incorrect directories and fail.

Compare with line 322 in test-all, which correctly uses $(TEST_REPORTS_DIR) directly without relative path prefixes.

Apply this diff to fix line 228:

-					junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \

Apply the same fix to line 249:

-				junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \
+				junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \

And to line 303:

-					junit-viewer --results=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \

Also applies to: 249-249, 303-303

core/providers/sgl.go (1)

34-36: MaxConnsPerHost remains excessively high.

This issue was previously flagged: setting MaxConnsPerHost to 10,000 is unreasonably high and risks resource exhaustion. As noted in the earlier review, typical production values range from 100-500.

core/providers/mistral/mistral.go (1)

33-35: MaxConnsPerHost value already flagged in previous review.

The configuration of MaxConnsPerHost = 10000 has already been identified as excessively high in a previous review comment. As noted, this value appears uniformly across all 13 providers and should be reduced to a safer default (e.g., 1000) or made configurable via ProviderConfig to prevent resource exhaustion.

core/providers/openrouter.go (1)

35-37: MaxConnsPerHost value already flagged in previous reviews.

This configuration has been extensively reviewed and flagged as excessive. Please refer to the previous review comments for detailed analysis and recommendations to reduce this to 1024-2048 or make it configurable.

core/providers/bedrock/signer.go (3)

193-197: Don't sign Content-Length unless the header is present.

fasthttp computes the outgoing Content-Length itself; injecting our own value from len(body) risks signing a header that either isn’t actually sent or has a different value, producing broken signatures. Read the header that’s already on the request and only sign it when present.

-	// Include content-length if body is present
-	if len(body) > 0 {
-		headerNames = append(headerNames, "content-length")
-		headerMap["content-length"] = []string{strconv.Itoa(len(body))}
-	}
+	if cl := req.Header.ContentLength(); cl >= 0 {
+		headerNames = append(headerNames, "content-length")
+		headerMap["content-length"] = []string{strconv.Itoa(cl)}
+	}

200-217: Use fasthttp’s header iterator API.

RequestHeader has no All() helper, so this code does not compile. Iterate with VisitAll instead.

-	// Collect other headers
-	for key, value := range req.Header.All() {
-		keyStr := strings.ToLower(string(key))
-
-		// Skip ignored headers
-		if _, ignore := ignoredHeaders[keyStr]; ignore {
-			continue
-		}
-
-		// Skip if already handled
-		if keyStr == "host" || keyStr == "content-length" {
-			continue
-		}
-
-		if _, exists := headerMap[keyStr]; !exists {
-			headerNames = append(headerNames, keyStr)
-		}
-		headerMap[keyStr] = append(headerMap[keyStr], string(value))
-	}
+	// Collect other headers
+	req.Header.VisitAll(func(key, value []byte) {
+		keyStr := strings.ToLower(string(key))
+
+		// Skip ignored headers
+		if _, ignore := ignoredHeaders[keyStr]; ignore {
+			return
+		}
+
+		// Skip if already handled
+		if keyStr == "host" || keyStr == "content-length" {
+			return
+		}
+
+		if _, exists := headerMap[keyStr]; !exists {
+			headerNames = append(headerNames, keyStr)
+		}
+		headerMap[keyStr] = append(headerMap[keyStr], string(value))
+	})

242-258: Canonicalize the query with RFC 3986 encoding per SigV4.

url.ParseQuery/Encode apply form rules (+ ↔ space) and reorder data differently from AWS SigV4 requirements, so certain queries (spaces, literal +, already-escaped bytes) produce invalid signatures. Re-encode each name/value using RFC 3986, sort on the encoded pairs, and join manually.

-	if queryString != "" {
-		values, _ := url.ParseQuery(queryString)
-		// Sort keys
-		keys := make([]string, 0, len(values))
-		for k := range values {
-			keys = append(keys, k)
-		}
-		sort.Strings(keys)
-
-		// Sort values for each key
-		for _, k := range keys {
-			sort.Strings(values[k])
-		}
-
-		canonicalQueryString = values.Encode()
-		canonicalQueryString = strings.ReplaceAll(canonicalQueryString, "+", "%20")
-	}
+	if queryString != "" {
+		rawParams := strings.Split(queryString, "&")
+		type pair struct {
+			name  string
+			value string
+		}
+		pairs := make([]pair, 0, len(rawParams))
+		for _, raw := range rawParams {
+			if raw == "" {
+				continue
+			}
+
+			var rawName, rawValue string
+			if idx := strings.IndexByte(raw, '='); idx >= 0 {
+				rawName = raw[:idx]
+				rawValue = raw[idx+1:]
+			} else {
+				rawName = raw
+				rawValue = ""
+			}
+
+			nameDecoded, err := url.PathUnescape(rawName)
+			if err != nil {
+				return providerUtils.NewBifrostOperationError("failed to decode query parameter name for signing", err, providerName)
+			}
+			valueDecoded, err := url.PathUnescape(rawValue)
+			if err != nil {
+				return providerUtils.NewBifrostOperationError("failed to decode query parameter value for signing", err, providerName)
+			}
+
+			pairs = append(pairs, pair{
+				name:  encodeRFC3986(nameDecoded),
+				value: encodeRFC3986(valueDecoded),
+			})
+		}
+
+		sort.Slice(pairs, func(i, j int) bool {
+			if pairs[i].name == pairs[j].name {
+				return pairs[i].value < pairs[j].value
+			}
+			return pairs[i].name < pairs[j].name
+		})
+
+		var builder strings.Builder
+		for i, pair := range pairs {
+			if i > 0 {
+				builder.WriteByte('&')
+			}
+			builder.WriteString(pair.name)
+			builder.WriteByte('=')
+			builder.WriteString(pair.value)
+		}
+
+		canonicalQueryString = builder.String()
+	}
func encodeRFC3986(value string) string {
	var builder strings.Builder
	builder.Grow(len(value))
	for i := 0; i < len(value); i++ {
		b := value[i]
		if (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
			(b >= '0' && b <= '9') || b == '-' || b == '_' || b == '.' || b == '~' {
			builder.WriteByte(b)
			continue
		}
		builder.WriteByte('%')
		builder.WriteByte("0123456789ABCDEF"[b>>4])
		builder.WriteByte("0123456789ABCDEF"[b&0x0F])
	}
	return builder.String()
}
core/providers/ollama.go (1)

34-36: MaxConnsPerHost value is excessively high.

Setting MaxConnsPerHost to 10,000 can exhaust system resources under load. Typical production deployments use 100-500, with high-throughput workloads rarely exceeding 1,000.

Based on learnings

core/providers/parasail.go (1)

33-36: MaxConnsPerHost value is excessively high.

Setting MaxConnsPerHost to 10,000 can exhaust system resources under load. Consider reducing to a safer default (e.g., 500-1,000) or making this configurable.

Based on learnings

core/providers/cohere/cohere.go (3)

47-49: MaxConnsPerHost value is excessively high.

Setting MaxConnsPerHost to 10,000 can exhaust system resources under load. Typical production systems use 100-500, with high-throughput workloads rarely exceeding 1,000. Consider making this configurable.

Based on learnings


296-310: Remove duplicate Content-Type header setting.

The Content-Type header is set twice: once at line 298 using SetContentType("application/json") and again at line 305 using Set("Content-Type", "application/json"). The second call is redundant.

Apply this diff to remove the duplicate:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
 	req.Header.SetContentType("application/json")
 
 	// Set any extra headers from network config
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 
 	
 	// Set headers
-	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key.Value)
 	req.Header.Set("Accept", "text/event-stream")
 	req.Header.Set("Cache-Control", "no-cache")

566-577: Remove duplicate Content-Type header setting.

Similar to ChatCompletionStream, the Content-Type header is set twice: once at line 568 using SetContentType and again at line 572 using Set.

Apply this diff:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
 	req.Header.SetContentType("application/json")
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 		
 	// Set headers
-	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key.Value)
 	req.Header.Set("Accept", "text/event-stream")
 	req.Header.Set("Cache-Control", "no-cache")
core/providers/vertex/vertex.go (2)

72-74: MaxConnsPerHost value is excessively high.

Setting MaxConnsPerHost to 10,000 can exhaust system resources under load. Consider reducing to a safer default (e.g., 500-1,000) or making this configurable via config.NetworkConfig.

Based on learnings


145-148: ListModels should use GET, not POST.

The Vertex API projects.locations.models.list endpoint requires an HTTP GET request with no body. Using POST here will likely result in a 405 or 400 error.

Apply this diff:

-	req.Header.SetMethod(http.MethodPost)
+	req.Header.SetMethod(http.MethodGet)
 	req.SetRequestURI(requestURL)
-	req.Header.SetContentType("application/json")
core/providers/openai/openai.go (1)

1653-1667: Fix duplicate header setting and content-type overwrite in TranscriptionStream.

This code has three issues:

  1. SetExtraHeaders is called twice (lines 1653 and 1659-1660)
  2. Line 1657 sets ContentType to "application/json", overriding the correct multipart content type
  3. The multipart content type from writer.FormDataContentType() (line 1638) is eventually restored by line 1664, but this is inefficient and error-prone

Note: A similar issue was flagged in a previous review and marked as addressed, but appears to still be present.

Apply this diff to fix:

- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
- req.Header.SetMethod(http.MethodPost)
- req.SetRequestURI(url)
- req.Header.SetContentType("application/json")
-
- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
- // Set headers
- for key, value := range headers {
-     req.Header.Set(key, value)
- }
+ req.Header.SetMethod(http.MethodPost)
+ req.SetRequestURI(url)
+ 
+ // Apply provider extra headers once
+ providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+ 
+ // Apply multipart and auth headers (includes correct Content-Type)
+ for key, value := range headers {
+     req.Header.Set(key, value)
+ }

This preserves the multipart content type from writer.FormDataContentType() and eliminates duplicates.

🧹 Nitpick comments (5)
Makefile (2)

1-20: Define TEST_REPORTS_DIR and GOTESTSUM_FORMAT at the top of the Makefile.

The new test infrastructure references $(TEST_REPORTS_DIR) and $(GOTESTSUM_FORMAT) throughout, but these variables are not defined in the Makefile's variables section (lines 3–9). This makes the configuration implicit and harder to track. While environment variables can override them, having explicit defaults in the Makefile improves clarity and reproducibility.

Apply this diff to add variable definitions near the top of the Makefile:

 # Variables
 HOST ?= localhost
 PORT ?= 8080
 APP_DIR ?= 
 PROMETHEUS_LABELS ?=
 LOG_STYLE ?= json
 LOG_LEVEL ?= info
+TEST_REPORTS_DIR ?= test-reports
+GOTESTSUM_FORMAT ?= testname

173-173: Makefile target body exceeds recommended length.

The test target now contains 26 lines (lines 173–199), exceeding the typical Makefile best practice of 5 lines per target. While this is not a blocker, consider extracting the HTML report generation logic into a separate helper target to improve maintainability and readability.

core/providers/mistral/mistral.go (1)

39-41: Remove or restore commented-out response pool pre-warming code.

The commented-out code for pre-warming response pools should either be removed if no longer needed, or uncommented and implemented if the feature is required. Leaving dead code in comments reduces maintainability.

Apply this diff to remove the commented code:

-	// Pre-warm response pools
-	// for range config.ConcurrencyAndBufferSize.Concurrency {
-	// 	mistralResponsePool.Put(&schemas.BifrostResponse{})
-	// }
-
core/providers/parasail.go (1)

38-38: Remove stray empty line.

Line 38 appears to be a leftover blank line that can be removed for cleaner formatting.

core/providers/vertex/vertex.go (1)

628-631: Remove duplicate and empty comment blocks.

Lines 628-630 contain duplicate/empty comment sections that should be cleaned up.

Apply this diff:

 	req.SetBody(jsonBody)
 
-	
-	// Set any extra headers from network config
-	
-
 	// Make the request
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a469bc6 and 66b74e9.

⛔ Files ignored due to path filters (5)
  • framework/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
  • transports/go.sum is excluded by !**/*.sum
📒 Files selected for processing (40)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/go.mod (1 hunks)
  • core/providers/anthropic/anthropic.go (11 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • framework/go.mod (1 hunks)
  • plugins/logging/go.mod (2 hunks)
  • plugins/otel/go.mod (1 hunks)
  • plugins/telemetry/go.mod (2 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
  • transports/go.mod (0 hunks)
  • ui/app/clientLayout.tsx (1 hunks)
  • ui/app/layout.tsx (1 hunks)
  • ui/app/page.tsx (1 hunks)
  • ui/app/workspace/logs/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx (1 hunks)
  • ui/app/workspace/plugins/fragments/pluginFormFragement.tsx (1 hunks)
  • ui/app/workspace/plugins/views/pluginsView.tsx (2 hunks)
  • ui/app/workspace/providers/page.tsx (2 hunks)
  • ui/components/sidebar.tsx (2 hunks)
  • ui/next.config.ts (1 hunks)
  • ui/tsconfig.json (1 hunks)
💤 Files with no reviewable changes (1)
  • transports/go.mod
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/utils/utils.go
  • scheams/go.mod
  • core/providers/bedrock/bedrock.go
  • core/providers/cerebras.go
🧰 Additional context used
🧬 Code graph analysis (11)
ui/app/clientLayout.tsx (2)
ui/hooks/useWebSocket.tsx (1)
  • WebSocketProvider (26-162)
ui/components/themeProvider.tsx (1)
  • ThemeProvider (6-8)
ui/app/layout.tsx (1)
ui/app/clientLayout.tsx (1)
  • ClientLayout (38-51)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
core/providers/sgl.go (1)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (281-292)
ui/next.config.ts (1)
ui/scripts/fix-paths.js (1)
  • path (4-4)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (328-339)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/anthropic/anthropic.go (3)
  • AcquireAnthropicChatResponse (44-48)
  • ReleaseAnthropicChatResponse (51-55)
  • HandleAnthropicChatCompletionStreaming (385-578)
core/providers/openai/openai.go (2)
core/schemas/provider.go (1)
  • DefaultRequestTimeoutInSeconds (14-14)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • HandleProviderAPIError (207-227)
core/providers/anthropic/anthropic.go (3)
core/providers/anthropic/types.go (1)
  • AnthropicError (315-321)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • ProcessAndSendBifrostError (442-472)
  • MakeRequestWithContext (36-91)
core/providers/anthropic/utils.go (1)
  • ConvertAnthropicFinishReasonToBifrost (27-32)
core/providers/gemini/gemini.go (3)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/cohere/cohere.go (1)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
🪛 checkmake (0.2.2)
Makefile

[warning] 173-173: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (40)
core/go.mod (1)

11-11: Move smithy-go to direct dependency is appropriate.

Promoting smithy-go from an indirect to explicit direct dependency aligns with the PR's addition of a fasthttp AWS signer for Bedrock. This is a best practice when a module directly uses a library.

Please verify that smithy-go v1.23.1 remains compatible with aws-sdk-go-v2 v1.39.3 (line 8) and does not introduce transitive dependency conflicts. You can validate this by running go mod tidy and go mod verify locally to ensure dependency resolution is sound.

plugins/otel/go.mod (1)

10-10: Patch-level version bump for google.golang.org/grpc looks safe.

The update from v1.75.0 to v1.76.0 is a patch-level bump and should be backward compatible per semver. Since the PR reports successful local builds, this is a low-risk change.

That said, verify that this grpc version bump is intentional for this PR. The PR objectives focus on fasthttp/streaming refactoring rather than grpc updates, so confirm this is a transitive dependency bump or a necessary update for the otel plugin integration.

plugins/telemetry/go.mod (1)

18-31: Verify compatibility of AWS SDK v1.39.3 and gRPC v1.76.0 with the core modules.

The go.mod file bumps AWS SDK v2 dependencies from v1.39.2-era to v1.39.3, and updates gRPC to v1.76.0. These appear reasonable for a fasthttp-focused refactor, but we should verify they are compatible with the direct dependencies in the require block (core v1.2.16 and framework v1.1.19).

Also applies to: 109-110

framework/go.mod (1)

10-10: Clarify the necessity of dependency version bumps.

The PR objectives focus on migrating to fasthttp for streaming (issue #269), but the go.mod changes include bumps to go-redis, weaviate, and weaviate-go-client that don't appear directly related to this objective.

Are these bumps necessary for the fasthttp migration, or are they incidental/orthogonal changes that should be separated into a dedicated dependency-update PR?

If these bumps are intentional and necessary, please confirm that they don't introduce breaking changes or security issues. Run the following to inspect the changes:

Also applies to: 12-13

ui/app/workspace/providers/page.tsx (2)

3-3: Verify the updated import path resolves correctly.

Ensure that ModelProviderConfig exists at the new path and that the application builds successfully.


20-20: LGTM!

Removing the unused Info import is good housekeeping.

core/providers/groq.go (1)

179-179: Streaming refactor looks correct.

The change to use provider.client (fasthttp.Client) for streaming operations correctly implements the PR objective of consolidating all HTTP operations to fasthttp. The integration with the shared OpenAI-compatible streaming logic appears sound.

ui/app/workspace/logs/page.tsx (1)

3-6: LGTM! Clean import path refactoring.

The import path updates are well-isolated and align with the workspace restructuring. Since there are no logic changes, the risk of regression is minimal.

Verify that the TypeScript build passes to confirm all import paths resolve correctly:

.gitignore (1)

38-40: LGTM!

Appropriate addition to ignore generated test reports, consistent with the new test reporting infrastructure introduced in the Makefile.

core/providers/sgl.go (4)

56-56: LGTM!

Client assignment correctly uses the single fasthttp client instance, consistent with the PR objective to eliminate separate stream clients.


100-113: LGTM!

TextCompletionStream correctly uses provider.client for streaming operations, successfully migrating away from the separate stream client as intended in this refactor.


134-148: LGTM!

ChatCompletionStream correctly uses provider.client for streaming, completing the migration to a unified fasthttp client for all operations.


192-192: LGTM!

Using string(provider.GetProviderKey()) instead of hardcoded "sgl" improves maintainability and ensures consistency across unsupported operations.

Also applies to: 197-197, 202-202, 207-207

core/providers/mistral/mistral.go (1)

154-154: LGTM: Streaming now uses unified fasthttp client.

The change to use provider.client instead of a separate streamClient correctly implements the PR objective of consolidating all HTTP operations (including streaming) to use a single fasthttp.Client instance. This simplifies the provider implementation and eliminates redundancy.

tests/core-providers/config/account.go (2)

16-17: Verify that uniform concurrency is appropriate for all providers.

The introduction of a single Concurrency = 4 constant replaces what appear to be provider-specific values. While this improves maintainability, different providers have varying characteristics (rate limits, reliability, latency). Ensure that a uniform concurrency value of 4 is appropriate for all providers in test scenarios, especially if some previously used higher values (e.g., 10) or lower values (e.g., 1-2).

Consider adding a comment explaining the rationale for this value and why uniform concurrency is suitable across all providers.


271-271: Good refactoring to eliminate magic numbers.

The consistent use of the Concurrency constant across all provider configurations improves maintainability and follows the DRY principle. This makes it easy to adjust the concurrency setting in one place.

Also applies to: 285-285, 311-311, 324-324, 337-337, 350-350, 363-363, 377-377, 390-390, 403-403, 417-417, 430-430, 443-443, 456-456, 469-469

plugins/logging/go.mod (2)

33-33: New transitive dependency is expected from the refactor.

The new indirect dependency github.com/bytedance/gopkg v0.1.3 is a transitive dependency pulled in by sonic v1.14.1 or another updated package. This is expected behavior during the fasthttp migration and does not introduce a direct dependency.


8-8: sonic v1.14.1 verified safe—no known security vulnerabilities.

No public CVE or advisory specifically reports security vulnerabilities in bytedance/sonic v1.14.1. The patch version bump is routine and can be approved.

ui/app/workspace/plugins/views/pluginsView.tsx (2)

3-4: LGTM! Import paths updated for workspace restructuring.

The import paths have been correctly updated to reflect the new workspace structure. Both CodeEditor and ConfirmDeletePluginDialog imports are consistent with the reorganization.


145-145: Padding removed from container div.

The className changed from "p-4" to an empty string, removing padding from the Plugin Configuration container. Verify this UI change is intentional and doesn't negatively impact the visual layout.

ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx (1)

7-7: LGTM! Import path updated for workspace restructuring.

The CodeEditor import path has been correctly updated to reflect the new workspace structure.

ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx (1)

3-3: LGTM! Import path updated for workspace restructuring.

The ClientForm import path has been correctly updated to reflect the new workspace structure.

ui/app/workspace/plugins/fragments/pluginFormFragement.tsx (1)

3-3: LGTM! Import path updated for workspace restructuring.

The CodeEditor import path has been correctly updated to reflect the new workspace structure.

ui/app/workspace/mcp-clients/page.tsx (1)

3-3: LGTM! Import path updated for workspace restructuring.

The MCPClientsTable import path has been correctly updated to reflect the new workspace structure.

core/providers/openrouter.go (2)

137-137: LGTM! Streaming client consolidation is well-executed.

The refactor successfully consolidates streaming operations to use the unified provider.client instead of maintaining a separate streamClient. This eliminates code duplication and achieves the PR objective of using fasthttp consistently across all request types.

Also applies to: 172-172, 203-203


32-38: Verify timeout configuration doesn't interrupt long-running streams.

Now that streaming uses this unified client, ReadTimeout and WriteTimeout apply to all streaming operations. If DefaultRequestTimeoutInSeconds is low (e.g., 30-60s), streams with idle periods between chunks could terminate prematurely. With fasthttp, these timeouts apply per read/write operation, not total request duration, but verify this doesn't cause issues for your streaming workloads.

Test streaming with various timeout configurations:

ui/components/sidebar.tsx (1)

76-187: LGTM! Workspace path migration is consistent.

All navigation URLs have been systematically updated to use the /workspace/ prefix. The changes are consistent and maintain proper absolute path formatting with leading slashes.

core/providers/ollama.go (1)

93-106: LGTM! Streaming now uses the shared fasthttp client.

The migration to use provider.client for streaming is correct and eliminates the need for a separate streaming client.

ui/tsconfig.json (1)

29-36: LGTM! Path aliases correctly updated for workspace structure.

The TypeScript path mappings for @enterprise/* and @schemas/* have been updated to reference the new app/workspace/ directory structure, with proper fallback paths maintained.

ui/next.config.ts (1)

27-31: LGTM! Webpack alias configuration updated for workspace structure.

The path resolutions for @enterprise and @schemas aliases have been correctly updated to reference the new app/workspace/ directory structure, maintaining consistency with the TypeScript configuration.

core/providers/parasail.go (1)

107-121: LGTM! Streaming correctly migrated to fasthttp client.

The migration to use provider.client for streaming operations is correct and aligns with the PR objectives.

core/providers/vertex/vertex.go (1)

87-111: LGTM! Token source approach improves auth handling.

The migration from getAuthClient to getAuthTokenSource is a cleaner approach that properly separates authentication concerns from HTTP client management.

ui/app/page.tsx (1)

6-6: Verify the missing leading slash in the redirect path.

The redirect target is "workspace/logs" without a leading slash, while the sidebar navigation URLs use absolute paths like "/workspace/logs". This inconsistency could cause routing issues.

core/providers/bedrock/chat.go (1)

7-7: Import path update looks good.

The new import tracks the package relocation cleanly. 👍

core/providers/openai/openai.go (6)

34-66: LGTM: Clean consolidation to single fasthttp client.

The constructor now creates a single fasthttp.Client for all operations (streaming and non-streaming), successfully eliminating the separate stream client. Configuration is sensible with appropriate timeouts, connection limits, and proxy support.


365-370: Excellent addition: Context cancellation checks in streaming loops.

The consistent pattern of checking ctx.Done() before processing each SSE line across all streaming handlers improves responsiveness and prevents unnecessary processing when the client disconnects.

Also applies to: 724-729, 1079-1084, 1440-1445, 1710-1715


336-340: LGTM: Proper fasthttp error handling.

Error handling correctly checks for fasthttp.ErrTimeout alongside context errors, ensuring timeouts are properly detected and reported regardless of the source.

Also applies to: 693-697, 1053-1057, 1413-1417, 1683-1687


96-99: LGTM: Proper fasthttp resource lifecycle management.

Request and response acquisition/release is handled correctly throughout the file with appropriate defer statements. Streaming goroutines also properly release responses after processing.

Also applies to: 185-188, 303-306, 539-542, 660-663, 899-902, 1020-1023, 1229-1232, 1304-1307, 1371-1374, 1556-1559, 1645-1648


250-250: LGTM: Consistent client parameter signatures.

All streaming and non-streaming handler functions now accept *fasthttp.Client, successfully consolidating on a single HTTP client implementation throughout the provider.

Also applies to: 607-607, 965-965, 264-266, 621-623, 980-982


1849-1877: LGTM: Error parsing functions updated for fasthttp.

Both ParseOpenAIError and parseStreamOpenAIError have been correctly updated to accept *fasthttp.Response, maintaining the same error parsing logic while adapting to the new HTTP client implementation.

Also applies to: 1880-1905

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
ui/components/sidebar.tsx (1)

303-303: Update hardcoded path to match workspace migration.

The condition checks for item.url === "/logs", but the Observability item URL was updated to "/workspace/logs" on Line 76. This means the WebSocket connection indicator will never display.

Apply this diff:

-							{!hasSubItems && item.url === "/logs" && isWebSocketConnected && (
+							{!hasSubItems && item.url === "/workspace/logs" && isWebSocketConnected && (
core/providers/vertex/vertex.go (1)

162-178: Return MakeRequestWithContext errors directly

In ListModels (Line [165]), ChatCompletion (Line [341]), and handleVertexEmbedding (Line [637]) we check bifrostErr != nil but then inspect the unrelated err captured from token acquisition. err is nil on the success path, so the timeout/cancel branches never fire and we replace the rich bifrostErr with a generic wrapper that carries a nil error. Please just propagate bifrostErr.

-	if bifrostErr != nil {
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, providerName)
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, providerName)
-	}
+	if bifrostErr != nil {
+		return nil, bifrostErr
+	}

Please apply the same fix to each occurrence.

Also applies to: 337-353, 633-649

core/providers/anthropic/anthropic.go (1)

676-693: Return bifrostErr instead of inspecting stale err

When MakeRequestWithContext fails in ResponsesStream (Line [676]) we interrogate the unrelated err from sonic.Marshal. That variable is nil on success, so the timeout/cancel checks never fire and we emit ErrProviderRequest with a nil underlying error. Just return bifrostErr; it already encodes the right type and message.

-	if bifrostErr != nil {
-		defer fasthttp.ReleaseResponse(resp)
-		if errors.Is(err, context.Canceled) {
-			return nil, &schemas.BifrostError{
-				IsBifrostError: false,
-				Error: &schemas.ErrorField{
-					Type:    schemas.Ptr(schemas.RequestCancelled),
-					Message: schemas.ErrRequestCancelled,
-					Error:   err,
-				},
-			}
-		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
-	}
+	if bifrostErr != nil {
+		defer fasthttp.ReleaseResponse(resp)
+		return nil, bifrostErr
+	}
♻️ Duplicate comments (11)
Makefile (1)

228-228: Fix: junit-viewer paths are incorrect—still using relative prefixes after directory changes complete.

These paths were flagged in prior review as needing correction, but the issue persists. After cd tests/core-providers && gotestsum ... completes, the if block executes at the repository root, so junit-viewer should use root-relative paths (no ../../ or ../ prefix). Compare with the correct pattern used in lines 183, 275, and 322.

Apply these diffs to fix all three locations:

- Line 228:
-					junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \

- Line 249:
-				junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \
+				junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \

- Line 303:
-				junit-viewer --results=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \
+				junit-viewer --results=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \

Also applies to: 249-249, 303-303

core/providers/groq.go (1)

33-36: MaxConnsPerHost=10000 remains unsafe; reduce or make configurable.

This value is duplicated across multiple providers and has been flagged repeatedly as a resource‑exhaustion risk. Typical production values are 512–1024. Consider extracting to a shared constant with a sane default or reading from configuration.

core/providers/sgl.go (1)

34-36: MaxConnsPerHost=10000 is duplicated and unsafe.

This high connection limit has been flagged repeatedly across providers. Reduce to 512–1024 or make it configurable via a shared constant or environment variable.

core/providers/openrouter.go (1)

35-38: MaxConnsPerHost=10000 duplicated across providers; reduce or centralize.

This value appears in 6+ provider files and has been flagged as a resource‑exhaustion risk. Extract to a shared default (512–1024) or configuration value.

core/providers/azure/azure.go (1)

38-40: MaxConnsPerHost=10000 duplicated; reduce or centralize configuration.

This unsafe value appears across 6+ providers. Extract to a shared constant (512–1024) or read from configuration.

core/providers/ollama.go (1)

34-36: MaxConnsPerHost=10000 duplicated and unsafe.

This value appears across multiple providers and risks resource exhaustion. Reduce to 512–1024 or make it configurable.

core/providers/bedrock/signer.go (1)

193-197: Content-Length signing could drift if fasthttp adjusts the header.

You sign content-length using len(body) before the request is finalized. If fasthttp modifies or omits the header (e.g., for chunked encoding), the signature will break. Consider only including content-length in signed headers if it's already set on req.Header and reading its value from there.

Apply this approach:

-	if len(body) > 0 {
-		headerNames = append(headerNames, "content-length")
-		headerMap["content-length"] = []string{strconv.Itoa(len(body))}
-	}
+	if cl := req.Header.ContentLength(); cl >= 0 {
+		headerNames = append(headerNames, "content-length")
+		headerMap["content-length"] = []string{strconv.Itoa(cl)}
+	}
core/providers/vertex/vertex.go (2)

510-527: Use OAuth tokens for non-Claude streaming

Lines [517-519] still send the raw key.Value as Bearer auth on the OpenAI-style streaming endpoint. Vertex expects an OAuth access token here; using the API key will 401 (prior review already flagged this). Fetch the token via getAuthTokenSource, as you already do for the non-streaming code paths.

-	authHeader := map[string]string{}
-	if key.Value != "" {
-		authHeader["Authorization"] = "Bearer " + key.Value
-	}
+	authHeader := map[string]string{}
+	tokenSource, err := getAuthTokenSource(key)
+	if err != nil {
+		return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
+	}
+	token, err := tokenSource.Token()
+	if err != nil {
+		return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
+	}
+	authHeader["Authorization"] = "Bearer " + token.AccessToken

145-148: ListModels must issue GET

Line [145] still sends the models list via POST with a JSON content-type. Vertex’s projects.locations.models.list endpoint only accepts GET, so this call will 405/400. Switch to GET and drop the JSON content-type.

-	req.Header.SetMethod(http.MethodPost)
-	req.Header.SetContentType("application/json")
+	req.Header.SetMethod(http.MethodGet)
+	req.Header.Del("Content-Type")
core/providers/parasail.go (1)

31-35: Reduce MaxConnsPerHost to a sane default

Line [33] still hard-codes MaxConnsPerHost to 10,000. That level of fan-out will burn sockets and upstream quotas long before backpressure kicks in—exactly what was called out in the earlier review. Please drop this to something in the 500-ish range (or wire it to config) before rolling out.

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,
core/providers/openai/openai.go (1)

355-355: bufio.Scanner 64KB token limit still not addressed.

The previous review flagged this issue, but it remains unresolved. Long SSE lines (JSON chunks) can exceed the default 64KB limit, causing "token too long" errors.

Apply a larger buffer after creating the scanner at all streaming locations (lines 355, 713, 1073, 1433, 1703):

 scanner := bufio.NewScanner(resp.BodyStream())
+// Raise max token size to 10MB to accommodate long SSE lines
+buf := make([]byte, 0, 64*1024)
+scanner.Buffer(buf, 10*1024*1024)

Also applies to: 713-713, 1073-1073, 1433-1433, 1703-1703

🧹 Nitpick comments (6)
framework/go.mod (1)

10-10: Clarify the rationale for dependency version updates.

The PR objectives focus on migrating providers to use fasthttp for streaming, yet three direct dependencies were updated:

  • redis/go-redis/v9: v9.12.1 → v9.14.0
  • weaviate/weaviate: v1.31.5 → v1.33.1
  • weaviate/weaviate-go-client/v5: v5.2.0 → v5.5.0

These updates appear orthogonal to the stated fasthttp migration goal. Please clarify:

  • Are these updates intentional and necessary for the refactor, or did they result from running go mod tidy?
  • Do any of these versions introduce breaking changes that affect the fasthttp migration or other parts of the codebase?

If these dependency updates are unrelated to this PR, consider moving them to a separate commit or PR for clarity and easier review/revert if needed.

Also applies to: 12-13

Makefile (1)

173-199: Consider splitting the test target to reduce cognitive complexity.

The static analysis tool flags that this target's body exceeds the recommended length. While this is a style concern rather than a functional issue, breaking it into smaller helper targets or extracting the HTML generation logic into a shared function could improve maintainability. This is optional but worth considering for future refactoring.

ui/app/layout.tsx (1)

23-24: Consider removing redundant preconnect links.

The Geist fonts are loaded via next/font/google (Lines 1, 5, 11), which automatically optimizes font loading and likely doesn't require manual preconnect hints to Google Fonts domains. If no other fonts are being loaded from Google Fonts, these preconnect links may be unnecessary.

If confirmed unnecessary, apply this diff:

-				<link rel="preconnect" href="https://fonts.googleapis.com" />
-				<link rel="preconnect" href="https://fonts.gstatic.com" crossOrigin="anonymous" />
tests/core-providers/config/account.go (1)

16-16: LGTM—extracting concurrency constant improves maintainability.

Replacing the hard-coded value 4 with the Concurrency constant eliminates magic numbers and provides a single point of configuration for test concurrency across all provider configs.

Also applies to: 271-271, 285-285, 311-311, 324-324, 337-337, 350-350, 363-363, 377-377, 390-390, 403-403, 417-417, 430-430, 443-443, 456-456, 469-469

core/providers/bedrock/signer.go (1)

242-259: Canonical query string encoding may corrupt literal '+' characters.

url.ParseQuery decodes form data (treating + as space), values.Encode() re-encodes with spaces as +, then ReplaceAll("+", "%20") replaces all + with %20. A literal + in the original query becomes a space after ParseQuery, gets re-encoded as +, then incorrectly becomes %20. AWS SigV4 requires RFC 3986 percent-encoding (space as %20, literal + as %2B). Consider manual encoding: split on &, percent-encode each name/value per RFC 3986, sort, and join.

core/providers/openai/openai.go (1)

1653-1665: Remove duplicate SetExtraHeaders call and unnecessary Content-Type override.

SetExtraHeaders is called twice (lines 1653 and 1660), and line 1657 sets Content-Type to "application/json" before it's overwritten by the headers loop with the correct multipart content type.

Apply this diff to clean up the redundant calls:

-// Set any extra headers from network config
-providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
-
 req.Header.SetMethod(http.MethodPost)
 req.SetRequestURI(url)
-req.Header.SetContentType("application/json")
 
 // Set any extra headers from network config
 providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 
 // Set headers
 for key, value := range headers {
     req.Header.Set(key, value)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a469bc6 and 66b74e9.

⛔ Files ignored due to path filters (5)
  • framework/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
  • transports/go.sum is excluded by !**/*.sum
📒 Files selected for processing (40)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/go.mod (1 hunks)
  • core/providers/anthropic/anthropic.go (11 hunks)
  • core/providers/azure/azure.go (5 hunks)
  • core/providers/bedrock/bedrock.go (1 hunks)
  • core/providers/bedrock/chat.go (1 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (9 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (6 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (13 hunks)
  • framework/go.mod (1 hunks)
  • plugins/logging/go.mod (2 hunks)
  • plugins/otel/go.mod (1 hunks)
  • plugins/telemetry/go.mod (2 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
  • transports/go.mod (0 hunks)
  • ui/app/clientLayout.tsx (1 hunks)
  • ui/app/layout.tsx (1 hunks)
  • ui/app/page.tsx (1 hunks)
  • ui/app/workspace/logs/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx (1 hunks)
  • ui/app/workspace/plugins/fragments/pluginFormFragement.tsx (1 hunks)
  • ui/app/workspace/plugins/views/pluginsView.tsx (2 hunks)
  • ui/app/workspace/providers/page.tsx (2 hunks)
  • ui/components/sidebar.tsx (2 hunks)
  • ui/next.config.ts (1 hunks)
  • ui/tsconfig.json (1 hunks)
💤 Files with no reviewable changes (1)
  • transports/go.mod
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/utils/utils.go
  • scheams/go.mod
  • core/providers/bedrock/bedrock.go
  • core/providers/mistral/mistral.go
🧰 Additional context used
🧬 Code graph analysis (10)
ui/app/clientLayout.tsx (2)
ui/hooks/useWebSocket.tsx (1)
  • WebSocketProvider (26-162)
ui/components/themeProvider.tsx (1)
  • ThemeProvider (6-8)
core/providers/sgl.go (1)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (281-292)
ui/app/layout.tsx (1)
ui/app/clientLayout.tsx (1)
  • ClientLayout (38-51)
core/providers/cohere/cohere.go (2)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • NewProviderAPIError (343-358)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/openrouter.go (1)
core/providers/utils/utils.go (2)
  • HandleProviderResponse (233-277)
  • NewUnsupportedOperationError (281-292)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (96-144)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • MakeRequestWithContext (36-91)
  • HandleProviderResponse (233-277)
  • NewProviderAPIError (343-358)
core/schemas/account.go (1)
  • Key (8-17)
core/providers/vertex/types.go (2)
  • VertexListModelsResponse (61-64)
  • VertexEmbeddingResponse (44-46)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (328-339)
core/providers/gemini/gemini.go (3)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/openai/openai.go (1)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • HandleProviderAPIError (207-227)
core/providers/anthropic/anthropic.go (3)
core/providers/anthropic/types.go (1)
  • AnthropicError (315-321)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (152-173)
  • NewBifrostOperationError (328-339)
  • ProcessAndSendBifrostError (442-472)
  • MakeRequestWithContext (36-91)
core/providers/anthropic/utils.go (1)
  • ConvertAnthropicFinishReasonToBifrost (27-32)
🪛 checkmake (0.2.2)
Makefile

[warning] 173-173: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (35)
plugins/otel/go.mod (1)

10-10: Clarify the purpose of the gRPC version update.

The PR objective focuses on migrating streaming to fasthttp, but this change updates google.golang.org/grpc from v1.75.0 to v1.76.0—a version bump that appears unrelated to the fasthttp migration.

Please clarify if this update is:

  1. Necessary for fasthttp compatibility or other functionality changes in this PR, or
  2. An incidental dependency update that should be separated into a distinct commit/PR for better change isolation.
plugins/logging/go.mod (1)

8-8: Dependency updates align with PR refactoring objectives.

The patch-level bump to bytedance/sonic (v1.14.0 → v1.14.1) is conservative and safe, and the addition of bytedance/gopkg as an indirect dependency is a normal transitive inclusion. The presence of fasthttp v1.67.0 (line 84) as an indirect dependency confirms alignment with the PR's shift to fasthttp for streaming operations.

Also applies to: 33-33

plugins/telemetry/go.mod (2)

18-31: Verify AWS SDK v2 dependency updates are necessary for fasthttp migration.

The AWS SDK v2 modules have been updated from v1.39.2 to v1.39.3 (patch-level updates) along with corresponding submodule versions. While the AI summary mentions AWS SigV4 signer implementation for fasthttp in Bedrock, I cannot directly see why the telemetry plugin's transitive AWS dependencies need these specific updates.

Please confirm:

  1. Are these AWS SDK version bumps required to support the new fasthttp-based Bedrock provider implementation?
  2. Were these updated automatically by go mod tidy or are they intentional pins for compatibility?
  3. Are there any known breaking changes or security fixes in v1.39.3 that necessitate this update?

109-110: Verify gRPC/genproto version bumps are compatible with the codebase.

The gRPC dependency has been updated from 1.75.x to 1.76.0 (minor version bump), and genproto from a 2025-08 to 2025-10 date-based version. These updates appear orthogonal to the fasthttp migration objective and may be collateral dependency chain updates.

Please verify:

  1. Are these gRPC/genproto updates pulled in by transitive dependencies or intentional upgrades?
  2. Have you tested that gRPC 1.76.0 is compatible with all services in the codebase?
  3. Are there any known compatibility issues with the 2025-10 genproto version?
.gitignore (1)

38-39: LGTM!

The addition aligns well with the test-reporting infrastructure changes in the Makefile and follows the existing pattern in the file.

Makefile (1)

37-38: Test infrastructure enhancements look solid.

The new targets for HTML report generation, installation guards for external tools, and CI-aware conditionals are well-structured. The install-junit-viewer target properly detects CI environments and gracefully falls back on failures.

Also applies to: 62-64, 66-82, 146-149, 151-171

core/providers/cerebras.go (1)

85-98: Streaming methods correctly use shared client.

The migration to use provider.client for streaming is implemented correctly. Both TextCompletionStream and ChatCompletionStream properly delegate to the OpenAI-compatible handlers with the shared fasthttp client. The refactor successfully eliminates the need for a separate stream client while maintaining the existing API contract.

Note: Ensure the timeout configuration concern raised in the constructor review is addressed to prevent streaming connections from timing out prematurely.

Also applies to: 119-133

ui/app/workspace/plugins/fragments/pluginFormFragement.tsx (1)

3-3: LGTM! Import path updated for workspace restructuring.

The import path has been correctly updated to align with the workspace-based module organization.

ui/next.config.ts (1)

27-31: LGTM! Webpack alias configuration updated for workspace structure.

The alias resolution for @enterprise and @schemas has been correctly updated to point to the new workspace-based paths, including both the main and fallback locations.

ui/app/workspace/plugins/views/pluginsView.tsx (2)

3-4: LGTM! Import paths updated for workspace restructuring.

The import paths have been correctly updated to align with the workspace-based module organization.


145-145: Verify intentional removal of padding.

The className has been changed from "p-4" to an empty string "", which removes the padding from the plugin configuration wrapper. This is a visual change that may affect the layout.

Please confirm this styling change is intentional and aligns with the desired UI layout.

ui/app/workspace/providers/page.tsx (2)

3-3: LGTM! Import path updated for workspace restructuring.

The import path has been correctly updated to align with the workspace-based module organization.


20-20: Good cleanup: unused import removed.

The Info icon import has been removed from the lucide-react import list, which is appropriate cleanup since it's not used in this file.

ui/tsconfig.json (1)

29-36: LGTM! TypeScript path mappings updated for workspace structure.

The path mappings for @enterprise/* and @schemas/* have been correctly updated to resolve to the new workspace-based paths, including fallback locations. This aligns with the webpack alias configuration in next.config.ts.

ui/app/workspace/mcp-clients/page.tsx (1)

3-3: LGTM! Import path updated for workspace restructuring.

The import path has been correctly updated to align with the workspace-based module organization.

ui/app/workspace/logs/page.tsx (1)

3-6: LGTM! Import paths updated for workspace restructuring.

All import paths have been correctly updated to align with the workspace-based module organization. The changes are consistent across all four imported components.

ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx (1)

3-3: LGTM! Import path updated for workspace migration.

The import path change aligns with the broader workspace restructuring mentioned in the PR summary.

ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx (1)

7-7: LGTM! CodeEditor import path updated for workspace structure.

The path change is consistent with the workspace migration affecting multiple UI components.

ui/components/sidebar.tsx (1)

76-187: LGTM! Navigation paths systematically updated for workspace migration.

All URL paths have been consistently updated to include the /workspace prefix, aligning with the broader workspace restructuring.

ui/app/clientLayout.tsx (1)

1-52: LGTM! Well-structured client layout consolidation.

The new ClientLayout component effectively centralizes provider setup and conditional rendering logic. The error handling via toast and the conditional rendering based on database connection status are implemented correctly.

ui/app/layout.tsx (2)

5-15: LGTM! Font configuration with swap display.

Adding display: "swap" improves font loading performance by showing fallback fonts while custom fonts load.


27-27: LGTM! Layout simplified with ClientLayout delegation.

Delegating to ClientLayout effectively consolidates provider setup and conditional rendering logic, making the root layout much cleaner.

core/go.mod (1)

11-11: No issues found — dependency promotion is justified.

smithy-go v1.23.1 is compatible with aws-sdk-go-v2 v1.39.3. The codebase confirms direct usage: core/providers/bedrock/signer.go imports github.com/aws/smithy-go/encoding/httpbinding for the AWS SigV4 implementation, validating the promotion from indirect to direct dependency.

core/providers/groq.go (1)

179-179: LGTM—streaming now uses the shared fasthttp client.

The migration from provider.streamClient to provider.client correctly consolidates all requests to use fasthttp as intended by this refactor.

core/providers/sgl.go (2)

103-103: LGTM—streaming consolidated to fasthttp client.

The change from provider.streamClient to provider.client correctly unifies streaming and non-streaming requests to use the shared fasthttp client.

Also applies to: 138-138


192-192: LGTM—dynamic provider key improves error messages.

Using string(provider.GetProviderKey()) instead of the hard-coded "sgl" makes error messages consistent and resilient to future provider key changes.

Also applies to: 197-197, 202-202, 207-207

core/providers/openrouter.go (2)

137-137: LGTM—streaming now uses shared fasthttp client.

Migration from provider.streamClient to provider.client correctly consolidates streaming under fasthttp.

Also applies to: 172-172, 203-203


218-218: LGTM—dynamic provider key in error messages.

Using string(provider.GetProviderKey()) instead of "openrouter" makes error construction consistent and maintainable.

Also applies to: 223-223, 228-228, 233-233, 238-238

core/providers/bedrock/chat.go (1)

7-7: LGTM—import path updated to reflect package reorganization.

The anthropic package has moved from core/schemas/providers/anthropic to core/providers/anthropic. No functional changes.

core/providers/azure/azure.go (1)

277-277: LGTM—streaming consolidated to fasthttp client.

The migration from provider.streamClient to provider.client correctly unifies all requests under fasthttp.

Also applies to: 362-362, 487-487

core/providers/ollama.go (1)

96-96: LGTM—streaming now uses shared fasthttp client.

The change from provider.streamClient to provider.client correctly consolidates streaming to fasthttp.

Also applies to: 131-131

core/providers/openai/openai.go (4)

37-43: LGTM: fasthttp.Client configuration looks appropriate.

The client configuration properly applies timeouts from the config and sets reasonable connection pool limits for fasthttp.


366-370: LGTM: Context cancellation checks properly implemented.

All streaming handlers now check for context cancellation before processing each chunk, ensuring graceful shutdown and preventing unnecessary work after cancellation.

Also applies to: 724-729, 1079-1084, 1440-1445, 1710-1715


323-340: LGTM: Error handling properly migrated to fasthttp.

Error handling correctly checks for fasthttp.ErrTimeout and context errors, with proper resource cleanup using deferred Release calls.

Also applies to: 680-697, 1040-1057, 1400-1417, 1670-1687


84-94: LGTM: Function signatures consistently updated to fasthttp.

All handler function signatures have been updated to accept *fasthttp.Client and *fasthttp.Response, maintaining consistency across the provider implementation.

Also applies to: 169-179, 264-275, 516-526, 621-632, 876-886, 980-992, 1206-1216, 1849-1849, 1880-1880

@akshaydeo akshaydeo mentioned this pull request Nov 1, 2025
16 tasks
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from 66b74e9 to 9ab2bae Compare November 3, 2025 01:12
@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch from b99f41c to 3bd3ec4 Compare November 3, 2025 01:12
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from 9ab2bae to ab315b2 Compare November 3, 2025 01:13
@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch 2 times, most recently from 6c2c49e to ea23f31 Compare November 3, 2025 01:44
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from ab315b2 to c99fac8 Compare November 3, 2025 01:44
@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch from ea23f31 to c01e410 Compare November 3, 2025 08:23
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from c99fac8 to a21c098 Compare November 3, 2025 08:23
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
core/providers/vertex/vertex.go (3)

180-196: Fix undefined variable err in error handling.

MakeRequestWithContext returns bifrostErr, but lines 182, 192, and 193 reference an undefined err variable. This will cause a compilation error.

Apply this diff to fix the error handling:

 	_, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
-		if errors.Is(err, context.Canceled) {
+		if bifrostErr.Error != nil && errors.Is(bifrostErr.Error.Error, context.Canceled) {
 			return nil, &schemas.BifrostError{
 				IsBifrostError: false,
 				Error: &schemas.ErrorField{
 					Type:    schemas.Ptr(schemas.RequestCancelled),
 					Message: schemas.ErrRequestCancelled,
-					Error:   err,
+					Error:   bifrostErr.Error.Error,
 				},
 			}
 		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
+		if bifrostErr.Error != nil && (errors.Is(bifrostErr.Error.Error, fasthttp.ErrTimeout) || errors.Is(bifrostErr.Error.Error, context.DeadlineExceeded)) {
+			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, bifrostErr.Error.Error, provider.GetProviderKey())
 		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+		return nil, bifrostErr
 	}

367-383: Fix undefined variable err in error handling.

Same issue as in listModelsByKey: MakeRequestWithContext returns bifrostErr, but lines 369, 379, and 380 reference undefined err.

Apply this diff:

 	latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
-		if errors.Is(err, context.Canceled) {
+		if bifrostErr.Error != nil && errors.Is(bifrostErr.Error.Error, context.Canceled) {
 			return nil, &schemas.BifrostError{
 				IsBifrostError: false,
 				Error: &schemas.ErrorField{
 					Type:    schemas.Ptr(schemas.RequestCancelled),
 					Message: schemas.ErrRequestCancelled,
-					Error:   err,
+					Error:   bifrostErr.Error.Error,
 				},
 			}
 		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
+		if bifrostErr.Error != nil && (errors.Is(bifrostErr.Error.Error, fasthttp.ErrTimeout) || errors.Is(bifrostErr.Error.Error, context.DeadlineExceeded)) {
+			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, bifrostErr.Error.Error, provider.GetProviderKey())
 		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+		return nil, bifrostErr
 	}

661-677: Fix undefined variable err in error handling.

Same issue as in previous functions: MakeRequestWithContext returns bifrostErr, but lines 663, 673, and 674 reference undefined err.

Apply this diff:

 	latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 	if bifrostErr != nil {
-		if errors.Is(err, context.Canceled) {
+		if bifrostErr.Error != nil && errors.Is(bifrostErr.Error.Error, context.Canceled) {
 			return nil, &schemas.BifrostError{
 				IsBifrostError: false,
 				Error: &schemas.ErrorField{
 					Type:    schemas.Ptr(schemas.RequestCancelled),
 					Message: schemas.ErrRequestCancelled,
-					Error:   err,
+					Error:   bifrostErr.Error.Error,
 				},
 			}
 		}
-		if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
+		if bifrostErr.Error != nil && (errors.Is(bifrostErr.Error.Error, fasthttp.ErrTimeout) || errors.Is(bifrostErr.Error.Error, context.DeadlineExceeded)) {
+			return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, bifrostErr.Error.Error, provider.GetProviderKey())
 		}
-		return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+		return nil, bifrostErr
 	}
core/providers/anthropic/anthropic.go (1)

668-685: Critical: Undefined variable err causes compile error.

Line 668 assigns the error return value to bifrostErr, but lines 671, 681, and 684 reference an undefined variable err. This code will not compile.

Since MakeRequestWithContext already handles context.Canceled, fasthttp.ErrTimeout, and context.DeadlineExceeded internally and returns an appropriately wrapped BifrostError, you should return bifrostErr directly.

Apply this diff:

 _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
 if bifrostErr != nil {
   defer fasthttp.ReleaseResponse(resp)
-  if errors.Is(err, context.Canceled) {
-    return nil, &schemas.BifrostError{
-      IsBifrostError: false,
-      Error: &schemas.ErrorField{
-        Type:    schemas.Ptr(schemas.RequestCancelled),
-        Message: schemas.ErrRequestCancelled,
-        Error:   err,
-      },
-    }
-  }
-  if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) {
-    return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequestTimedOut, err, provider.GetProviderKey())
-  }
-  return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
+  return nil, bifrostErr
 }
♻️ Duplicate comments (18)
core/providers/groq.go (1)

33-36: MaxConnsPerHost of 10000 remains a concern.

This issue was already flagged in a previous review: hard-coding 10000 connections per host poses resource-exhaustion risks and should be reduced to a safer default (512–1024) or made configurable.

core/providers/openrouter.go (1)

34-36: Duplicate: MaxConnsPerHost value already flagged in previous reviews.

This concern about the MaxConnsPerHost: 10000 value being too high has already been raised in previous review comments. Please refer to those comments for the detailed analysis and recommendations to reduce this to a safer default (e.g., 1024-2048) or make it configurable.

core/providers/parasail.go (1)

33-36: MaxConnsPerHost=10000 remains excessively high.

This issue was previously flagged. The value should be reduced to 500-1000 to avoid resource exhaustion.

Makefile (1)

228-228: Duplicate: Correct junit-viewer path references (flagged in past review).

These lines still use incorrect relative paths when the working directory is root. After the cd && command; chain terminates (with a semicolon), subsequent commands execute at the root directory. The relative paths ../../$(TEST_REPORTS_DIR) and ../$(TEST_REPORTS_DIR) are invalid from root.

Compare with line 183 and 322, which correctly use $(TEST_REPORTS_DIR) directly. Apply these fixes:

- junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \
+ junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \
- junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \
+ junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \
- junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-all.xml --save=../../$(TEST_REPORTS_DIR)/core-all.html 2>/dev/null || true; \
+ junit-viewer --results=$(TEST_REPORTS_DIR)/core-all.xml --save=$(TEST_REPORTS_DIR)/core-all.html 2>/dev/null || true; \
- junit-viewer --results=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \
+ junit-viewer --results=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \

Also applies to: 249-249, 275-275, 303-303

core/providers/cerebras.go (1)

30-36: Previous timeout and connection limit concerns remain unaddressed.

The previous review flagged that using DefaultRequestTimeoutInSeconds for streaming operations will cause premature termination of long-lived streams, and that MaxConnsPerHost: 10000 risks resource exhaustion. This PR consolidates to a single client but doesn't resolve these issues—streaming methods (lines 88, 123) now use this client with short timeouts.

ui/app/page.tsx (1)

6-6: Missing leading slash in redirect path (previously flagged).

This issue was already identified in a previous review. The redirect path should be "/workspace/logs" with a leading slash to ensure an absolute path.

core/providers/vertex/vertex.go (3)

546-549: Non-Claude streaming should use OAuth token, not key.Value.

Lines 547-548 set the Authorization header using key.Value directly, while other Vertex operations (including Claude streaming above) use getAuthTokenSource to fetch an OAuth token. This inconsistency may lead to authentication failures.

Based on learnings


70-76: Cap fasthttp connections for Vertex

Line 73 sets MaxConnsPerHost to 10,000, which is excessive and may lead to resource exhaustion before Vertex's API rate limits are reached.

Based on learnings


174-174: ListModels should use GET, not POST.

Line 174 issues a POST request to the models endpoint, which is a read operation and should use GET per Vertex API documentation.

Based on learnings

core/providers/anthropic/anthropic.go (1)

77-83: MaxConnsPerHost=10000 remains unaddressed from previous review.

This issue was flagged in a previous review as overwhelming sockets and triggering Anthropic throttling. The value should be reduced to ≈500 or made configurable, but it remains at 10000.

core/providers/bedrock/signer.go (1)

193-197: Use actual Content-Length header value, not body length.

The code computes content-length for signing using len(body), but fasthttp may set a different Content-Length header value when the request is sent (or omit it entirely for certain request types). This mismatch causes signature verification failures.

Based on learnings: Instead of len(body), check if fasthttp has already set the header and use that value:

-	// Include content-length if body is present
-	if len(body) > 0 {
-		headerNames = append(headerNames, "content-length")
-		headerMap["content-length"] = []string{strconv.Itoa(len(body))}
-	}
+	// Include content-length if already set by fasthttp
+	if cl := req.Header.ContentLength(); cl >= 0 {
+		headerNames = append(headerNames, "content-length")
+		headerMap["content-length"] = []string{strconv.Itoa(cl)}
+	}

This ensures the signed header matches the actual header sent with the request.

core/providers/azure/azure.go (1)

38-40: Lower MaxConnsPerHost to prevent resource exhaustion.

Setting MaxConnsPerHost to 10,000 risks file-descriptor exhaustion and can overwhelm downstream services. Consider reducing to a safer default (≈500) or making it configurable via NetworkConfig.

Apply this diff:

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,

Or make it configurable:

+		maxConns := 500
+		if config.NetworkConfig.MaxConnsPerHost > 0 {
+			maxConns = config.NetworkConfig.MaxConnsPerHost
+		}
 		client := &fasthttp.Client{
 			ReadTimeout:     time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
 			WriteTimeout:    time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-			MaxConnsPerHost:     10000,
+			MaxConnsPerHost:     maxConns,
core/providers/cohere/cohere.go (3)

82-86: Lower MaxConnsPerHost to prevent resource exhaustion.

Setting MaxConnsPerHost to 10,000 risks file-descriptor exhaustion. Reduce to ≈500 or make it configurable.

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,

347-347: Remove duplicate Content-Type header setting.

The Content-Type header is set twice: once at line 347 using SetContentType and again at line 353 using Set. The second call is redundant.

Apply this diff:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
 	req.Header.SetContentType("application/json")
 
 	// Set any extra headers from network config
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 
 	// Set headers
-	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key.Value)

Also applies to: 353-353


551-551: Remove duplicate Content-Type header setting.

Similar to ChatCompletionStream, the Content-Type header is set twice: once at line 551 using SetContentType and again at line 555 using Set.

Apply this diff:

 	req.Header.SetMethod(http.MethodPost)
 	req.SetRequestURI(url)
 	req.Header.SetContentType("application/json")
 	providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
 		
 	// Set headers
-	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key.Value)

Also applies to: 555-555

core/providers/gemini/gemini.go (2)

37-39: Lower MaxConnsPerHost to prevent resource exhaustion.

Setting MaxConnsPerHost to 10,000 can overwhelm Gemini's API and exhaust file descriptors. Reduce to ≈500 or make it configurable.

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,

698-700: Release response on HTTP error to prevent leak.

When resp.StatusCode() != fasthttp.StatusOK, the function returns without releasing the response, leaking pooled buffers.

Apply this diff:

 	// Check for HTTP errors
 	if resp.StatusCode() != fasthttp.StatusOK {
+		defer fasthttp.ReleaseResponse(resp)
 		return nil, parseStreamGeminiError(providerName, resp)
 	}
core/providers/openai/openai.go (1)

38-42: Lower MaxConnsPerHost to prevent resource exhaustion.

Setting MaxConnsPerHost to 10,000 risks file-descriptor exhaustion. Reduce to ≈500 or make it configurable.

-		MaxConnsPerHost:     10000,
+		MaxConnsPerHost:     500,
🧹 Nitpick comments (8)
core/providers/parasail.go (1)

38-38: Remove unnecessary empty line.

Apply this diff:

-	
 	// Configure proxy if provided
core/providers/cerebras.go (1)

34-35: Consider making connection pool settings configurable.

MaxIdleConnDuration and MaxConnWaitTimeout are hardcoded. While the current values are reasonable, exposing them via NetworkConfig would allow advanced tuning for different deployment scenarios.

core/providers/vertex/vertex.go (1)

658-659: Remove orphaned comment.

Line 659 contains a duplicate comment about setting extra headers, but SetExtraHeaders was already called on line 643. This appears to be leftover from refactoring.

Apply this diff:

 	req.SetBody(jsonBody)
 
-	// Set any extra headers from network config
-
 	// Make the request
core/providers/bedrock/signer.go (2)

138-154: Consider credential loading performance for high-throughput scenarios.

The synchronous AWS credential loading (config.LoadDefaultConfig and Credentials.Retrieve) can block the request path, especially when credentials are sourced from STS, EC2 metadata, or other network-based providers. For low-to-medium request rates, this is acceptable, but in high-throughput scenarios this could become a bottleneck.

If performance profiling reveals credential loading latency, consider:

  1. Pre-loading credentials during provider initialization
  2. Caching the AWS config and credentials with periodic refresh
  3. Using async credential fetching with a fallback mechanism

177-183: Consider preserving caller-supplied Content-Type and Accept headers.

The code unconditionally sets Content-Type and Accept to application/json, which overwrites any headers the caller may have already configured. While this is appropriate for Bedrock's JSON API, it reduces flexibility if non-JSON content types are needed in the future.

Consider checking if these headers are already set:

-	req.Header.Set("Content-Type", "application/json")
-	req.Header.Set("Accept", "application/json")
+	if len(req.Header.Peek("Content-Type")) == 0 {
+		req.Header.Set("Content-Type", "application/json")
+	}
+	if len(req.Header.Peek("Accept")) == 0 {
+		req.Header.Set("Accept", "application/json")
+	}
core/providers/cohere/cohere.go (1)

402-467: Consider adding context cancellation checks in streaming loops.

The streaming loops in both ChatCompletionStream and ResponsesStream don't check for context cancellation. This can delay handling of client disconnects. Consider adding checks similar to those in openai.go:

 		for scanner.Scan() {
+			// Check if context is done before processing
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
+
 			line := scanner.Text()

Also applies to: 605-681

core/providers/gemini/gemini.go (1)

444-543: Consider adding context cancellation checks in streaming loops.

The streaming loops in both SpeechStream and TranscriptionStream don't check for context cancellation, which can delay handling of client disconnects.

 		for scanner.Scan() {
+			// Check if context is done before processing
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
+
 			line := scanner.Text()

Also applies to: 721-827

core/providers/openai/openai.go (1)

374-374: Consider increasing scanner buffer size for large SSE events.

The default bufio.Scanner token limit (64KB) can cause "token too long" errors when SSE events contain large JSON payloads. Consider increasing the buffer size:

 	scanner := bufio.NewScanner(resp.BodyStream())
+	// Raise max token size to handle large SSE events
+	buf := make([]byte, 0, 64*1024)
+	scanner.Buffer(buf, 10*1024*1024)

Apply this pattern to all streaming handlers (lines 374, 732, 1092, 1452, 1722).

Also applies to: 732-732, 1092-1092, 1452-1452, 1722-1722

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 66b74e9 and a21c098.

⛔ Files ignored due to path filters (5)
  • framework/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
  • transports/go.sum is excluded by !**/*.sum
📒 Files selected for processing (39)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/go.mod (1 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (4 hunks)
  • core/providers/bedrock/chat.go (0 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (7 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (5 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (14 hunks)
  • framework/go.mod (1 hunks)
  • plugins/logging/go.mod (2 hunks)
  • plugins/otel/go.mod (1 hunks)
  • plugins/telemetry/go.mod (2 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
  • transports/go.mod (0 hunks)
  • ui/app/clientLayout.tsx (1 hunks)
  • ui/app/layout.tsx (1 hunks)
  • ui/app/page.tsx (1 hunks)
  • ui/app/workspace/logs/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx (1 hunks)
  • ui/app/workspace/plugins/fragments/pluginFormFragement.tsx (1 hunks)
  • ui/app/workspace/plugins/views/pluginsView.tsx (2 hunks)
  • ui/app/workspace/providers/page.tsx (2 hunks)
  • ui/components/sidebar.tsx (2 hunks)
  • ui/next.config.ts (1 hunks)
  • ui/tsconfig.json (1 hunks)
💤 Files with no reviewable changes (2)
  • core/providers/bedrock/chat.go
  • transports/go.mod
🚧 Files skipped from review as they are similar to previous changes (14)
  • ui/next.config.ts
  • core/providers/utils/utils.go
  • ui/components/sidebar.tsx
  • ui/app/workspace/plugins/fragments/pluginFormFragement.tsx
  • framework/go.mod
  • ui/app/workspace/plugins/views/pluginsView.tsx
  • ui/app/workspace/mcp-clients/page.tsx
  • ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx
  • ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx
  • core/providers/mistral/mistral.go
  • ui/app/workspace/providers/page.tsx
  • core/providers/sgl.go
  • scheams/go.mod
  • .gitignore
🧰 Additional context used
🧬 Code graph analysis (9)
ui/app/layout.tsx (1)
ui/app/clientLayout.tsx (1)
  • ClientLayout (38-51)
ui/app/clientLayout.tsx (2)
ui/hooks/useWebSocket.tsx (1)
  • WebSocketProvider (26-162)
ui/components/themeProvider.tsx (1)
  • ThemeProvider (6-8)
core/providers/cohere/cohere.go (2)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (153-174)
  • NewBifrostOperationError (329-340)
  • NewProviderAPIError (344-359)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (97-145)
  • NewBifrostOperationError (329-340)
  • SetExtraHeaders (153-174)
  • MakeRequestWithContext (37-92)
  • NewProviderAPIError (344-359)
  • HandleProviderResponse (234-278)
core/providers/vertex/types.go (2)
  • VertexListModelsResponse (61-64)
  • VertexEmbeddingResponse (44-46)
core/providers/anthropic/anthropic.go (3)
  • AcquireAnthropicChatResponse (44-48)
  • ReleaseAnthropicChatResponse (51-55)
  • HandleAnthropicChatCompletionStreaming (377-570)
core/providers/openrouter.go (2)
core/schemas/provider.go (2)
  • NetworkConfig (34-42)
  • DefaultRequestTimeoutInSeconds (14-14)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (282-293)
core/providers/openai/openai.go (1)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (153-174)
  • HandleProviderAPIError (208-228)
core/providers/gemini/gemini.go (3)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (153-174)
  • NewBifrostOperationError (329-340)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/anthropic/anthropic.go (2)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (153-174)
  • NewBifrostOperationError (329-340)
  • MakeRequestWithContext (37-92)
  • NewProviderAPIError (344-359)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (329-340)
🪛 checkmake (0.2.2)
Makefile

[warning] 173-173: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (38)
plugins/otel/go.mod (1)

10-10: Clarify the necessity of the gRPC version bump.

The PR objective focuses on migrating streaming to fasthttp, but this file only shows a gRPC patch version bump (v1.75.0 → v1.76.0). This change appears unrelated to the stated PR goals. Please clarify whether this bump is necessary for the fasthttp migration or if it was included incidentally.

core/providers/groq.go (1)

179-179: LGTM! Streaming now uses unified fasthttp client.

The change correctly migrates streaming to use provider.client instead of a separate stream client, aligning with the PR's objective to consolidate all HTTP operations under fasthttp.

core/providers/ollama.go (1)

96-96: LGTM! Streaming unified under single fasthttp client.

The refactoring correctly migrates both TextCompletionStream and ChatCompletionStream to use provider.client instead of a separate stream client, aligning with the PR objective to consolidate HTTP handling. The fasthttp client is safe for concurrent use, so this simplification improves maintainability without introducing correctness or concurrency issues.

Also applies to: 131-131

core/providers/openrouter.go (2)

148-148: Excellent refactor: unified client for streaming operations.

The migration from separate streamClient to the unified provider.client for all streaming operations (TextCompletionStream, ChatCompletionStream, ResponsesStream) is clean and consistent. This successfully achieves the PR objective of consolidating HTTP clients to use fasthttp throughout.

Also applies to: 183-183, 214-214


229-229: Well done: dynamic error messages improve maintainability.

Replacing hardcoded "openrouter" strings with string(provider.GetProviderKey()) in all unsupported operation error messages ensures consistency and maintainability. This change correctly addresses the feedback from previous reviews.

Also applies to: 234-234, 239-239, 244-244, 249-249

core/providers/parasail.go (2)

111-111: LGTM! Streaming now correctly uses the unified fasthttp client.

This change successfully consolidates streaming to use provider.client, eliminating the need for a separate stream client as intended by the PR.


17-22: LGTM! Clean migration to unified fasthttp client.

The provider structure correctly consolidates to a single fasthttp.Client, removing the need for separate stream handling. The implementation is consistent and aligns with the PR objective.

ui/app/workspace/logs/page.tsx (1)

3-6: LGTM!

The import path updates correctly reflect the workspace structure refactor. The paths align with the tsconfig.json alias configuration and are consistent with the broader UI reorganization in this PR.

ui/tsconfig.json (1)

29-36: LGTM!

The path alias updates correctly support the workspace structure refactor. The inclusion of fallback paths (./app/workspace/_fallbacks/*) provides a safety net for missing modules, which is a good practice.

ui/app/clientLayout.tsx (2)

15-36: LGTM!

The AppContent component properly handles configuration fetching, error states, and conditional rendering based on database connectivity. The error handling with toast notifications and the fallback UI states (NotAvailableBanner and FullPageLoader) provide good user experience.


38-51: LGTM!

The ClientLayout component properly nests providers in the correct order. The theme configuration with system theme support and the inclusion of Toaster for notifications are implemented correctly.

ui/app/layout.tsx (3)

8-8: Good performance optimization!

Adding display: "swap" to the font configurations ensures text remains visible during font loading, preventing Flash of Invisible Text (FOIT) and improving perceived performance.

Also applies to: 14-14


23-24: Good performance optimization!

Adding preconnect hints for Google Fonts domains helps establish early connections, reducing latency for font loading. The crossOrigin="anonymous" attribute is correctly set for CORS requirements.


27-27: LGTM!

The simplification to use ClientLayout improves separation of concerns. The root layout now focuses on HTML structure and font configuration, while ClientLayout handles provider composition and application-level concerns.

plugins/logging/go.mod (2)

8-8: Sonic patch version bump is safe.

The upgrade from sonic v1.14.0 to v1.14.1 is a patch release, which typically includes bug fixes and performance improvements without breaking changes. This is low-risk.


33-33: Incorrect assumption about gopkg's origin.

The review comment attributes gopkg v0.1.3 to the sonic upgrade, but verification shows that sonic v1.14.1's dependencies do not include gopkg. The dependency is marked as indirect in go.mod but comes from bifrost (or its transitive dependencies), not sonic. The core recommendation to verify gopkg as "an expected consequence of the sonic upgrade" is based on an incorrect premise and should be disregarded.

Likely an incorrect or invalid review comment.

core/providers/vertex/vertex.go (6)

15-16: LGTM! Imports align with the fasthttp migration.

The addition of fasthttp and oauth2 imports supports the refactoring to use fasthttp for all operations and the token-source-based authentication approach.


88-112: LGTM! Token source approach improves authentication flow.

Refactoring from getAuthClient to getAuthTokenSource cleanly separates token acquisition from HTTP client lifecycle, allowing each request to fetch a fresh token and set the Authorization header independently.


150-178: Token acquisition and request setup look correct.

OAuth token is properly fetched and set in the Authorization header for the ListModels request.


343-365: Request setup with OAuth token looks correct.

Token acquisition and Authorization header setup are properly implemented for the ChatCompletion request.


515-524: LGTM! Claude streaming now uses OAuth token.

Streaming authentication for Claude models correctly fetches an OAuth token via getAuthTokenSource and sets it in the Authorization header, aligning with other Vertex operations.


632-656: Request setup with OAuth token looks correct.

Token acquisition and Authorization header are properly configured for the embedding request.

core/providers/anthropic/anthropic.go (1)

377-570: HandleAnthropicChatCompletionStreaming correctly implements fasthttp streaming.

The streaming implementation properly manages fasthttp request/response lifecycle:

  • Response is released in the goroutine after streaming completes (line 443)
  • Error paths correctly defer response release before returning (lines 414, 433)
  • The err variable from client.Do (line 412) is properly scoped and used throughout error handling
tests/core-providers/config/account.go (2)

271-271: LGTM! Consistent application of the concurrency constant.

The constant is applied uniformly across all 15 provider configurations. The implementation is correct and maintains the existing BufferSize values, indicating that provider-specific tuning is preserved where appropriate.

Also applies to: 285-285, 311-311, 324-324, 337-337, 350-350, 363-363, 377-377, 390-390, 403-403, 417-417, 430-430, 443-443, 456-456, 469-469


16-17: Good refactoring: constant extraction and consistent application verified.

The refactoring is sound. The Concurrency constant is properly extracted and consistently applied across all 15 provider configurations. This improves maintainability and eliminates magic numbers.

Standardizing concurrency to a single value (4) for all test providers is reasonable for consistency and aligns with typical test configuration practices. The concurrent handler limit of 4 is conservative for test scenarios and sufficient for the test suite. Other provider-specific tuning (e.g., BufferSize) remains varied where needed.

If the previous values (10, 2, 1) had provider-specific reasons, consider documenting that context in a comment to prevent future changes that might reintroduce per-provider variation without justification.

plugins/telemetry/go.mod (2)

18-31: AWS SDK v2 patch version bump maintains consistency across submodules.

The coordinated update of the AWS SDK v2 family from v1.39.2 to v1.39.3 (lines 18–31) is good dependency hygiene, ensuring all related modules stay in sync. The smithy-go v1.23.1 is correctly marked as indirect here (line 31), matching the version now declared as direct in core/go.mod.


109-110: I'll search for information about grpc v1.76.0 to confirm compatibility and check for breaking changes.

Review comment is inaccurate; grpc version bump is intentional, not outside PR scope.

gRPC v1.76.0 contains refinements, improvements, and bug fixes. The xDS ClusterResolverLb conversion resolves unnecessary reconnections introduced in v1.75.0 when using weighted_round_robin, and a Netty version detection fix in v1.76.0 resolves Netty 4.1.111 corruption that v1.75.0 exposed—these are targeted bug fixes, not breaking changes.

The git history shows commit 39bfdde added "gRPC support for Weaviate and bump versions," confirming the grpc v1.76.0 update is deliberate and part of this PR's scope, not routine maintenance outside the fasthttp migration. The genproto snapshot update is a transitive dependency automatically resolved by Go modules alongside the grpc update. No compatibility concerns identified.

core/go.mod (1)

11-11: smithy-go v1.23.1 promotion is justified and verified as compatible.

The promotion of github.com/aws/smithy-go v1.23.1 to a direct dependency in core/go.mod is appropriate and minimal. Verification confirms:

  • Version is consistent across all modules at v1.23.1 (with tests/core-chatbot's v1.22.5 automatically resolved to v1.23.1 via module inheritance)
  • No documented breaking changes exist between v1.22.5 and v1.23.1
  • Module graph is consistent and properly structured
core/providers/bedrock/signer.go (7)

56-70: LGTM! Standard AWS SigV4 key derivation.

The HMAC-SHA256 computation and AWS signing key derivation follow the official AWS Signature Version 4 specification correctly.


101-125: LGTM! Correct header value normalization.

The space-stripping logic correctly implements AWS SigV4 header canonicalization requirements by trimming and collapsing excess spaces. The use of strings.Builder with pre-allocation is efficient.


156-171: LGTM! Correct AWS SigV4 timestamp and URI handling.

The time formatting, URI parsing, and path escaping follow AWS Signature Version 4 requirements correctly.


199-217: LGTM! Correct header iteration using fasthttp All() method.

The header collection logic correctly uses req.Header.All() (available in fasthttp v1.63.0+) and properly filters ignored headers and duplicates.

Based on learnings.


219-240: LGTM! Correct AWS SigV4 canonical headers construction.

The canonical headers string is built according to AWS Signature Version 4 specification: sorted header names, normalized values with excess spaces removed, and proper formatting.


261-286: LGTM! Correct AWS SigV4 canonical request and string-to-sign construction.

The canonical request and string-to-sign follow the AWS Signature Version 4 specification correctly, with proper formatting, hashing, and credential scope construction.


288-303: LGTM! Correct signature calculation and authorization header.

The signature computation using the cached signing key and the Authorization header construction follow the AWS Signature Version 4 specification correctly.

core/providers/azure/azure.go (1)

285-285: LGTM: Streaming methods correctly migrated to fasthttp.

All streaming methods now correctly pass provider.client to the shared streaming handlers, eliminating the need for a separate stream client.

Also applies to: 370-370, 495-495

core/providers/openai/openai.go (2)

385-389: LGTM: Context cancellation checks added.

The context cancellation checks in all streaming loops properly handle client disconnects and prevent unnecessary processing.

Also applies to: 743-748, 1098-1103, 1459-1464, 1729-1734


269-269: LGTM: Fasthttp migration executed correctly.

The migration to fasthttp is well-structured with:

  • Consistent handler signature updates
  • Proper request/response lifecycle management
  • Updated error handling for fasthttp types

Also applies to: 285-285, 626-626, 642-642, 984-984, 1001-1001

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
ui/components/sidebar.tsx (1)

303-303: WebSocket indicator check is broken after route migration.

The hardcoded check item.url === "/logs" no longer matches since the URL was updated to /workspace/logs. This prevents the green animated pulse indicator from displaying when the WebSocket is connected.

Apply this diff to fix the check:

-							{!hasSubItems && item.url === "/logs" && isWebSocketConnected && (
+							{!hasSubItems && item.url === "/workspace/logs" && isWebSocketConnected && (
core/providers/vertex/vertex.go (3)

180-196: Wrong error variable; return bifrostErr directly

You check/return using err (from earlier scope) instead of bifrostErr from MakeRequestWithContext.

- _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

366-383: Same bug: wrong error variable after MakeRequestWithContext

Use bifrostErr; do not reference err here.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }

632-677: Same bug in Embedding: wrong error variable after MakeRequestWithContext

Return bifrostErr directly.

- latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ latency, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   return nil, bifrostErr
+ }
core/providers/gemini/gemini.go (1)

944-969: Compile-time bug: use StatusCode() method, not field

resp.StatusCode is a method, not an int field.

- return providerUtils.NewBifrostOperationError(fmt.Sprintf("Gemini streaming error (HTTP %d): %v", resp.StatusCode, rawResponse), fmt.Errorf("HTTP %d", resp.StatusCode), providerName)
+ return providerUtils.NewBifrostOperationError(
+   fmt.Sprintf("Gemini streaming error (HTTP %d): %v", resp.StatusCode(), rawResponse),
+   fmt.Errorf("HTTP %d", resp.StatusCode()),
+   providerName)
core/providers/anthropic/anthropic.go (1)

667-685: Wrong error variable after MakeRequestWithContext

Use bifrostErr; do not reference err here.

- _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
- if bifrostErr != nil {
-   defer fasthttp.ReleaseResponse(resp)
-   if errors.Is(err, context.Canceled) { ... }
-   if errors.Is(err, fasthttp.ErrTimeout) || errors.Is(err, context.DeadlineExceeded) { ... }
-   return nil, providerUtils.NewBifrostOperationError(schemas.ErrProviderRequest, err, provider.GetProviderKey())
- }
+ _, bifrostErr := providerUtils.MakeRequestWithContext(ctx, provider.client, req, resp)
+ if bifrostErr != nil {
+   fasthttp.ReleaseResponse(resp)
+   return nil, bifrostErr
+ }
core/providers/openai/openai.go (1)

1350-1366: Use-after-free risk: copy audio bytes before releasing resp

resp.Body() becomes invalid after releasing the response. Copy before return.

- audioData := resp.Body()
+ audioData := append([]byte(nil), resp.Body()...)
♻️ Duplicate comments (17)
core/providers/openrouter.go (1)

34-34: MaxConnsPerHost value remains excessive (duplicate concern).

This value of 10000 has been flagged in previous reviews as potentially causing resource exhaustion. Past reviews recommended reducing to 1024-2048 or making it configurable. While this is part of a broader pattern across all providers, it remains a concern for production stability.

ui/app/page.tsx (1)

6-6: Missing leading slash in redirect path.

The redirect path should be "/workspace/logs" with a leading slash for absolute path clarity.

Apply this diff:

-	redirect("workspace/logs");
+	redirect("/workspace/logs");
core/providers/bedrock/signer.go (1)

193-197: Content-Length in signed headers may drift from actual request header.

This issue was flagged in a previous review but not yet addressed. The signed headers include "content-length": strconv.Itoa(len(body)), but fasthttp sets the actual Content-Length header when the request is sent. If fasthttp determines a different content length (e.g., due to chunked encoding, compression, or internal buffering), the signature will be invalid, causing authentication failures.

Apply this fix to use the content-length that fasthttp will actually send:

-	// Include content-length if body is present
-	if len(body) > 0 {
+	// Include content-length if fasthttp has set it
+	if cl := req.Header.ContentLength(); cl >= 0 {
 		headerNames = append(headerNames, "content-length")
-		headerMap["content-length"] = []string{strconv.Itoa(len(body))}
+		headerMap["content-length"] = []string{strconv.Itoa(cl)}
 	}

Alternatively, explicitly set the Content-Length header before this block:

+	// Set content-length explicitly so it's included in signature
+	if len(body) > 0 {
+		req.Header.SetContentLength(len(body))
+	}
+
 	// Include content-length if body is present
 	if len(body) > 0 {
 		headerNames = append(headerNames, "content-length")
 		headerMap["content-length"] = []string{strconv.Itoa(len(body))}
 	}
Makefile (2)

228-228: ⚠️ Relative paths in junit-viewer invocations are incorrect — regression from previous review.

After the cd tests/core-providers && gotestsum ... ; command completes, the semicolon ends that command and subsequent commands (like the if statement) execute from the root directory, not from tests/core-providers/. Therefore, ../../$(TEST_REPORTS_DIR) resolves incorrectly from the root.

These paths were flagged in a previous review (marked as addressed), but the issue persists:

  • Line 228: should be $(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml
  • Line 249: should be $(TEST_REPORTS_DIR)/core-$(PROVIDER).xml
  • Line 275: should be $(TEST_REPORTS_DIR)/core-all.xml

Apply this diff to fix the paths in lines 228, 249, and 275:

-					junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER)-$(TESTCASE).html 2>/dev/null || true; \
-					junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=../../$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/core-$(PROVIDER).xml --save=$(TEST_REPORTS_DIR)/core-$(PROVIDER).html 2>/dev/null || true; \
-				junit-viewer --results=../../$(TEST_REPORTS_DIR)/core-all.xml --save=../../$(TEST_REPORTS_DIR)/core-all.html 2>/dev/null || true; \
+				junit-viewer --results=$(TEST_REPORTS_DIR)/core-all.xml --save=$(TEST_REPORTS_DIR)/core-all.html 2>/dev/null || true; \

Also applies to: 249-249, 275-275


303-303: ⚠️ junit-viewer path in test-plugins is incorrect.

Similar to the test-core issues, the junit-viewer invocation at line 303 executes from the root directory (after the cd - returns in the preceding && chain). The path ../$(TEST_REPORTS_DIR) is incorrect; it should be $(TEST_REPORTS_DIR).

Apply this diff to fix line 303:

-					junit-viewer --results=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=../$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \
+					junit-viewer --results=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.xml --save=$(TEST_REPORTS_DIR)/plugin-$$plugin_name.html 2>/dev/null || true; \
core/providers/vertex/vertex.go (3)

546-555: Non-Claude streaming must use OAuth token, not key.Value

Align with other Vertex calls to avoid 401s.

- authHeader := map[string]string{}
- if key.Value != "" {
-   authHeader["Authorization"] = "Bearer " + key.Value
- }
+ authHeader := map[string]string{}
+ tokenSource, err := getAuthTokenSource(key)
+ if err != nil {
+   return nil, providerUtils.NewBifrostOperationError("error creating auth token source", err, schemas.Vertex)
+ }
+ token, err := tokenSource.Token()
+ if err != nil {
+   return nil, providerUtils.NewBifrostOperationError("error getting token", err, schemas.Vertex)
+ }
+ authHeader["Authorization"] = "Bearer " + token.AccessToken

70-76: Reduce MaxConnsPerHost and use streaming-safe timeouts

10k per-host conns + request-style timeouts are dangerous for SSE and infra.

 client := &fasthttp.Client{
-  ReadTimeout:         time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  WriteTimeout:        time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  MaxConnsPerHost:     10000,
+  ReadTimeout:         0,
+  WriteTimeout:        0,
+  MaxConnsPerHost:     512,
   MaxIdleConnDuration: 60 * time.Second,
   MaxConnWaitTimeout:  10 * time.Second,
 }

174-181: ListModels must use GET (not POST)

POST will likely 405/400 on Vertex models.list.

- req.Header.SetMethod(http.MethodPost)
+ req.Header.SetMethod(http.MethodGet)
- req.Header.SetContentType("application/json")
+ req.Header.Del("Content-Type")
core/providers/openai/openai.go (2)

374-376: Raise bufio.Scanner limits for SSE

Prevent “token too long” on large chunks.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

Also applies to: 732-733, 1092-1092, 1452-1453, 1722-1722


1663-1687: TranscriptionStream: preserve multipart Content-Type and deduplicate header setup

Currently overrides multipart with application/json and applies extra headers twice.

- req := fasthttp.AcquireRequest()
+ req := fasthttp.AcquireRequest()
   resp := fasthttp.AcquireResponse()
   resp.StreamBody = true
   defer fasthttp.ReleaseRequest(req)
@@
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
- req.Header.SetMethod(http.MethodPost)
- req.SetRequestURI(url)
- req.Header.SetContentType("application/json")
-
- // Set any extra headers from network config
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+ req.Header.SetMethod(http.MethodPost)
+ req.SetRequestURI(url)
+ // Apply provider extra headers once
+ providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
@@
- req.SetBody(body.Bytes())
+ req.SetBody(body.Bytes())
core/providers/cerebras.go (1)

31-36: Streaming-safe timeouts and connection cap

Using DefaultRequestTimeout for Read/Write will prematurely kill long-lived SSE streams; MaxConnsPerHost=10000 risks FD exhaustion. Set streaming-safe timeouts (0) or separate config, and lower the conn cap (~512) or make it configurable.

Apply:

 client := &fasthttp.Client{
-  ReadTimeout:     time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  WriteTimeout:    time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  MaxConnsPerHost:     10000,
+  // Disable per-connection read/write timeouts to support long SSE
+  ReadTimeout:         0,
+  WriteTimeout:        0,
+  // Safer default; consider making configurable
+  MaxConnsPerHost:     512,
   MaxIdleConnDuration: 60 * time.Second,
   MaxConnWaitTimeout:  10 * time.Second,
 }
core/providers/cohere/cohere.go (3)

81-87: Lower connection cap and use streaming-safe timeouts

Avoid 10k per-host connections and request-style timeouts for SSE.

 client := &fasthttp.Client{
-  ReadTimeout:         time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  WriteTimeout:        time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  MaxConnsPerHost:     10000,
+  ReadTimeout:         0,
+  WriteTimeout:        0,
+  MaxConnsPerHost:     512,
   MaxIdleConnDuration: 60 * time.Second,
   MaxConnWaitTimeout:  10 * time.Second,
 }

338-360: Fix duplicate Content-Type, raise scanner limits, and correct chunk index

  • Remove redundant Content-Type (already set via SetContentType).
  • Increase bufio.Scanner buffer to avoid 64KB token errors.
  • Avoid double-incrementing chunkIndex (currently increments at Lines 427 and 445).
@@
- req.Header.SetContentType("application/json")
+ req.Header.SetContentType("application/json")
@@
- // Set any extra headers from network config
+ // Set any extra headers from network config
  providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
@@
- // Set headers
- req.Header.Set("Content-Type", "application/json")
+ // Set headers
  req.Header.Set("Authorization", "Bearer "+key.Value)
  req.Header.Set("Accept", "text/event-stream")
  req.Header.Set("Cache-Control", "no-cache")
@@
- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)
@@
- chunkIndex++
- ...
- lastChunkTime = time.Now()
- chunkIndex++
+ chunkIndex++
+ ...
+ lastChunkTime = time.Now()

Also applies to: 394-451


543-560: ResponsesStream: remove duplicate Content-Type and raise scanner buffer

Same issues as chat streaming: duplicate Content-Type and default 64KB scanner limit.

@@
- req.Header.SetContentType("application/json")
- providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
+ req.Header.SetContentType("application/json")
+ providerUtils.SetExtraHeaders(req, provider.networkConfig.ExtraHeaders, nil)
@@
- // Set headers
- req.Header.Set("Content-Type", "application/json")
+ // Set headers
  req.Header.Set("Authorization", "Bearer "+key.Value)
  req.Header.Set("Accept", "text/event-stream")
  req.Header.Set("Cache-Control", "no-cache")
@@
- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)

Also applies to: 596-671

core/providers/gemini/gemini.go (2)

34-40: Streaming-safe timeouts and safer conn cap

Set Read/Write to 0 for SSE and reduce MaxConnsPerHost (~512) or make configurable.

 client := &fasthttp.Client{
-  ReadTimeout:     time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  WriteTimeout:    time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  MaxConnsPerHost:     10000,
+  ReadTimeout:         0,
+  WriteTimeout:        0,
+  MaxConnsPerHost:     512,
   MaxIdleConnDuration: 60 * time.Second,
   MaxConnWaitTimeout:  10 * time.Second,
 }

698-700: Release pooled response on HTTP error

Avoid leaking fasthttp.Response on non-200.

- if resp.StatusCode() != fasthttp.StatusOK {
-   return nil, parseStreamGeminiError(providerName, resp)
- }
+ if resp.StatusCode() != fasthttp.StatusOK {
+   err := parseStreamGeminiError(providerName, resp)
+   fasthttp.ReleaseResponse(resp)
+   return nil, err
+ }
core/providers/anthropic/anthropic.go (1)

77-83: Lower MaxConnsPerHost; zero out timeouts for SSE

Same rationale as other providers: streaming needs 0 read/write timeouts; avoid 10k per-host conns.

 client := &fasthttp.Client{
-  ReadTimeout:         time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  WriteTimeout:        time.Second * time.Duration(config.NetworkConfig.DefaultRequestTimeoutInSeconds),
-  MaxConnsPerHost:     10000,
+  ReadTimeout:         0,
+  WriteTimeout:        0,
+  MaxConnsPerHost:     512,
   MaxIdleConnDuration: 60 * time.Second,
   MaxConnWaitTimeout:  10 * time.Second,
 }
🧹 Nitpick comments (8)
core/providers/groq.go (1)

39-42: Remove commented-out dead code.

This commented-out code references groqResponsePool which doesn't appear to be used anywhere in the current implementation. Dead code should be removed to keep the codebase clean.

Apply this diff to remove the dead code:

-	// // Pre-warm response pools
-	// for range config.ConcurrencyAndBufferSize.Concurrency {
-	// 	groqResponsePool.Put(&schemas.BifrostResponse{})
-	// }
-
core/providers/bedrock/signer.go (1)

138-154: Credential loading on every request adds unnecessary overhead.

config.LoadDefaultConfig performs file I/O, environment variable parsing, and potentially network calls to credential providers (IMDS, STS, etc.). Calling this on every signing request when credentials aren't provided introduces significant latency.

Move credential resolution to provider initialization:

In the Bedrock provider struct (caller of this function):

type BedrockProvider struct {
    // ... existing fields
    awsAccessKey   string
    awsSecretKey   string
    awsSessionToken *string
    region         string
}

func NewBedrockProvider(ctx context.Context, region string) (*BedrockProvider, error) {
    cfg, err := config.LoadDefaultConfig(ctx, config.WithRegion(region))
    if err != nil {
        return nil, err
    }
    creds, err := cfg.Credentials.Retrieve(ctx)
    if err != nil {
        return nil, err
    }
    
    var sessionToken *string
    if creds.SessionToken != "" {
        st := creds.SessionToken
        sessionToken = &st
    }
    
    return &BedrockProvider{
        awsAccessKey:    creds.AccessKeyID,
        awsSecretKey:    creds.SecretAccessKey,
        awsSessionToken: sessionToken,
        region:          region,
    }, nil
}

Then require accessKey and secretKey parameters in signAWSRequestFastHTTP to always be provided by the caller.

Makefile (2)

173-199: ✅ LGTM — test target enhancements enable structured reporting.

The target correctly uses gotestsum and handles HTML generation gracefully with CI detection.

However, the test target body (26 lines) exceeds the checkmake recommended limit of 5 lines. While the added reporting features justify the complexity, consider extracting the conditional HTML generation logic into a dedicated helper target or script for better maintainability.


201-287: Optional refactor: Extract duplicated HTML generation logic in test-core.

The test-core target contains nested conditionals (3+ levels deep) with repeated HTML generation patterns appearing three times (lines 225–239, 246–260, 272–286). Consider extracting this into a helper target like generate-test-report that accepts report parameters to reduce duplication and improve maintainability.

This is a minor concern; functional behavior is correct once path issues (noted above) are fixed.

core/providers/vertex/vertex.go (2)

34-55: Remove unused client pool and helpers

vertexClientPool/getClientKey/removeVertexClient remain but no longer back a cache; delete to reduce confusion and dead code.


146-148: Latency is never accumulated

totalLatency is declared but unused; sum per-page latency or remove the field.

Also applies to: 236-237

core/providers/anthropic/anthropic.go (1)

455-456: Increase scanner token limit for SSE

Default 64KB can truncate long lines. Raise to ~10MB.

- scanner := bufio.NewScanner(resp.BodyStream())
+ scanner := bufio.NewScanner(resp.BodyStream())
+ buf := make([]byte, 0, 64*1024)
+ scanner.Buffer(buf, 10*1024*1024)
core/providers/openai/openai.go (1)

321-339: Optional: avoid duplicate Content-Type assignments

You set Content-Type via SetContentType and again in headers. Keep one to avoid duplicate header values.

Also applies to: 684-696, 1039-1057, 1395-1417

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 66b74e9 and a21c098.

⛔ Files ignored due to path filters (5)
  • framework/go.sum is excluded by !**/*.sum
  • plugins/logging/go.sum is excluded by !**/*.sum
  • plugins/otel/go.sum is excluded by !**/*.sum
  • plugins/telemetry/go.sum is excluded by !**/*.sum
  • transports/go.sum is excluded by !**/*.sum
📒 Files selected for processing (39)
  • .gitignore (1 hunks)
  • Makefile (3 hunks)
  • core/go.mod (1 hunks)
  • core/providers/anthropic/anthropic.go (10 hunks)
  • core/providers/azure/azure.go (4 hunks)
  • core/providers/bedrock/chat.go (0 hunks)
  • core/providers/bedrock/signer.go (1 hunks)
  • core/providers/cerebras.go (5 hunks)
  • core/providers/cohere/cohere.go (7 hunks)
  • core/providers/gemini/gemini.go (9 hunks)
  • core/providers/groq.go (2 hunks)
  • core/providers/mistral/mistral.go (2 hunks)
  • core/providers/ollama.go (3 hunks)
  • core/providers/openai/openai.go (25 hunks)
  • core/providers/openrouter.go (5 hunks)
  • core/providers/parasail.go (2 hunks)
  • core/providers/sgl.go (5 hunks)
  • core/providers/utils/utils.go (1 hunks)
  • core/providers/vertex/vertex.go (14 hunks)
  • framework/go.mod (1 hunks)
  • plugins/logging/go.mod (2 hunks)
  • plugins/otel/go.mod (1 hunks)
  • plugins/telemetry/go.mod (2 hunks)
  • scheams/go.mod (1 hunks)
  • tests/core-providers/config/account.go (16 hunks)
  • transports/go.mod (0 hunks)
  • ui/app/clientLayout.tsx (1 hunks)
  • ui/app/layout.tsx (1 hunks)
  • ui/app/page.tsx (1 hunks)
  • ui/app/workspace/logs/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/page.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx (1 hunks)
  • ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx (1 hunks)
  • ui/app/workspace/plugins/fragments/pluginFormFragement.tsx (1 hunks)
  • ui/app/workspace/plugins/views/pluginsView.tsx (2 hunks)
  • ui/app/workspace/providers/page.tsx (2 hunks)
  • ui/components/sidebar.tsx (2 hunks)
  • ui/next.config.ts (1 hunks)
  • ui/tsconfig.json (1 hunks)
💤 Files with no reviewable changes (2)
  • core/providers/bedrock/chat.go
  • transports/go.mod
🚧 Files skipped from review as they are similar to previous changes (20)
  • scheams/go.mod
  • ui/app/workspace/mcp-clients/views/mcpClientSheet.tsx
  • ui/app/workspace/mcp-clients/views/mcpClientsTable.tsx
  • ui/tsconfig.json
  • plugins/logging/go.mod
  • ui/app/workspace/mcp-clients/page.tsx
  • core/providers/parasail.go
  • ui/app/workspace/providers/page.tsx
  • core/go.mod
  • ui/app/workspace/logs/page.tsx
  • core/providers/mistral/mistral.go
  • tests/core-providers/config/account.go
  • core/providers/utils/utils.go
  • framework/go.mod
  • core/providers/sgl.go
  • core/providers/ollama.go
  • ui/app/layout.tsx
  • .gitignore
  • ui/app/workspace/plugins/views/pluginsView.tsx
  • plugins/otel/go.mod
🧰 Additional context used
🧬 Code graph analysis (9)
core/providers/openrouter.go (2)
core/schemas/provider.go (2)
  • NetworkConfig (34-42)
  • DefaultRequestTimeoutInSeconds (14-14)
core/providers/utils/utils.go (1)
  • NewUnsupportedOperationError (282-293)
ui/app/clientLayout.tsx (2)
ui/hooks/useWebSocket.tsx (1)
  • WebSocketProvider (26-162)
ui/components/themeProvider.tsx (1)
  • ThemeProvider (6-8)
core/providers/cohere/cohere.go (2)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (153-174)
  • NewBifrostOperationError (329-340)
  • NewProviderAPIError (344-359)
core/providers/bedrock/signer.go (1)
core/providers/utils/utils.go (1)
  • NewBifrostOperationError (329-340)
core/providers/gemini/gemini.go (3)
core/providers/utils/utils.go (2)
  • SetExtraHeaders (153-174)
  • NewBifrostOperationError (329-340)
core/schemas/provider.go (2)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/gemini/types.go (1)
  • GeminiGenerationError (1239-1251)
core/providers/anthropic/anthropic.go (2)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (4)
  • SetExtraHeaders (153-174)
  • NewBifrostOperationError (329-340)
  • MakeRequestWithContext (37-92)
  • NewProviderAPIError (344-359)
core/providers/openai/openai.go (2)
core/schemas/provider.go (3)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
core/providers/utils/utils.go (3)
  • SetExtraHeaders (153-174)
  • NewBifrostOperationError (329-340)
  • HandleProviderAPIError (208-228)
core/providers/vertex/vertex.go (4)
core/schemas/provider.go (4)
  • DefaultRequestTimeoutInSeconds (14-14)
  • ErrProviderRequestTimedOut (22-22)
  • ErrProviderRequest (24-24)
  • ErrProviderResponseUnmarshal (25-25)
core/providers/utils/utils.go (6)
  • ConfigureProxy (97-145)
  • NewBifrostOperationError (329-340)
  • SetExtraHeaders (153-174)
  • MakeRequestWithContext (37-92)
  • NewProviderAPIError (344-359)
  • HandleProviderResponse (234-278)
core/providers/vertex/types.go (2)
  • VertexListModelsResponse (61-64)
  • VertexEmbeddingResponse (44-46)
core/providers/anthropic/anthropic.go (1)
  • HandleAnthropicChatCompletionStreaming (377-570)
ui/next.config.ts (1)
ui/scripts/fix-paths.js (1)
  • path (4-4)
🪛 checkmake (0.2.2)
Makefile

[warning] 173-173: Target body for "test" exceeds allowed length of 5 (26).

(maxbodylength)

⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (17)
ui/app/workspace/plugins/fragments/pluginFormFragement.tsx (1)

3-3: Approved: Import path migration is complete and consistent.

Verification confirms no remaining old imports from @/app/logs/views/codeEditor exist. The change in pluginFormFragement.tsx aligns with the completed workspace-based reorganization, and the new import path @/app/workspace/logs/views/codeEditor is used consistently across all files that reference CodeEditor from outside the logs directory.

core/providers/openrouter.go (2)

148-148: Excellent consolidation to single fasthttp client.

The refactor successfully eliminates the separate streamClient and uses the shared provider.client for all streaming operations. This simplifies the code and achieves the PR objective of using fasthttp consistently across both streaming and non-streaming paths.

Also applies to: 183-183, 214-214


229-229: Good practice using dynamic provider key in error messages.

Using string(provider.GetProviderKey()) instead of hardcoded strings ensures error messages remain accurate and improves maintainability. This pattern is applied consistently across all unsupported operations.

Also applies to: 234-234, 239-239, 244-244, 249-249

core/providers/groq.go (1)

175-189: LGTM! Successfully consolidated to single fasthttp client.

The change correctly implements the PR objective by using provider.client for streaming instead of a separate streamClient. This consolidation simplifies the provider implementation while maintaining the same streaming functionality through the shared OpenAI-compatible handler.

ui/next.config.ts (1)

26-31: LGTM! Path aliases correctly updated for workspace structure.

The webpack alias configuration properly updates both @enterprise and @schemas paths to the new workspace-based directory structure using path.join() for cross-platform compatibility.

ui/app/clientLayout.tsx (1)

1-52: LGTM! Well-structured client layout with appropriate provider composition.

The implementation correctly:

  • Uses the 'use client' directive for client-side functionality
  • Handles core config fetching with proper error handling via toast notifications
  • Implements conditional rendering logic based on database connection status
  • Composes providers in the correct order (ProgressProvider → ThemeProvider → ReduxProvider → NuqsAdapter → AppContent)
ui/components/sidebar.tsx (1)

73-191: Route migration to workspace paths is consistent.

All navigation URLs have been correctly updated to use the /workspace prefix, maintaining consistency across main items and subItems.

core/providers/bedrock/signer.go (2)

101-125: LGTM! Clean implementation of AWS SigV4 header value normalization.

The stripExcessSpaces function correctly implements the AWS SigV4 requirement to trim and collapse whitespace in header values. The early-return optimization for strings without double spaces is a nice touch.


127-304: Well-structured AWS SigV4 implementation for fasthttp.

The signAWSRequestFastHTTP function correctly implements the AWS Signature Version 4 signing process and integrates well with the fasthttp API. The signing flow properly follows the canonical request → string to sign → signature calculation sequence, uses appropriate AWS SDK utilities (like httpbinding.EscapePath), and maintains consistency with the rest of the codebase by returning BifrostError.

A few critical and major issues flagged in separate comments should be addressed, but the overall structure and logic are sound.

Makefile (5)

37-43: ✅ LGTM — Help text additions are clear and helpful.

The new environment variable documentation is well-formatted and consistent with the existing style.


62-82: ✅ LGTM — Install targets with defensive CI detection are well-implemented.

The install-gotestsum and install-junit-viewer targets follow appropriate patterns: dependency checking, graceful installation, and CI-aware behavior.


143-143: ✅ LGTM — Clean targets appropriately segregated.

The clean target now handles test reports, and the new clean-test-reports target provides selective cleanup.

Also applies to: 146-149


151-171: ✅ LGTM — generate-html-reports target is well-guarded and user-friendly.

The target includes appropriate validation, informative error messages, and helpful output formatting.


314-337: junit-viewer correctly supports directory input at line 322.

The code appropriately uses junit-viewer --results=$(TEST_REPORTS_DIR) to pass a directory, and junit-viewer is confirmed to accept either a single XML file or a folder (including subfolders) for the --results argument. No issues found.

plugins/telemetry/go.mod (3)

109-110: Verify Go version requirement and run integration tests for gRPC v1.76.0 with genproto snapshot.

gRPC v1.76.0 (Oct 2025) requires Go ≥ 1.24. While there is no inherent API lock between grpc-go and go-genproto, go-genproto snapshots have occasionally caused transitive incompatibilities through go.mod/toolchain changes. Before merging:

  • Confirm the telemetry plugin's Go toolchain is version 1.24 or later.
  • Update both google.golang.org/grpc and google.golang.org/genproto together and run your build/tests — some genproto snapshots may change package paths or add go.mod constraints that require small adjustments.

3-3: ****

Go 1.24.0 is a valid Go release, and the go directive syntax "go 1.24.0" is valid in go.mod files. The original review comment incorrectly flagged the version syntax as potentially problematic. The go.mod file is correct as written.

Likely an incorrect or invalid review comment.


18-31: Verification complete: AWS SDK for Go v2 v1.39.3 is confirmed stable and includes SigV4 support.

v1.39.3 is a published release of the AWS SDK for Go v2, and AWS SDK for Go v2 (including v1.39.3) supports AWS Signature Version 4 signing as the default auth scheme for most services. The PR's dependency updates align with the stated objective of adding SigV4 signing support.

@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from a21c098 to d475f2b Compare November 3, 2025 09:17
@akshaydeo akshaydeo force-pushed the 10-26-separate_schema_package_wip branch from c01e410 to 49fd56d Compare November 3, 2025 09:17
@akshaydeo akshaydeo force-pushed the 10-27-move_all_providers_to_use_fasthttp_even_for_streaming branch from a6298d1 to ddcfbfa Compare November 3, 2025 09:34
@akshaydeo akshaydeo closed this Nov 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants