-
Notifications
You must be signed in to change notification settings - Fork 33
Add Model based load balancing policies #645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request introduces two new load-balancing policies for LLM models: Model Round-Robin and Model Weighted Round-Robin. The implementation includes comprehensive documentation, YAML policy definitions, Go implementations with request/response handling and model suspension logic, and updates to the gateway controller transformer to embed template parameters into policy configurations. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Gateway as Gateway<br/>(Policy)
participant ModelSelector as Model Selection<br/>Logic
participant RequestModifier as Request<br/>Modifier
participant Models as LLM Models
Client->>Gateway: HTTP Request
Gateway->>ModelSelector: Identify model from request<br/>(payload/header/query/path)
ModelSelector->>ModelSelector: Extract model identifier<br/>using configured location
ModelSelector->>ModelSelector: Select next available model<br/>(round-robin or weighted)
ModelSelector-->>Gateway: Selected model
Gateway->>RequestModifier: Route to selected model
RequestModifier->>RequestModifier: Modify request<br/>(update payload/header/query/path)
RequestModifier-->>Gateway: Modified request
Gateway->>Models: Forward to selected model
Models-->>Gateway: Response (status, body)
Gateway->>Gateway: Check response status<br/>(>= 500 or 429?)
alt Failure Response
Gateway->>ModelSelector: Suspend failed model<br/>for configured duration
ModelSelector->>ModelSelector: Mark model unavailable
end
Gateway-->>Client: Return response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @docs/ai-gateway/llm/load-balancing/model-weighted-round-robin.md:
- Line 32: Fix the typo in the documentation table: replace "atleast 1" with "at
least 1" in the `weight` description (the table cell that reads "Must be atleast
1."). Update the sentence so it reads "Must be at least 1." ensuring spacing and
capitalization match the surrounding style.
In @gateway/policies/model-round-robin/v0.1.0/roundrobin.go:
- Around line 458-489: modifyModelInPayload currently swallows JSON
unmarshal/set/marshal errors and returns empty modifications while
MetadataKeySelectedModel may already be set, causing inconsistency; change
modifyModelInPayload to return an explicit failure RequestAction when any of the
JSON operations fail (instead of policy.UpstreamRequestModifications{}), e.g.,
construct and return a failure response with a 500 status and explanatory
message so the request is aborted/handled upstream; reference the
modifyModelInPayload function and ensure downstream code (e.g., OnResponse logic
that checks MetadataKeySelectedModel) can rely on the modification either
succeeding or the request having been failed.
In @gateway/policies/model-weighted-round-robin/v0.1.0/weightedroundrobin.go:
- Line 12: Fix the typo in the license header of weightedroundrobin.go by
inserting the missing words "express or" so the sentence reads "WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied."; update the
header text in the file's top comment block (same block containing "WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either implied.") to match the correct
phrasing used in the round-robin policy.
🧹 Nitpick comments (10)
gateway/gateway-controller/default-policies/model-round-robin-v0.1.0.yaml (2)
21-24: Consider strengthening the endpoint URL validation pattern.The current pattern
'^https?://'only validates the URL protocol prefix, potentially allowing malformed URLs likehttp://orhttps://invalid. Consider using a more comprehensive URL validation pattern or relying on runtime validation.♻️ Suggested enhancement
endpoint: type: string description: The endpoint URL for the model (optional) - pattern: '^https?://' + pattern: '^https?://[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*'Alternatively, document that full URL validation occurs at runtime.
28-34: Consider if suspendDuration=0 is the right default for production.The default of 0 means failed model invocations won't be tracked or suspended, potentially causing repeated failures to the same model. For production environments, a non-zero default (e.g., 60 or 300 seconds) might provide better resilience.
gateway/gateway-controller/default-policies/model-weighted-round-robin-v0.1.0.yaml (3)
22-25: Consider strengthening the endpoint URL validation pattern.Similar to model-round-robin, the pattern
'^https?://'only validates the URL protocol prefix. Consider using more comprehensive URL validation or document that runtime validation handles malformed URLs.
26-32: Consider allowing weight=0 for temporarily disabling models.The minimum weight of 1 prevents temporarily disabling a model without removing it from the configuration. Setting
minimum: 0would allow operators to set weight to 0 for graceful model removal from the rotation while keeping it in the config for future re-enablement.♻️ Suggested change
weight: type: integer description: | The weight assigned to this model for distribution. Higher weights mean more requests will be routed to this model. Weight is relative to total weight of all models. + A weight of 0 effectively disables the model. - minimum: 1 + minimum: 0
37-43: Consider if suspendDuration=0 is the right default for production.Same concern as in model-round-robin: the default of 0 means failed model invocations won't be suspended, potentially causing repeated failures. Consider a non-zero default for better resilience.
gateway/gateway-controller/pkg/utils/llm_transformer.go (1)
596-606: Add documentation and consider validation for parameter merging.The
mergeParamsfunction mergespathEntry.Params(user-defined policy parameters) withtemplateParams(system-level extraction configuration), with template parameters taking precedence on key conflicts.This merge order is intentional and correct—a parallel
mergeParamsimplementation ingateway/policy-engine/internal/registry/registry.godocuments the same semantics: "Runtime params override init params when keys conflict." Since template parameters define required system behavior (e.g., how to extractrequestModel,responseModel, token counts from responses), they should not be overridable by user-defined policy parameters.In current usage, parameter sets don't overlap: template parameters (
requestModel,responseModel,promptTokens, etc.) are distinct from user policy parameters (embeddingService,models,providers,similarity-threshold, etc.). However, for robustness, consider:
- Adding a comment explaining that template params intentionally override user params
- Adding validation to log or error if parameter name conflicts are detected, preventing silent overwrites
gateway/policies/model-round-robin/v0.1.0/roundrobin.go (1)
404-436: Consider caching compiled regex patterns.The regex pattern is compiled on every request (
regexp.Compile(regexPattern)). Since the pattern comes from policy configuration and doesn't change per request, consider compiling and caching it duringGetPolicyinitialization to avoid repeated compilation overhead.♻️ Suggested approach
Cache the compiled regex in
ModelRoundRobinPolicywhenrequestModel.locationis"pathParam":type ModelRoundRobinPolicy struct { currentIndex int mu sync.Mutex suspendedModels map[string]time.Time params ModelRoundRobinPolicyParams + compiledPathRegex *regexp.Regexp // Cached regex for pathParam extraction }Then compile it once in
GetPolicyand reuse it inextractModelFromPathandmodifyModelInPathParam.gateway/policies/model-weighted-round-robin/v0.1.0/weightedroundrobin.go (3)
107-252: Significant code duplication with round-robin policy.The
parseParamsfunction shares substantial logic withgateway/policies/model-round-robin/v0.1.0/roundrobin.go, including validation ofmodels,suspendDuration, andrequestModel. Consider extracting the common parsing logic into a shared utility package to reduce maintenance burden and ensure consistency.
335-371: Consider removing unusedsequenceMuor documenting its purpose.The
sequenceMuRWMutex is declared but only used withRLock/RUnlockinselectNextAvailableWeightedModel. SinceweightedSequenceis set once duringGetPolicyinitialization and never modified afterward (noLock()calls exist), the mutex appears unnecessary. Either:
- Remove
sequenceMuifweightedSequenceis truly immutable after initialization, or- Document why it exists (e.g., for future dynamic reconfiguration support)
373-401: Consider adding a maximum weight validation or alternative weighting algorithm to prevent excessive sequence memory usage.The current implementation pre-computes a weighted sequence by repeating each model weight times. Without a maximum weight constraint (currently only
weight >= 1is enforced), very large weights could create large sequences in memory. For example, 10 models withweight: 1000each would create a sequence with ~10,000 pointer entries.Consider one of these approaches:
- Add a maximum weight validation in
parseParams(simplest)- Normalize weights by their GCD to reduce sequence size while preserving proportions
- Use Smooth Weighted Round-Robin (SWRR): maintains dynamic current_weight per model; on each selection, increment by static weight, pick max, decrement by total. Produces smooth distribution without pre-computation but adds O(n) per-selection overhead.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
gateway/policies/model-round-robin/v0.1.0/go.sumis excluded by!**/*.sumgateway/policies/model-weighted-round-robin/v0.1.0/go.sumis excluded by!**/*.sum
📒 Files selected for processing (12)
docs/ai-gateway/llm/load-balancing/model-round-robin.mddocs/ai-gateway/llm/load-balancing/model-weighted-round-robin.mdgateway/gateway-controller/default-policies/model-round-robin-v0.1.0.yamlgateway/gateway-controller/default-policies/model-weighted-round-robin-v0.1.0.yamlgateway/gateway-controller/pkg/utils/llm_transformer.gogateway/policies/model-round-robin/v0.1.0/go.modgateway/policies/model-round-robin/v0.1.0/policy-definition.yamlgateway/policies/model-round-robin/v0.1.0/roundrobin.gogateway/policies/model-weighted-round-robin/v0.1.0/go.modgateway/policies/model-weighted-round-robin/v0.1.0/policy-definition.yamlgateway/policies/model-weighted-round-robin/v0.1.0/weightedroundrobin.gogateway/policies/policy-manifest-lock.yaml
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-18T08:49:08.028Z
Learnt from: RakhithaRR
Repo: wso2/api-platform PR: 504
File: gateway/policies/mcp-auth/v0.1.0/policy-definition.yaml:118-191
Timestamp: 2025-12-18T08:49:08.028Z
Learning: Ensure the policy engine resolves configuration paths in a case-insensitive manner when matching keys (e.g., jwtauth_v010 should match JWTAuth_v010 regardless of case). In reviews, look for references to path resolution logic and verify it normalizes keys (e.g., to lower or upper case) before comparison. Add tests to cover mixed-case Helm values and config keys and document this behavior in the policy engine's docs.
Applied to files:
gateway/policies/policy-manifest-lock.yamlgateway/policies/model-round-robin/v0.1.0/policy-definition.yamlgateway/policies/model-weighted-round-robin/v0.1.0/policy-definition.yaml
📚 Learning: 2025-12-19T07:14:39.045Z
Learnt from: nimsara66
Repo: wso2/api-platform PR: 521
File: gateway/gateway-controller/pkg/utils/llm_transformer.go:229-243
Timestamp: 2025-12-19T07:14:39.045Z
Learning: In the gateway LLM configuration flow, validation is performed in pkg/config/llm_validator.go before transformation in pkg/utils/llm_transformer.go, ensuring that required fields like upstream.Auth.Header and upstream.Auth.Value are non-nil before the transformer accesses them.
Applied to files:
gateway/gateway-controller/pkg/utils/llm_transformer.go
📚 Learning: 2025-12-06T11:57:08.456Z
Learnt from: BLasan
Repo: wso2/api-platform PR: 140
File: gateway/gateway-controller/pkg/utils/api_deployment.go:88-105
Timestamp: 2025-12-06T11:57:08.456Z
Learning: In the gateway-controller Go codebase (file: gateway/gateway-controller/pkg/utils/api_deployment.go), only the async/websub API kind is currently supported. async/websocket and async/sse API kinds are not yet implemented.
Applied to files:
gateway/gateway-controller/pkg/utils/llm_transformer.go
🧬 Code graph analysis (1)
gateway/policies/model-weighted-round-robin/v0.1.0/weightedroundrobin.go (4)
gateway/policies/model-round-robin/v0.1.0/roundrobin.go (1)
RequestModelConfig(54-57)sdk/gateway/policy/v1alpha/context.go (2)
RequestContext(56-70)Body(4-16)sdk/gateway/policy/v1alpha/action.go (3)
RequestAction(4-7)ImmediateResponse(32-37)UpstreamRequestModifications(16-24)sdk/utils/jsonpath.go (2)
ExtractStringValueFromJsonpath(14-37)SetValueAtJSONPath(114-212)
🪛 Gitleaks (8.30.0)
docs/ai-gateway/llm/load-balancing/model-round-robin.md
[high] 57-59: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
docs/ai-gateway/llm/load-balancing/model-weighted-round-robin.md
[high] 72-74: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🪛 LanguageTool
docs/ai-gateway/llm/load-balancing/model-round-robin.md
[style] ~46-~46: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ured) and stores it for reference. 3. Model Modification: The policy modifies the...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
docs/ai-gateway/llm/load-balancing/model-weighted-round-robin.md
[grammar] ~32-~32: Ensure spelling is correct
Context: ... to total weight of all models. Must be atleast 1. | | endpoint | string | No | The e...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[style] ~49-~49: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ation and stores it for reference. 4. Model Modification: The policy modifies the...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🔇 Additional comments (32)
gateway/policies/model-weighted-round-robin/v0.1.0/go.mod (1)
1-5: LGTM! Consistent with model-round-robin module.The module declaration is consistent with the model-round-robin policy, using the same Go version and SDK dependency.
gateway/policies/policy-manifest-lock.yaml (1)
84-90: LGTM! Policy manifest entries are correctly formatted.The new policy entries for model-round-robin and model-weighted-round-robin follow the established pattern and are consistent with existing policies in the manifest.
gateway/policies/model-round-robin/v0.1.0/go.mod (1)
3-5: Go version 1.23.0 and SDK v0.3.0 are verified as valid and compatible.Go 1.23.0 was released on August 13, 2024 as a stable version. SDK v0.3.0 is the internal wso2/api-platform/sdk module version used consistently across 20+ policy modules in the repository, with successful dependency resolution confirmed by the go.sum file containing the resolved hash for v0.3.0.
gateway/policies/model-round-robin/v0.1.0/policy-definition.yaml (1)
1-40: LGTM! Well-structured policy definition.The schema validation rules are appropriate:
- Required
modelsarray with at least one entry ensures valid configuration- URL pattern validation for optional
endpointfieldsuspendDurationwith sensible default (0) and minimum constraint- Clear descriptions for all parameters
gateway/policies/model-weighted-round-robin/v0.1.0/policy-definition.yaml (1)
1-49: LGTM! Schema is consistent with round-robin variant.The weighted round-robin schema appropriately extends the base round-robin policy:
- Required
weightfield with minimum value of 1 ensures valid weight distribution- Consistent validation rules across both policy variants
- Clear parameter descriptions
docs/ai-gateway/llm/load-balancing/model-round-robin.md (1)
1-210: LGTM! Comprehensive and well-structured documentation.The documentation thoroughly covers:
- Policy features and capabilities
- Configuration parameters with clear tables
- Multiple practical examples
- Model suspension behavior
- Various use cases
- Request model extraction from different locations
The static analysis hints about the auth token and sentence structure are false positives in this context - the
<openai-apikey>is an appropriate placeholder, and the technical writing style is clear.gateway/gateway-controller/pkg/utils/llm_transformer.go (4)
50-64: LGTM! Proper error handling for template retrieval.The template retrieval logic includes:
- Type assertion validation with both
okand nil checks- Comprehensive error wrapping with context
- Proper propagation of errors from nested calls
209-219: LGTM! Consistent template retrieval pattern.The implementation follows the same pattern as
transformProxy, ensuring consistency across both transform paths.
535-594: LGTM! Clean parameter extraction logic.The
buildTemplateParamsfunction:
- Properly validates nil template input
- Consistently extracts location and identifier for each template parameter
- Gracefully handles optional fields by checking for nil
- Returns empty map when no template parameters are configured, which is appropriate
165-165: LGTM! Consistent parameter merging across all policy attachment points.The merged parameters are consistently applied at all three policy attachment locations:
- Line 165:
transformProxy- Line 373:
transformProvider(AllowAll mode)- Line 471:
transformProvider(DenyAll mode)This ensures that template parameters are available to policies regardless of the access control mode or whether it's a provider or proxy configuration.
Also applies to: 373-373, 471-471
docs/ai-gateway/llm/load-balancing/model-weighted-round-robin.md (1)
1-224: LGTM! Excellent documentation with clear weight distribution explanations.The documentation provides:
- Clear explanation of weighted distribution mechanics
- Concrete examples showing weight calculations (lines 53-63, 206-213)
- Comprehensive parameter tables
- Practical use cases for weighted balancing
- Detailed model suspension behavior
The weight distribution examples particularly help users understand how the policy works.
gateway/policies/model-round-robin/v0.1.0/roundrobin.go (15)
1-65: LGTM!The license header, imports, constants, and type definitions are well-structured. The use of
sync.Mutexfor protecting shared state (currentIndexandsuspendedModels) is appropriate for concurrent access.
67-84: LGTM!The
GetPolicyfactory function correctly parses parameters and initializes the policy with proper default values.
86-214: LGTM!The parameter parsing is thorough with clear error messages. The validation covers required fields, type checks, and value constraints appropriately.
216-231: LGTM!The
extractIntfunction correctly handles the various numeric types that may come from JSON unmarshaling, including the check for fractional parts in float64 values.
233-241: LGTM!The processing mode configuration is appropriate for a policy that needs to read and potentially modify both request headers and body content.
243-272: LGTM!The request handling flow is well-structured: extract original model, select next available model via round-robin, and modify the request. The 503 response when all models are unavailable is appropriate.
274-295: LGTM!The response handling correctly identifies error conditions (5xx and 429) and implements thread-safe model suspension with proper mutex locking.
297-329: LGTM!The round-robin selection with suspension handling is correctly implemented. The lock protects concurrent access, and expired suspensions are properly cleaned up.
331-355: LGTM!Clean dispatch pattern for extracting the model from different request locations.
357-369: LGTM!Proper null checks before attempting JSON path extraction.
371-402: LGTM!The query parameter extraction properly handles URL decoding and query string parsing.
438-456: LGTM!Clean dispatch pattern mirroring the extraction logic.
491-498: LGTM!Simple and correct header modification.
500-549: LGTM!The query parameter modification correctly handles both existing and new query strings, and properly uses the
:pathpseudo-header for Envoy compatibility.
551-616: LGTM!The path parameter modification correctly uses regex capture group indices for precise replacement while preserving the query string. The same regex caching suggestion from the extraction function applies here.
gateway/policies/model-weighted-round-robin/v0.1.0/weightedroundrobin.go (6)
40-75: LGTM!The type definitions appropriately extend the round-robin concept with weight support. Using a separate
RWMutexfor the weighted sequence is a good design choice for reducing lock contention.
77-105: LGTM!The factory function correctly initializes the weighted sequence during policy creation.
254-279: LGTM!The
extractIntandModeimplementations are correct. These are also candidates for the shared utility extraction mentioned earlier.
281-333: LGTM!The request/response handling follows the same pattern as the round-robin policy, correctly adapted for weighted selection.
403-509: LGTM with prior notes.The extraction methods are correct. The same regex caching suggestion from the round-robin review applies to
extractModelFromPathhere. These methods are also candidates for shared utility extraction given the duplication.
511-689: LGTM with prior notes.The modification methods mirror the round-robin implementation. The same observation about silent failure in
modifyModelInPayloadapplies here as well.
| | Property | Type | Required | Description | | ||
| |----------|------|----------|-------------| | ||
| | `model` | string | Yes | The AI model name to use for load balancing. | | ||
| | `weight` | integer | Yes | The weight assigned to this model for distribution. Higher weights mean more requests will be routed to this model. Weight is relative to total weight of all models. Must be atleast 1. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the typo: "atleast" should be "at least".
📝 Proposed fix
-| `weight` | integer | Yes | The weight assigned to this model for distribution. Higher weights mean more requests will be routed to this model. Weight is relative to total weight of all models. Must be atleast 1. |
+| `weight` | integer | Yes | The weight assigned to this model for distribution. Higher weights mean more requests will be routed to this model. Weight is relative to total weight of all models. Must be at least 1. |📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| | `weight` | integer | Yes | The weight assigned to this model for distribution. Higher weights mean more requests will be routed to this model. Weight is relative to total weight of all models. Must be atleast 1. | | |
| | `weight` | integer | Yes | The weight assigned to this model for distribution. Higher weights mean more requests will be routed to this model. Weight is relative to total weight of all models. Must be at least 1. | |
🧰 Tools
🪛 LanguageTool
[grammar] ~32-~32: Ensure spelling is correct
Context: ... to total weight of all models. Must be atleast 1. | | endpoint | string | No | The e...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🤖 Prompt for AI Agents
In @docs/ai-gateway/llm/load-balancing/model-weighted-round-robin.md at line 32,
Fix the typo in the documentation table: replace "atleast 1" with "at least 1"
in the `weight` description (the table cell that reads "Must be atleast 1.").
Update the sentence so it reads "Must be at least 1." ensuring spacing and
capitalization match the surrounding style.
| // modifyModelInPayload modifies the model in request body using JSONPath | ||
| func (p *ModelRoundRobinPolicy) modifyModelInPayload(ctx *policy.RequestContext, newModel string, jsonPath string) policy.RequestAction { | ||
| if ctx.Body == nil || ctx.Body.Content == nil { | ||
| return policy.UpstreamRequestModifications{} | ||
| } | ||
|
|
||
| // Parse request body | ||
| var payloadData map[string]interface{} | ||
| if err := json.Unmarshal(ctx.Body.Content, &payloadData); err != nil { | ||
| slog.Debug("ModelRoundRobin: Error unmarshaling request body", "error", err) | ||
| return policy.UpstreamRequestModifications{} | ||
| } | ||
|
|
||
| // Update model in payload | ||
| if err := utils.SetValueAtJSONPath(payloadData, jsonPath, newModel); err != nil { | ||
| slog.Debug("ModelRoundRobin: Error setting model in request body", "jsonPath", jsonPath, "error", err) | ||
| return policy.UpstreamRequestModifications{} | ||
| } | ||
|
|
||
| // Marshal back to JSON | ||
| updatedPayload, err := json.Marshal(payloadData) | ||
| if err != nil { | ||
| slog.Debug("ModelRoundRobin: Error marshaling updated request body", "error", err) | ||
| return policy.UpstreamRequestModifications{} | ||
| } | ||
|
|
||
| slog.Debug("ModelRoundRobin: Modified request model in payload", "originalModel", ctx.Metadata[MetadataKeyOriginalModel], "newModel", newModel, "jsonPath", jsonPath) | ||
|
|
||
| return policy.UpstreamRequestModifications{ | ||
| Body: updatedPayload, | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent failure may cause unexpected behavior.
When JSON unmarshaling, path setting, or marshaling fails, the function logs at debug level and returns empty modifications. This means the request proceeds with the original model unchanged, but the policy has already recorded a different model in metadata (MetadataKeySelectedModel). This mismatch could cause issues in OnResponse where the wrong model might get suspended.
Consider returning an error response (e.g., 500) when the modification fails to maintain consistency between the selected model and what's actually sent upstream.
🤖 Prompt for AI Agents
In @gateway/policies/model-round-robin/v0.1.0/roundrobin.go around lines 458 -
489, modifyModelInPayload currently swallows JSON unmarshal/set/marshal errors
and returns empty modifications while MetadataKeySelectedModel may already be
set, causing inconsistency; change modifyModelInPayload to return an explicit
failure RequestAction when any of the JSON operations fail (instead of
policy.UpstreamRequestModifications{}), e.g., construct and return a failure
response with a 500 status and explanatory message so the request is
aborted/handled upstream; reference the modifyModelInPayload function and ensure
downstream code (e.g., OnResponse logic that checks MetadataKeySelectedModel)
can rely on the modification either succeeding or the request having been
failed.
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either implied. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in license header.
The license text is missing "express or" in this line. Compare with the round-robin policy's license which correctly states: "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied."
📝 Proposed fix
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either implied.
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @gateway/policies/model-weighted-round-robin/v0.1.0/weightedroundrobin.go at
line 12, Fix the typo in the license header of weightedroundrobin.go by
inserting the missing words "express or" so the sentence reads "WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied."; update the
header text in the file's top comment block (same block containing "WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either implied.") to match the correct
phrasing used in the round-robin policy.
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.