Feat(guardrail): Adding support for custom Ovalix guardrail #21887
shalom-ovalix wants to merge 4 commits into BerriAI:main
Conversation
Greptile Summary

This PR adds a new Ovalix guardrail integration that supports pre-call (user input) and post-call (LLM output) checkpoints via the Ovalix Tracker service, with optional correction, anonymization, and blocking of content.
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/ovalix/ovalix.py | Core guardrail implementation with pre/post-call checkpoint support. The pre-call path rebuilds the texts list from structured_messages rather than operating on the input texts, which may cause length mismatches with the upstream handler's task_mappings for multimodal content. Otherwise well-structured with proper error handling, session management, and blocking logic. |
| litellm/proxy/guardrails/guardrail_hooks/ovalix/`__init__.py` | Standard guardrail registration following existing patterns. Correctly registers initializer and class in both registries, reads config from litellm_params, and adds callback to logging_callback_manager. |
| litellm/types/guardrails.py | Adds OVALIX enum value to SupportedGuardrailIntegrations, imports OvalixGuardrailConfigModel, and adds it to LitellmParams multiple inheritance. Follows the same pattern used by all other guardrail integrations. |
| litellm/types/proxy/guardrails/guardrail_hooks/ovalix.py | Clean Pydantic config model defining Ovalix-specific configuration fields (tracker_api_base, tracker_api_key, application_id, checkpoint IDs). Follows the GuardrailConfigModel base class pattern correctly. |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_ovalix.py | Comprehensive test suite with 20+ tests covering initialization, config validation, checkpoint calls, pre/post-call guardrail behavior, blocking, anonymization, error handling, and edge cases. All tests use mocked HTTP calls (no real network requests). Good coverage of core scenarios. |
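Based on the config model fields listed in the table above (tracker_api_base, tracker_api_key, application_id, checkpoint IDs), a proxy config enabling this guardrail would presumably look something like the sketch below. The guardrail name, mode values, and all field values are illustrative; the exact schema should be confirmed against the PR's config model.

```yaml
# Hypothetical proxy config sketch — values are illustrative, not from the PR
guardrails:
  - guardrail_name: "ovalix-guard"
    litellm_params:
      guardrail: ovalix
      mode: "pre_call"            # pre-call checkpoint on user input
      tracker_api_base: "https://tracker.example.com"
      tracker_api_key: os.environ/OVALIX_TRACKER_API_KEY
      application_id: "my-app"
      pre_checkpoint_id: "pre-123"
      post_checkpoint_id: "post-456"
```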
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Proxy as LiteLLM Proxy
    participant Handler as Guardrail Translation Handler
    participant Ovalix as OvalixGuardrail
    participant Tracker as Ovalix Tracker API
    Note over Client,Tracker: Pre-call (request) flow
    Client->>Proxy: POST /chat/completions
    Proxy->>Handler: process_input_messages(data)
    Handler->>Handler: Extract texts & task_mappings
    Handler->>Ovalix: apply_guardrail(inputs, input_type="request")
    Ovalix->>Ovalix: _generate_post_guardrail_text(messages)
    loop For each user message (reversed)
        Ovalix->>Tracker: POST /checkpoint (pre_checkpoint_id)
        Tracker-->>Ovalix: {action_type, modified_data}
        alt action_type == "block" (last message)
            Ovalix-->>Handler: raise OvalixGuardrailBlockedException
        else action_type == "block" (older message)
            Ovalix->>Ovalix: Replace text with block message
        else allow / anonymize
            Ovalix->>Ovalix: Use modified_data or original
        end
    end
    Ovalix-->>Handler: Updated inputs with texts
    Handler->>Handler: Map texts back to messages
    Handler-->>Proxy: Modified request data
    Proxy->>Proxy: Call LLM
    Note over Client,Tracker: Post-call (response) flow
    Proxy->>Handler: process_output_response(response)
    Handler->>Handler: Extract response texts
    Handler->>Ovalix: apply_guardrail(inputs, input_type="response")
    Ovalix->>Ovalix: _generate_post_guardrail_llm_responses(texts)
    loop For each response text (reversed)
        Ovalix->>Tracker: POST /checkpoint (post_checkpoint_id)
        Tracker-->>Ovalix: {action_type, modified_data}
    end
    Ovalix-->>Handler: Updated texts
    Handler->>Handler: Map texts back to response choices
    Handler-->>Proxy: Modified response
    Proxy-->>Client: Final response
```
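The per-message decision logic in the pre-call loop above can be sketched as follows. This is a minimal illustration of the diagram's branches only — the verdict keys (`action_type`, `modified_data`) come from the diagram, but the function, the exception name, and the block-message text are hypothetical, not the PR's actual implementation.

```python
from typing import Any, Dict, List

BLOCK_MESSAGE = "[Blocked by Ovalix guardrail]"  # placeholder text, not from the PR


class OvalixBlockedError(Exception):
    """Raised when the most recent user message is blocked."""


def apply_checkpoint_results(
    texts: List[str], results: List[Dict[str, Any]]
) -> List[str]:
    """Apply per-message checkpoint verdicts to a list of texts.

    texts: user-message texts, oldest first
    results: one {"action_type", "modified_data"} verdict per text
    """
    out = list(texts)
    last = len(texts) - 1
    # Mirror the diagram's reversed loop: newest message first
    for i in range(last, -1, -1):
        verdict = results[i]
        action = verdict.get("action_type")
        if action == "block":
            if i == last:
                # Blocking the latest message fails the whole request
                raise OvalixBlockedError(texts[i])
            # Older blocked message: replace its text instead
            out[i] = BLOCK_MESSAGE
        elif action == "anonymize":
            # Prefer the tracker's redacted text, fall back to original
            out[i] = verdict.get("modified_data", texts[i])
        # "allow": keep the original text unchanged
    return out
```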
Last reviewed commit: fff1bb3
```python
messages = inputs.get("structured_messages") or []
if not messages:
    return inputs

if self._pre_checkpoint_id:
    post_guardrail_texts = await self._generate_post_guardrail_text(
        messages, actor, session_id
    )
    return {**inputs, "texts": post_guardrail_texts}
```
Pre-call texts list may not match upstream task mappings
The upstream _apply_guardrail_responses_to_input_texts in handler.py expects the returned texts list to have a 1:1 correspondence with text_task_mappings, which tracks (message_index, content_index) for each extracted text item. This code ignores the input texts and instead builds a new list from structured_messages — one entry per message.
This works correctly for the common case (all messages have simple string content), but breaks when a message has list content (multimodal). For example, a message with content = [{"type": "text", "text": "a"}, {"type": "text", "text": "b"}] produces 2 entries in the upstream's text_task_mappings, but _generate_post_guardrail_text inserts the raw list as a single entry. The resulting length mismatch means the upstream handler silently skips applying guardrail corrections to some messages.
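The mismatch described above can be made concrete with a small sketch. Both helpers below are illustrative mimics (not the actual handler or guardrail code): one extracts texts the way the upstream handler does — one entry per text part — while the other rebuilds per message, which diverges as soon as a message carries list content.

```python
from typing import Any, Dict, List


def extract_texts(messages: List[Dict[str, Any]]) -> List[Any]:
    """Mimics the upstream handler: one entry per string / text part."""
    texts: List[Any] = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, str):
            texts.append(content)
        elif isinstance(content, list):
            # Multimodal content: each text part gets its own entry
            for part in content:
                if part.get("type") == "text":
                    texts.append(part["text"])
    return texts


def rebuild_per_message(messages: List[Dict[str, Any]]) -> List[Any]:
    """Mimics the problematic path: one entry per message, content as-is."""
    return [m.get("content") for m in messages]


# A single multimodal message with two text parts
msgs = [{"role": "user",
         "content": [{"type": "text", "text": "a"},
                     {"type": "text", "text": "b"}]}]
# extract_texts yields 2 entries; rebuild_per_message yields 1,
# so index-mapped corrections would be silently skipped.
```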
Consider operating on inputs.get("texts", []) directly (like the response path does) instead of deriving texts from structured_messages, so the returned list always matches the upstream mapping. Alternatively, if working with structured_messages is intentional (e.g., to inspect roles), ensure the returned texts list maintains the same length and order as the input texts.
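The suggested fix can be sketched as a length-preserving pass over the extracted texts. This is only an illustration of the shape the reviewer recommends — `guard_texts` and its `check` callback are hypothetical names, with `check` standing in for a call to the Ovalix tracker that returns a verdict dict.

```python
from typing import Any, Callable, Dict, List


def guard_texts(
    texts: List[str], check: Callable[[str], Dict[str, Any]]
) -> List[str]:
    """Run a guardrail check on every extracted text.

    Output is the same length and order as the input, so the upstream
    handler's text_task_mappings still line up 1:1 with the result.
    """
    out: List[str] = []
    for t in texts:
        verdict = check(t)  # stand-in for a tracker checkpoint call
        # Use the corrected text if provided, else keep the original
        out.append(verdict.get("modified_data", t))
    return out
```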
| """ | ||
| Generate post-guardrail text for the given messages. | ||
|
|
||
| Args: | ||
| messages: List of messages | ||
| actor: Actor | ||
| session_id: Session ID | ||
| request_data: Request data | ||
|
|
||
| Returns: | ||
| List of post-guardrail texts | ||
| """ |
Stale docstring parameter
The docstring references request_data: Request data as a parameter, but it was removed from the method signature.
| """ | |
| Generate post-guardrail text for the given messages. | |
| Args: | |
| messages: List of messages | |
| actor: Actor | |
| session_id: Session ID | |
| request_data: Request data | |
| Returns: | |
| List of post-guardrail texts | |
| """ | |
| ) -> List[Any]: | |
| """ | |
| Generate post-guardrail text for the given messages. | |
| Args: | |
| messages: List of messages | |
| actor: Actor | |
| session_id: Session ID | |
| Returns: | |
| List of post-guardrail texts | |
| """ |
Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR:

- I have added testing in the `tests/litellm/` directory — adding at least 1 test is a hard requirement (see details)
- My PR passes all unit tests on `make test-unit`
- I have run a review with `@greptileai` and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes