
Add PromptGuard guardrail integration #24268

Open

acebot712 wants to merge 6 commits into BerriAI:main from acebot712:add-promptguard-guardrail

Conversation

acebot712 commented Mar 21, 2026

Closes #24272

Summary

Add PromptGuard as a first-class guardrail vendor in LiteLLM's proxy, appearing alongside existing partners in the Guardrail Garden UI.

PromptGuard is an AI security gateway that provides:

  • Prompt injection detection with 94.9% F1 score (100% precision, 90.4% recall on 5,384 test cases)
  • PII detection & redaction with configurable entity types
  • Topic filtering and entity blocklists
  • Hallucination detection
  • Self-hostable with drop-in proxy integration

What's included

Backend (Python):

  • PROMPTGUARD added to SupportedGuardrailIntegrations enum
  • PromptGuardGuardrailCustomGuardrail subclass implementing apply_guardrail via POST /api/v1/guard
    • decision: "allow" → pass through unchanged
    • decision: "block" → raise GuardrailRaisedException with threat details
    • decision: "redact" → return modified inputs with redacted content (updates both texts and structured_messages)
  • Configurable block_on_error (fail-closed by default, fail-open optional)
  • Explicit supported_event_hooks declaration (pre_call, post_call)
  • Image passthrough via GenericGuardrailAPIInputs.images
  • Pydantic config model with api_key, api_base, block_on_error, ui_friendly_name()
  • Auto-discovered via guardrail_hooks/promptguard/__init__.py registries (zero manual wiring)
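The allow/block/redact dispatch described above can be sketched as follows. This is a minimal illustration with hypothetical names (`handle_guard_response` is not the PR's actual function), assuming a response shape like `{"decision": ..., "threat_type": ..., "redacted_messages": ...}`:

```python
from typing import Any, Dict


class GuardrailRaisedException(Exception):
    """Stand-in for LiteLLM's guardrail exception type."""


def handle_guard_response(
    response: Dict[str, Any], inputs: Dict[str, Any]
) -> Dict[str, Any]:
    decision = response.get("decision")
    if decision == "allow":
        # Pass the request through unchanged.
        return inputs
    if decision == "block":
        # Surface the threat details to the caller.
        raise GuardrailRaisedException(
            f"Blocked by PromptGuard: {response.get('threat_type')}"
        )
    if decision == "redact":
        # Replace messages with the redacted copies from the API.
        redacted = dict(inputs)
        redacted["structured_messages"] = response.get("redacted_messages", [])
        return redacted
    raise ValueError(f"Unknown decision: {decision}")
```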

Frontend (TypeScript):

  • Partner card in Guardrail Garden with eval scores
  • Preset configuration for quick setup
  • Logo in guardrailLogoMap

Documentation:

  • Full docs page at docs/proxy/guardrails/promptguard.md
  • Added to sidebar navigation

Tests:

  • 40 unit tests across 8 test classes covering configuration, allow/block/redact decisions, fail-open resilience, image passthrough, request payload construction, error handling, config model, and registry wiring
  • All tests use mocked HTTP responses (no real API calls)

Architecture

App → LiteLLM Proxy → PromptGuard API (/api/v1/guard) → decision: allow/block/redact
                    → LLM Provider (if allowed/redacted)
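For context, wiring this flow up on the proxy side would use LiteLLM's standard guardrails config block. A hedged sketch — the host and guardrail name are placeholders; consult the PR's docs page for the exact schema:

```yaml
guardrails:
  - guardrail_name: "promptguard-guard"        # placeholder name
    litellm_params:
      guardrail: promptguard                   # matches the new enum value
      mode: "pre_call"                         # post_call also supported per supported_event_hooks
      api_key: os.environ/PROMPTGUARD_API_KEY
      api_base: "https://your-promptguard-host"  # placeholder
      block_on_error: true                     # fail-closed default
```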

Files changed

File — Change
litellm/types/guardrails.py Modified — enum entry
litellm/types/proxy/guardrails/guardrail_hooks/promptguard.py New — config model
litellm/proxy/guardrails/guardrail_hooks/promptguard/promptguard.py New — guardrail hook
litellm/proxy/guardrails/guardrail_hooks/promptguard/__init__.py New — registry
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_promptguard.py New — 40 tests
ui/litellm-dashboard/public/assets/logos/promptguard.svg New — logo
ui/litellm-dashboard/src/components/guardrails/guardrail_garden_data.ts Modified — partner card
ui/litellm-dashboard/src/components/guardrails/guardrail_garden_configs.ts Modified — preset
ui/litellm-dashboard/src/components/guardrails/guardrail_info_helpers.tsx Modified — logo map
docs/my-website/docs/proxy/guardrails/promptguard.md New — documentation
docs/my-website/sidebars.js Modified — sidebar entry

Test plan

  • poetry run black --check passes on all new/modified Python files
  • poetry run ruff check passes on all new/modified Python files
  • poetry run mypy --ignore-missing-imports passes (0 issues)
  • check-circular-imports passes
  • check-import-safety passes (from litellm import * succeeds)
  • 40/40 unit tests pass (pytest tests/test_litellm/proxy/guardrails/guardrail_hooks/test_promptguard.py)
  • npm run build succeeds (UI compiles with no errors)
  • npm run test passes (373 test files, 3626 tests)
  • CLA signed

Add PromptGuard as a first-class guardrail vendor in LiteLLM's proxy,
supporting prompt injection detection, PII redaction, topic filtering,
entity blocklists, and hallucination detection via PromptGuard's
/api/v1/guard API endpoint.

Backend:
- Add PROMPTGUARD to SupportedGuardrailIntegrations enum
- Implement PromptGuardGuardrail (CustomGuardrail subclass) with
  apply_guardrail handling allow/block/redact decisions
- Add Pydantic config model with api_key, api_base, ui_friendly_name
- Auto-discovered via guardrail_hooks/promptguard/__init__.py registries

Frontend:
- Add PromptGuard partner card to Guardrail Garden with eval scores
- Add preset configuration for quick setup
- Add logo to guardrailLogoMap

Tests:
- 30 unit tests covering configuration, allow/block/redact actions,
  request payload construction, error handling, config model, and
  registry wiring
vercel bot commented Mar 21, 2026

The latest updates on your projects.

Project litellm — Deployment: Ready — Updated Mar 21, 2026 6:03am (UTC)


acebot712 (Author):
@greptileai review

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing acebot712:add-promptguard-guardrail (bfe69af) with main (d8e4fc4)


greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR adds PromptGuard as a new first-class guardrail vendor in LiteLLM, following the established pattern used by the ~25 other guardrail integrations in the codebase. The implementation covers the full stack: Python CustomGuardrail subclass, Pydantic config model, auto-discovery registry, UI Guardrail Garden card with eval scores, and 40 mocked unit tests.

Previously flagged issues (HTTPX client allocation before credential validation, dict[str, Any] Python 3.8 incompatibility, missing request timeout, unhandled JSONDecodeError, and the redact path injecting a texts key when none was present) have all been addressed in follow-up commits.

Key changes:

  • SupportedGuardrailIntegrations.PROMPTGUARD enum entry and PromptGuardConfigModel added to the LitellmParams union
  • PromptGuardGuardrail.apply_guardrail correctly handles allow (pass-through), block (raise GuardrailRaisedException), and redact (update structured_messages and/or texts — guarded to avoid creating a texts key that didn't exist before)
  • block_on_error defaults to True (fail-closed) and can be overridden via the PROMPTGUARD_BLOCK_ON_ERROR env var
  • 10-second request timeout on the outbound POST /api/v1/guard call
  • All 40 tests use mocked HTTP responses with no real network calls, satisfying the CI/CD constraint
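The guarded redact update called out above (never creating a `texts` key the caller didn't pass) can be sketched as follows — illustrative names, not the PR's exact code:

```python
from typing import Any, Dict, List, Optional


def apply_redactions(
    inputs: Dict[str, Any],
    redacted_messages: Optional[List[Dict[str, Any]]] = None,
    redacted_texts: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Apply PromptGuard redactions without inventing keys the caller never sent."""
    updated = dict(inputs)
    if redacted_messages is not None and "structured_messages" in updated:
        updated["structured_messages"] = redacted_messages
    # Guarded update: only touch "texts" when the key was originally present,
    # so a messages-only request never gains a phantom "texts" key.
    if redacted_texts is not None and "texts" in updated:
        updated["texts"] = redacted_texts
    return updated
```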

Confidence Score: 4/5

  • This PR is safe to merge — it adds a self-contained new integration with no changes to shared code paths beyond enum and type registration.
  • All previously identified issues have been addressed in follow-up commits. The integration follows established patterns, tests are comprehensive and fully mocked, and the change surface is well-isolated. One point withheld because self.api_key retains an Optional[str] type annotation after the non-None guard in __init__, meaning downstream use in the HTTP header dict is technically unsound to the type system even though it's safe at runtime.
  • No files require special attention — the implementation is straightforward and well-tested.

Important Files Changed

Filename Overview
litellm/proxy/guardrails/guardrail_hooks/promptguard/promptguard.py Core guardrail hook implementing allow/block/redact decision handling, credential validation, HTTP request construction, and fail-open/fail-closed error handling. Previously flagged issues (resource allocation order, Python 3.8 typing, timeout, JSON error handling, redact path mutation) have all been addressed.
litellm/proxy/guardrails/guardrail_hooks/promptguard/__init__.py Registry wiring for the new guardrail — exposes guardrail_initializer_registry and guardrail_class_registry using the SupportedGuardrailIntegrations.PROMPTGUARD enum value. Consistent with all other hook modules in the same directory.
litellm/types/proxy/guardrails/guardrail_hooks/promptguard.py Pydantic config model with api_key, api_base, and block_on_error fields, correctly included in LitellmParams via multiple inheritance. ui_friendly_name() returns "PromptGuard" which feeds the logo lookup chain in the UI.
litellm/types/guardrails.py Adds PROMPTGUARD = "promptguard" to SupportedGuardrailIntegrations enum and imports PromptGuardConfigModel into LitellmParams. Change is minimal and follows existing pattern.
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_promptguard.py 40 unit tests across 8 classes. All HTTP calls are mocked with patch.object. Tests cover configuration, allow/block/redact decisions, error handling (fail-open/fail-closed), payload construction, image passthrough, and registry wiring.
ui/litellm-dashboard/src/components/guardrails/guardrail_garden_configs.ts Adds a promptguard preset with provider: "Promptguard". This matches the key generated by populateGuardrailProviderMap for a snake_case key of "promptguard" (capitalizes first word → "Promptguard").
ui/litellm-dashboard/src/components/guardrails/guardrail_info_helpers.tsx Adds PromptGuard key to guardrailLogoMap pointing to the new SVG. The lookup chain (guardrail_provider_map → DynamicGuardrailProviders → guardrailLogoMap) resolves correctly because ui_friendly_name() returns "PromptGuard" which matches the unquoted object key.

Sequence Diagram

sequenceDiagram
    participant Client
    participant LiteLLMProxy as LiteLLM Proxy
    participant PG as PromptGuardGuardrail
    participant PGAPI as PromptGuard API
    participant LLM as LLM Provider

    Client->>LiteLLMProxy: POST /v1/chat/completions
    LiteLLMProxy->>PG: apply_guardrail(inputs, "request")
    PG->>PG: Build payload (messages, direction="input", model?, images?)
    PG->>PGAPI: POST /api/v1/guard (X-API-Key, timeout=10s)
    alt API error
        PGAPI-->>PG: Exception
        alt block_on_error=True
            PG-->>LiteLLMProxy: raise exception
            LiteLLMProxy-->>Client: 500 error
        else block_on_error=False
            PG-->>LiteLLMProxy: return inputs unchanged
        end
    else decision = "allow"
        PGAPI-->>PG: {"decision": "allow"}
        PG-->>LiteLLMProxy: return inputs unchanged
        LiteLLMProxy->>LLM: Forward request
        LLM-->>LiteLLMProxy: LLM response
        LiteLLMProxy->>PG: apply_guardrail(inputs, "response")
        PG->>PGAPI: POST /api/v1/guard (direction="output")
        PGAPI-->>PG: {"decision": "allow"}
        PG-->>LiteLLMProxy: return inputs unchanged
        LiteLLMProxy-->>Client: 200 response
    else decision = "block"
        PGAPI-->>PG: {"decision": "block", "threat_type": "...", "confidence": ...}
        PG-->>LiteLLMProxy: raise GuardrailRaisedException
        LiteLLMProxy-->>Client: 400 Blocked by PromptGuard
    else decision = "redact"
        PGAPI-->>PG: {"decision": "redact", "redacted_messages": [...]}
        PG->>PG: Update structured_messages and/or texts in inputs
        PG-->>LiteLLMProxy: return modified inputs
        LiteLLMProxy->>LLM: Forward redacted request
        LLM-->>LiteLLMProxy: LLM response
        LiteLLMProxy-->>Client: 200 response
    end

Last reviewed commit: "Fix CI lint: black f..."

- P1: Update structured_messages (not just texts) when PromptGuard
  returns a redact decision, so PII redaction is effective for the
  primary LLM message path
- P2: Validate credentials before allocating the HTTPX client so
  resources aren't acquired if PromptGuardMissingCredentials is raised
- Add tests for structured_messages redaction and texts-only redaction
- Add block_on_error config (default fail-closed, configurable fail-open)
- Declare supported_event_hooks (pre_call, post_call) like other vendors
- Forward images from GenericGuardrailAPIInputs to PromptGuard API
- Wrap API call in try/except for resilient error handling
- Add comprehensive documentation page with config examples
- Register docs page in sidebar alongside other guardrail providers
- Expand test suite from 32 to 40 tests covering new functionality
- Add explicit 10s timeout to async_handler.post() to prevent
  indefinite hangs when PromptGuard API is unresponsive
- Guard redact path: only update inputs["texts"] when the key
  was originally present, avoiding phantom key injection
- Add test: redact with structured_messages only does not create
  texts key (41 tests total)
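The timeout plus fail-open/fail-closed behavior in the bullets above can be sketched as follows. The transport is injected as a callable so the sketch runs offline; the real code makes an HTTP POST with an explicit 10s timeout — names here are illustrative:

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


async def call_guard_api(
    post: Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]],
    payload: Dict[str, Any],
    *,
    block_on_error: bool = True,
) -> Dict[str, Any]:
    """Call the guard endpoint with a timeout; fail closed by default."""
    try:
        # Bounded wait prevents indefinite hangs when the API is unresponsive.
        return await asyncio.wait_for(post(payload), timeout=10.0)
    except Exception:
        if block_on_error:
            raise  # fail closed: surface the error, block the request
        return {"decision": "allow"}  # fail open: pass the request through
```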
acebot712 commented Mar 21, 2026

All @greptileai review comments have been addressed.

@ishaan-jaff - would appreciate your review when you get a chance. This adds PromptGuard as a first-class guardrail provider (backend hook, UI garden card, docs page, 41 tests). Happy to address any further feedback.

…arams

- Reformat promptguard.py to match CI black version (parenthesization)
- Add PromptGuardConfigModel as base class of LitellmParams for proper
  Pydantic schema validation, consistent with all other guardrail vendors
- Use litellm_params.block_on_error directly (now a typed field)


Development

Successfully merging this pull request may close these issues.

[Feature]: Add PromptGuard as a first-class guardrail provider
