@natedemoss commented Nov 25, 2025

What does this PR do?

Adds a ResponseGuardrailSpec model to src/llama_stack_api/agents.py so that guardrails can be configured with structured options during response generation. Guardrails can be passed either as string IDs or as inline specs through the union ResponseGuardrail = str | ResponseGuardrailSpec. Fields: type, description, enabled, severity, action, policy_id, version, categories, thresholds, max_violations, config, tags, metadata. The model enforces a strict schema (extra='forbid') and provides a normalized() helper that cleans up categories by trimming whitespace and lowercasing them.

Test Plan

### Default construction
```python
g = ResponseGuardrailSpec(type="llama-guard")
assert g.enabled is True
assert g.severity is None
```
### Enum validation (should fail)
```python
from pydantic import ValidationError

# "critical" is not an accepted severity value, so construction must fail.
try:
    ResponseGuardrailSpec(type="x", severity="critical")
    assert False, "Expected ValidationError"
except ValidationError:
    pass
```
### Extra key rejection
```python
# extra='forbid' means unknown keys are rejected.
try:
    ResponseGuardrailSpec(type="x", unknown=1)
    assert False, "Expected ValidationError"
except ValidationError:
    pass
```
### Category normalization
```python
# normalized() trims whitespace and lowercases each category.
g = ResponseGuardrailSpec(type="x", categories=[" Violence ", "Self-Harm"]).normalized()
assert g.categories == ["violence", "self-harm"]
```

### Union usage in API (integration)

  • POST with guardrails: ["llama-guard"] → succeeds.
  • POST with guardrails: [{"type":"llama-guard","severity":"warn","categories":["violence"]}] → succeeds.

### OpenAPI/spec regeneration

  • Run schema generation script.
  • Verify guardrails now shows oneOf (string | object) and object schema lists all new fields.

### Negative thresholds (optional future test if validation added)

  • Add validator; ensure invalid values raise ValidationError.

All of the checks above pass locally when run manually. An automated unit test file can be added in a follow-up PR if one is not already present.

-- Nate DeMoss

Added fields for guardrail configuration including description, enabled, severity, action, policy_id, version, categories, thresholds, max_violations, config, tags, and metadata.

meta-cla bot commented Nov 25, 2025

Hi @natedemoss!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@natedemoss changed the title from "Enhance ResponseGuardrailSpec with additional fields" to "feat: Enhance ResponseGuardrailSpec with additional fields" on Nov 25, 2025

meta-cla bot commented Nov 25, 2025

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

meta-cla bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Nov 25, 2025
@cdoern (Collaborator) left a comment

one question to get started

Fields
------
type: Identifier for the guardrail implementation (e.g. 'llama-guard', 'content-filter').
@cdoern (Collaborator)

these look great, but can I ask where these are coming from?

@natedemoss (Author) commented Nov 26, 2025

Hi, the “type” values come from the backend guardrail config, not from agents.py. They’re IDs the server maps to concrete handlers (e.g., llama-guard, content-filter). Resolution happens during create_openai_response on the backend. If helpful, I can add a doc note in ResponseGuardrailSpec pointing to the registry module or config path.
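To illustrate the kind of lookup meant here, a minimal and entirely hypothetical sketch follows. The registry name, the handler signature, and the placeholder handler are invented for illustration and do not reflect the backend's actual code.

```python
from collections.abc import Callable

# Hypothetical handler signature: takes the text to check and
# returns True when the text passes the guardrail.
GuardrailHandler = Callable[[str], bool]


def allow_all(text: str) -> bool:
    """Placeholder handler; a real one would call the guardrail implementation."""
    return True


# Hypothetical registry keyed by the IDs mentioned above.
GUARDRAIL_REGISTRY: dict[str, GuardrailHandler] = {
    "llama-guard": allow_all,
    "content-filter": allow_all,
}


def resolve_guardrail(type_id: str) -> GuardrailHandler:
    """Map a spec's `type` to its handler, failing loudly on unknown IDs."""
    try:
        return GUARDRAIL_REGISTRY[type_id]
    except KeyError:
        raise ValueError(f"unknown guardrail type: {type_id!r}") from None
```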

@natedemoss (Author)

Also, I can open a new PR with a less specific comment.
