Collaborator

@rkritika1508 rkritika1508 commented Jan 21, 2026

Summary

Target issue is #21.
Explain the motivation for making this change. What existing problem does the pull request solve?
This pull request does two things:

  1. Creates a "validate" endpoint. We have removed the input and output guardrail APIs since they performed the same function. In their place, a single validate API takes the same parameters and can be applied to both input and output guardrails.
  2. Implements the "rephrase" on_fail action. If any validator fails for a given message and its on_fail action is set to rephrase, the response nudges the user / LLM to rephrase the statement without the harmful content.
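As a rough illustration of the single request shape, here is a minimal sketch of a payload builder. The helper name build_validate_payload is hypothetical and not part of this PR; only the field names (request_id, input, validators, on_fail) are taken from the examples in this description.

```python
import json
import uuid
from typing import Any, Optional


def build_validate_payload(
    text: str,
    validators: list[dict[str, Any]],
    request_id: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body for the validate endpoint (hypothetical helper).

    The same body shape is used whether `text` is a user prompt
    (input guardrail) or an LLM completion (output guardrail).
    """
    return {
        # Generate a request id when the caller does not supply one.
        "request_id": request_id or str(uuid.uuid4()),
        "input": text,
        "validators": validators,
    }


payload = build_validate_payload(
    "My PAN is BQTPM7421K",
    [
        {"type": "ban_list", "banned_words": ["sonography"]},
        {"type": "pii_remover", "on_fail": "rephrase"},
    ],
    request_id="3f6a9d2e-8c47-4b8a-9f3c-1d2a6e7f4c91",
)
print(json.dumps(payload, indent=2))
```

Because input and output checks share this body, a caller only needs to change the text it passes in, not the request format.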
  1. When on_fail action is "fix":
{
    "request_id": "3f6a9d2e-8c47-4b8a-9f3c-1d2a6e7f4c91",
    "input": "Rahul Mehta recently moved to Bengaluru, Karnataka, and can be contacted at +91-98765-43210 or via email at [email protected]. His Aadhaar number is 470821987760, PAN is BQTPM7421K, voter ID is DL/05/123/456789, and passport number is K8239471. He owns a car registered as KA03MN4587 and runs a small business registered under GSTIN 29BQTPM7421K1Z5. Rahul recently paid a hospital bill using his credit card 4539 1488 0343 6467 and consulted a doctor holding medical license MH/MC/2021/778899. During an online consultation from IP address 192.168.1.42, he accessed the clinic’s website at https://www.healthcare-consult.in. For an international transfer, he shared his IBAN DE89 3704 0044 0532 0130 00 and later uploaded a notarized document with registration number NRP-IND-2024-556782.",
    "validators": [
        {
            "type": "uli_slur_match",
            "severity": "all"
        },
        {
            "type": "ban_list",
            "banned_words": [
                "sonography"
            ]
        },
        {
            "type": "pii_remover"
        }
    ]
}

Response - 
{
  "success": true,
  "data": {
    "response_id": "cedd98e0-79cb-469b-ab3a-ed5d415c5008",
    "rephrase_needed": false,
    "safe_text": "<PERSON> recently moved to <LOCATION>, <LOCATION>, and can be contacted at <PHONE_NUMBER> or via email at <EMAIL_ADDRESS>. His Aadhaar number is <IN_AADHAAR>, PAN is <IN_PAN>, voter ID is DL/<PHONE_NUMBER>, and passport number is <IN_PASSPORT>. He owns a car <IN_PAN> as <IN_VEHICLE_REGISTRATION> and runs a small business <IN_PAN> under GSTIN 29BQTPM7421K1Z5. <PERSON> recently paid a hospital bill using his credit card <CREDIT_CARD> and consulted a doctor holding medical license MH/MC/<PHONE_NUMBER>. During an online consultation from IP address <IP_ADDRESS>, he accessed the clinic’s website at <URL>. For an international transfer, he shared his IBAN <IBAN_CODE> and later uploaded a notarized document with registration number NRP<IN_PAN><PHONE_NUMBER>."
  }
}
  2. When on_fail action is "exception":
{
    "request_id": "3f6a9d2e-8c47-4b8a-9f3c-1d2a6e7f4c91",
    "input": "Rahul Mehta recently moved to Bengaluru, Karnataka, and can be contacted at +91-98765-43210 or via email at [email protected]. His Aadhaar number is 470821987760, PAN is BQTPM7421K, voter ID is DL/05/123/456789, and passport number is K8239471. He owns a car registered as KA03MN4587 and runs a small business registered under GSTIN 29BQTPM7421K1Z5. Rahul recently paid a hospital bill using his credit card 4539 1488 0343 6467 and consulted a doctor holding medical license MH/MC/2021/778899. During an online consultation from IP address 192.168.1.42, he accessed the clinic’s website at https://www.healthcare-consult.in. For an international transfer, he shared his IBAN DE89 3704 0044 0532 0130 00 and later uploaded a notarized document with registration number NRP-IND-2024-556782.",
    "validators": [
        {
            "type": "uli_slur_match",
            "severity": "all"
        },
        {
            "type": "ban_list",
            "banned_words": [
                "sonography"
            ]
        },
        {
            "type": "pii_remover",
            "on_fail": "exception"
        }
    ]
}

Response - 
{
  "success": false,
  "data": {
    "response_id": "07448988-cd89-4187-b7d3-a5f45191c605",
    "rephrase_needed": false
  },
  "error": "Validation failed for field with errors: PII detected in the text."
}
  3. When on_fail action is "rephrase":
{
    "request_id": "3f6a9d2e-8c47-4b8a-9f3c-1d2a6e7f4c91",
    "input": "Rahul Mehta recently moved to Bengaluru, Karnataka, and can be contacted at +91-98765-43210 or via email at [email protected]. His Aadhaar number is 470821987760, PAN is BQTPM7421K, voter ID is DL/05/123/456789, and passport number is K8239471. He owns a car registered as KA03MN4587 and runs a small business registered under GSTIN 29BQTPM7421K1Z5. Rahul recently paid a hospital bill using his credit card 4539 1488 0343 6467 and consulted a doctor holding medical license MH/MC/2021/778899. During an online consultation from IP address 192.168.1.42, he accessed the clinic’s website at https://www.healthcare-consult.in. For an international transfer, he shared his IBAN DE89 3704 0044 0532 0130 00 and later uploaded a notarized document with registration number NRP-IND-2024-556782.",
    "validators": [
        {
            "type": "uli_slur_match",
            "severity": "all"
        },
        {
            "type": "ban_list",
            "banned_words": [
                "sonography"
            ]
        },
        {
            "type": "pii_remover",
            "on_fail": "rephrase"
        }
    ]
}

Response - 
{
  "success": true,
  "data": {
    "response_id": "60de9ca4-d486-4b62-afae-ad6dde39d605",
    "rephrase_needed": true,
    "safe_text": "Please rephrase the query without unsafe content. PII detected in the text."
  }
}
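
For callers, the three responses above differ only in success, rephrase_needed, and whether safe_text or error is present. The sketch below shows one way a client might branch on them. The function name interpret_validate_response and the outcome labels are hypothetical, and treating a response with no failures like the "fix" case is an assumption, not something this PR states.

```python
from typing import Any


def interpret_validate_response(body: dict[str, Any]) -> tuple[str, str]:
    """Map a validate response to (outcome, text).

    outcome is one of "blocked", "rephrase", or "ok" (hypothetical labels).
    """
    data = body.get("data") or {}
    if not body.get("success", False):
        # on_fail="exception": there is no safe_text, only an error message.
        return "blocked", body.get("error", "")
    if data.get("rephrase_needed", False):
        # on_fail="rephrase": safe_text is a prompt for the sender,
        # not a sanitized version of the input.
        return "rephrase", data.get("safe_text", "")
    # on_fail="fix" (assumed to also cover "nothing failed"):
    # safe_text is the sanitized text and can be forwarded.
    return "ok", data.get("safe_text", "")


fix_body = {
    "success": True,
    "data": {"response_id": "a", "rephrase_needed": False, "safe_text": "<PERSON> moved."},
}
exception_body = {
    "success": False,
    "data": {"response_id": "b", "rephrase_needed": False},
    "error": "Validation failed for field with errors: PII detected in the text.",
}
rephrase_body = {
    "success": True,
    "data": {
        "response_id": "c",
        "rephrase_needed": True,
        "safe_text": "Please rephrase the query without unsafe content. PII detected in the text.",
    },
}

for body in (fix_body, exception_body, rephrase_body):
    print(interpret_validate_response(body))
```

Checking success before rephrase_needed matters here, since the "exception" response also carries rephrase_needed set to false.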

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, it is tested and has test cases.

Notes

Please add here if any other information is required for the reviewer.

Summary by CodeRabbit

  • New Features

    • Added a user-facing "rephrase" action to return a safer rewording prompt.
  • API Changes

    • Main guardrails endpoint standardized to /validate/ and responses now include response_id, rephrase_needed, and safe_text; validator listing clarified.
  • Behavior & Config

    • New on-fail option introduced with a simplified default and a rephrase prompt text.
  • Tests

    • Updated and added tests to cover the rephrase action and new response shape.


@coderabbitai

This comment was marked as resolved.

@rkritika1508 rkritika1508 self-assigned this Jan 21, 2026
@rkritika1508 rkritika1508 linked an issue Jan 22, 2026 that may be closed by this pull request
@rkritika1508 rkritika1508 marked this pull request as ready for review January 22, 2026 21:18

@rkritika1508 rkritika1508 force-pushed the feat/rephrase-action branch 2 times, most recently from 4e65944 to 65a5c23 Compare January 27, 2026 17:28
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/app/models/base_validator_config.py`:
- Around line 20-21: The resolve_on_fail method currently indexes _ON_FAIL_MAP
with self.on_fail causing a KeyError; change it to explicitly validate
self.on_fail against _ON_FAIL_MAP (e.g., if self.on_fail not in _ON_FAIL_MAP)
and raise a clear ValueError or custom exception that includes the invalid value
and the list of allowed keys, then return _ON_FAIL_MAP[self.on_fail] otherwise;
reference resolve_on_fail, _ON_FAIL_MAP and self.on_fail when making the change.
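
A minimal sketch of the suggested validation follows. It assumes the config is a simple class with an on_fail string; the dataclass here is a stand-in for the real model, the map values are hypothetical placeholders, and the "fix" default is an assumption. None of the actual contents of base_validator_config.py appear in this conversation.

```python
from dataclasses import dataclass

# Hypothetical stand-in: the keys come from the PR description, but the
# real map's values are not shown in the review, so placeholders are used.
_ON_FAIL_MAP = {
    "fix": "FIX_HANDLER",
    "exception": "EXCEPTION_HANDLER",
    "rephrase": "REPHRASE_HANDLER",
}


@dataclass
class BaseValidatorConfig:
    on_fail: str = "fix"  # default value is an assumption

    def resolve_on_fail(self) -> str:
        # Validate explicitly instead of letting the lookup raise KeyError.
        if self.on_fail not in _ON_FAIL_MAP:
            allowed = ", ".join(sorted(_ON_FAIL_MAP))
            raise ValueError(
                f"Invalid on_fail value {self.on_fail!r}; allowed values: {allowed}"
            )
        return _ON_FAIL_MAP[self.on_fail]


print(BaseValidatorConfig(on_fail="rephrase").resolve_on_fail())
try:
    BaseValidatorConfig(on_fail="noop").resolve_on_fail()
except ValueError as exc:
    print(exc)
```

Listing the allowed keys in the message lets an API caller correct a mistyped on_fail value without reading the source.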

@rkritika1508 rkritika1508 changed the title Add "rephrase" on_fail action Updated endpoint and added "rephrase" on_fail action Jan 28, 2026
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@backend/app/api/routes/guardrails.py`:
- Around line 117-126: The returned safe_text currently includes the internal
REPHRASE_ON_FAIL_PREFIX marker; before constructing GuardrailResponse in this
block (variables validated_output, rephrase_needed, REPHRASE_ON_FAIL_PREFIX,
response_model), strip that prefix when rephrase_needed is true (e.g., remove
the prefix and any leading whitespace) so GuardrailResponse.safe_text contains
the clean prompt; then set rephrase_needed unchanged and pass the stripped value
into GuardrailResponse.
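
The stripping step could look roughly like the sketch below. The marker value and the helper name clean_safe_text are hypothetical, and detecting rephrase_needed from the prefix is an assumption about how the route computes it; the real constant and route code are not shown in this conversation.

```python
# Hypothetical marker value; the real constant is not shown in the review.
REPHRASE_ON_FAIL_PREFIX = "[[REPHRASE]]"


def clean_safe_text(validated_output: str) -> tuple[str, bool]:
    """Return (safe_text, rephrase_needed) with the internal marker removed."""
    # Assumption: the route decides rephrase_needed by checking for the marker.
    rephrase_needed = validated_output.startswith(REPHRASE_ON_FAIL_PREFIX)
    if rephrase_needed:
        # Drop the marker and any whitespace that followed it.
        validated_output = validated_output[len(REPHRASE_ON_FAIL_PREFIX):].lstrip()
    return validated_output, rephrase_needed


print(clean_safe_text("[[REPHRASE]] Please rephrase the query without unsafe content."))
print(clean_safe_text("<PERSON> recently moved to <LOCATION>."))
```

Text without the marker passes through unchanged, so the same helper is safe to call on "fix" results.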
🧹 Nitpick comments (1)
backend/app/api/routes/guardrails.py (1)

46-68: Make /validators/ success response shape consistent with failures.

Failures return APIResponse.failure_response(...), but success returns a raw dict. Consider wrapping success in APIResponse.success_response(...) (and optionally adding a response_model) to keep the contract consistent.

💡 Example adjustment
-    return {"validators": validators}
+    return APIResponse.success_response(data={"validators": validators})

@nishika26 nishika26 merged commit d5ef86d into main Jan 28, 2026
1 check passed
@nishika26 nishika26 deleted the feat/rephrase-action branch January 28, 2026 10:37
Development

Successfully merging this pull request may close these issues.

  • Remove "input" and "output" endpoint and create just one endpoint for running guardrails
  • Add "rephrase" on_fail action

4 participants