Description
Is your feature request related to a problem? Please describe.
Currently, converters like CharSwapConverter, or other prompt perturbation tools in PyRIT, are only usable before prompts are sent to the model. However, when using the OpenAI Responses API with function/tool calling enabled, there is no way for the model to dynamically request text transformations like perturbations, mutations, or obfuscation during a conversation.
This limits experimentation with model-driven robustness, self-evaluation, and adversarial reasoning patterns, especially in multi-agent scenarios where tool flexibility is key.
Describe the solution you'd like
I propose exposing converters (such as CharSwapConverter) as function-callable tools, so they can be registered in the custom_functions registry of OpenAIResponseTarget.
This would allow the model (or one of the agents) to call a converter dynamically via tool calling, enabling dynamic, contextual adversarial attacks or robustness tests during the conversation.
For example:
converter_tool = {
    "type": "function",
    "name": "apply_char_swap",
    "description": "Apply character swaps to words for robustness testing",
    "parameters": {
        "type": "object",
        "properties": {
            "text": {"type": "string", "description": "Input text to perturb"},
        },
        "required": ["text"],
    },
}
The corresponding Python implementation can wrap the CharSwapConverter internally and return the perturbed text.
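A minimal sketch of what such a wrapper might look like. To keep it runnable without PyRIT installed, it uses a hypothetical stand-in (StubSwapConverter) rather than the real CharSwapConverter. It assumes the converter exposes an async convert_async(prompt=..., input_type=...) method returning an object with an output_text attribute, and that custom_functions maps a tool name to an async callable that takes the parsed arguments dict and returns a dict. Both assumptions should be checked against the actual PyRIT interfaces; make_converter_tool and the {"text": ...} return shape are illustrative names, not existing API.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict


@dataclass
class StubResult:
    # Hypothetical stand-in for the converter's result object.
    output_text: str


class StubSwapConverter:
    """Hypothetical stand-in for a PyRIT converter (NOT the real CharSwapConverter).

    Swaps the two characters around the middle of every word longer than 3 chars.
    """

    async def convert_async(self, *, prompt: str, input_type: str = "text") -> StubResult:
        def swap(word: str) -> str:
            if len(word) <= 3:
                return word
            i = len(word) // 2
            return word[: i - 1] + word[i] + word[i - 1] + word[i + 1 :]

        return StubResult(output_text=" ".join(swap(w) for w in prompt.split(" ")))


ToolFn = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]


def make_converter_tool(converter: Any) -> ToolFn:
    """Wrap a converter in an async function that accepts tool-call arguments."""

    async def tool(args: Dict[str, Any]) -> Dict[str, Any]:
        # "text" matches the parameter name declared in the tool schema.
        result = await converter.convert_async(prompt=args["text"], input_type="text")
        return {"text": result.output_text}

    return tool


# Assumed registry shape: tool name -> async callable.
custom_functions = {"apply_char_swap": make_converter_tool(StubSwapConverter())}

if __name__ == "__main__":
    print(asyncio.run(custom_functions["apply_char_swap"]({"text": "hello hi test"})))
```

Because the wrapper only depends on convert_async, the same factory could in principle wrap any converter with that interface, leaving the converters themselves unchanged.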
Additional context
This would integrate cleanly into the existing multi-agent orchestrator pipeline (#930), where:
- The Recon Agent gathers environmental/system info but doesn’t invoke transformations.
- The Strategy Agent analyzes context and decides which tools or mutations to apply.
- The AI Red Team Agent performs the actual tool call, such as invoking a converter, and delivers the perturbed prompt to the target model.
Note: This proposal does not replace or modify existing converters. Instead, it wraps them with lightweight async functions compatible with the Responses API's custom_functions registry. The converters are reused as-is, plugged into a callable interface, so the model can invoke them during conversation turns, just like any other tool.