What's Changed
Targets
- Extend HTTPTarget to allow custom HTTP client
- Added prompt target for OpenAI Sora: OpenAISoraTarget
- Added prompt target for OpenAI responses: OpenAIResponseTarget
Datasets
- Added equitymedqa_dataset
- Added sosbench_dataset
- Added ccp_sensitive_prompts_dataset
- Added medsafetybench_dataset
- Added transphobia_awareness_dataset
- Added jbb_behaviors_dataset
Converters
- DenyListConverter: takes a list of words that will be prohibited from being used in the prompt
- Introduce word-level converter, which provides a reusable foundation that standardizes word selection for transformation and reduces code duplication across similar converters
- SuperscriptConverter, which converts text to superscript
- TextJailBreakConverter
- FirstLetterConverter, which removes all but the first letter of each word in a string
- ImageCompressionConverter, which enables compression of image files to reduce their size while preserving visual quality
- RandomTranslationConverter, which translates each word in a prompt to a random language from a pre-defined or user-provided list of languages
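
To make the word-level idea concrete, here is a minimal sketch of what "select some words, then transform them" and "keep only the first letter of each word" could look like. The function names `convert_words` and `first_letters` and their parameters are hypothetical stand-ins written for illustration; they are not PyRIT's converter classes or signatures, so check the API reference for the real interface.

```python
from typing import Callable, Iterable, Optional


def convert_words(
    text: str,
    transform: Callable[[str], str],
    indices: Optional[Iterable[int]] = None,
) -> str:
    """Apply `transform` to the selected words of `text` (hypothetical helper).

    Words are split on single spaces. If `indices` is None, every word is
    transformed; otherwise only the words at those positions are.
    """
    words = text.split(" ")
    selected = set(range(len(words))) if indices is None else set(indices)
    return " ".join(
        transform(word) if i in selected else word
        for i, word in enumerate(words)
    )


def first_letters(text: str, separator: str = " ") -> str:
    """Keep only the first character of each whitespace-separated word."""
    return separator.join(word[0] for word in text.split())


if __name__ == "__main__":
    print(convert_words("tell me a story", str.upper, indices=[0, 3]))
    print(first_letters("The quick brown fox"))
    print(first_letters("The quick brown fox", separator=""))
```

Keeping word selection separate from the transformation is what lets several converters share one selection routine and differ only in the function they apply.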
Attacks
- Breaking: Refactor orchestration components in favor of executors. See the executors docs for full details on the updated interface
- Allow repetition support in Question Answer Benchmark
- Integrate the XPIA attack with AI Recruiter
- Add Anecdoctor attack which constructs attack prompts based on real-world examples
- Add Adversarial and Pruned Conversations to AttackResult
Scorers
- LookBackScorer: uses entire conversation as scoring context
- PlagiarismScorer: determines whether the content is similar to reference text
- Support for evaluating each scorer
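
As a rough illustration of "similar to reference text", the sketch below computes a Jaccard overlap of word n-grams between a response and a reference. This is one possible similarity metric chosen for the example; the release notes do not say which metric PlagiarismScorer uses, and `ngram_jaccard` is a hypothetical helper, not part of PyRIT.

```python
def ngrams(text: str, n: int = 3) -> set:
    """Return the set of lowercase word n-grams in `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def ngram_jaccard(response: str, reference: str, n: int = 3) -> float:
    """Jaccard similarity of word n-grams, between 0.0 and 1.0.

    Returns 0.0 when either text is shorter than n words.
    """
    a, b = ngrams(response, n), ngrams(reference, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(ngram_jaccard("the cat sat on the mat", reference))
    print(ngram_jaccard("a dog slept near the door", reference))
```

A score near 1.0 suggests the response reproduces long stretches of the reference; a threshold for what counts as plagiarism would have to be chosen per use case.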
Scanner
- Converter, target and scorer support added
Other
- Breaking: Replaced DuckDB with SQLite
- GitHub Copilot Instructions for PyRIT Development
- Added support to analyze the results of an attack
- Extend data exporter to support Markdown
Full list of changes
- MAINT post-v0.9.1.dev0 release updates by @nina-msft in #915
- FEAT Addition of LookBackScorer which scores using the entire conversation as context. by @whackswell in #906
- DOC Update Releasing PyRIT Documentation by @nina-msft in #916
- Breaking FEAT: Refactoring Single turn objective by @rlundeen2 in #892
- FIX: fixing integration tests by @rlundeen2 in #920
- FEAT: Add denylist converter by @hannahwestra25 in #924
- MAINT: Adding DB Schema Diagram by @jbolor21 in #921
- FEAT: Add Converter Support to Scanner by @nina-msft in #882
- FIX: Added Azure Speech dependencies to the Dev Container by @bashirpartovi in #932
- TEST: add test for print_conversation_async with include_auxiliary_scores by @hannahwestra25 in #928
- FEAT Adding flag parameter to LookBackScorer by @whackswell in #918
- [MAINT] Explicit Optional Parameters by @hannahwestra25 in #927
- DOC fix citation for decoding trust dataset by @romanlutz in #937
- FEAT: Equity Med Dataset by @jbolor21 in #922
- MAINT replace pylint dev commit with latest version by @romanlutz in #942
- fix integration test with new PSO method by @romanlutz in #941
- MAINT bump target API versions by @romanlutz in #938
- MAINT bump package versions by @romanlutz in #939
- FIX: Retry bug with single turn retry by @rlundeen2 in #943
- FEAT extend http target to allow custom http client by @ayeganov in #804
- MAINT Clean up Example SeedPrompt Datasets by @nina-msft in #944
- FIX: Change realtime target api_version by @jsong468 in #946
- FEAT: Integrate XPIATestOrchestrator with the AI Recruiter by @KutalVolkan in #684
- FEAT Question answer benchmark repeated question support by @AdrGav941 in #933
- DOC Correcting benchmark orchestrator notebook by @AdrGav941 in #952
- BREAKING FEAT: introduce word-level converter by @paulinek13 in #847
- DOC: add blog post for XPIAOrchestrator with AI Recruiter by @KutalVolkan in #716
- FIX: BinaryConverter convert_word_async by @jsong468 in #953
- Refactoring Orchestrator module as Attacks by @bashirpartovi in #945
- MAINT Removed achieved_objective field from context by @bashirpartovi in #956
- FIX fixing pre-commit for windows by @bashirpartovi in #957
- FEAT deprecated prompt sending and red teaming orchestrators by @bashirpartovi in #955
- FEAT: add superscript converter by @paulinek13 in #818
- FEAT: Adding TextJailBreakConverter by @rlundeen2 in #947
- MAINT standardize logging in GCG attack modules by @saishreyakumar in #966
- FEATURE: New Prompt Target for OpenAI's Sora by @nina-msft in #954
- MAINT: DALLE Content Filter Check by @jbolor21 in #968
- FEAT Crescendo Attack Refactor by @bashirpartovi in #970
- FEAT Simplified Attack Usage by @bashirpartovi in #973
- FIX fixing pre-commit windows job to use cache by @bashirpartovi in #976
- FIX ensure every dataset has an integration test, fix equitymedqa by @romanlutz in #981
- FIX replace orchestrator ID query in prompt shield notebook by @romanlutz in #983
- FIX correct decoding trust data path by @romanlutz in #982
- FIX scanner support for non-text inputs by @romanlutz in #980
- MAINT improved error messages for target validation by @romanlutz in #984
- DOC: improve API reference for prompt_converter module by @paulinek13 in #969
- FEAT adding SOS-Bench dataset by @amandaleesherman in #974
- DOC move class-level docstring arguments to constructor docstring by @romanlutz in #986
- [FEAT] CCP-Sensitive-Prompts Dataset Integration by @awksrj in #959
- MAINT remove unnecessary ABC by @romanlutz in #988
- DOC fixing docstrings for FuzzerOrchestrator by @Sarayu-code in #971
- FIX fix target integration test by setting supports_seed to False for ministral in Azure by @romanlutz in #996
- FEAT: Adding AttackResult to the Database by @rlundeen2 in #995
- FEAT Refactoring TreeOfAttacks with the new AttackStrategy by @bashirpartovi in #992
- FEAT add OpenAI Response Target by @romanlutz in #935
- FEAT: Add anecdoctor orchestrator to build attack prompts from real-world examples. by @migdaepp in #913
- FEAT adding transphobia awareness dataset by @varshini2305 in #989
- FIX: OpenAIChatTarget inheritance by @rlundeen2 in #1001
- FEAT: Scorer Evaluations by @jsong468 in #934
- FIX: Auxiliary scores in PromptSendingAttack by @rlundeen2 in #1004
- FIX RTO was not honoring prepended conversations by @bashirpartovi in #1009
- FIX Use custom prompt regardless of turn count in RedTeamingAttack by @bashirpartovi in #1013
- FEAT: FlipAttack Refactor by @jsong468 in #1010
- FEAT: add image compression converter by @paulinek13 in #1000
- FIX get AzureML pipeline for GCG working again by @romanlutz in #1012
- FEAT: Migrate FuzzerOrchestrator to FuzzerAttack by @bashirpartovi in #1015
- FEAT: ManyShotJailbreak Refactor by @jsong468 in #1017
- MAINT: Deprecation of ManyShotJailbreakOrchestrator by @jsong468 in #1019
- FIX: Small edits to FlipAttack by @jsong468 in #1020
- FIX: Fixing Scorer Memory Add and Validate by @rlundeen2 in #1018
- DOC - Update 1_installation.md by @blahdeblahde in #1011
- Refactored SkeletonKeyOrchestrator as an attack by @bashirpartovi in #1021
- FEAT: Refactor ContextComplianceOrchestrator as ContextComplianceAttack by @nina-msft in #1022
- MAINT: Deprecate ContextComplianceOrchestrator by @nina-msft in #1024
- FEAT Added analyze_results by @Sarayu-code in #1003
- FIX for code scanning alert no. 7 and 8: Workflow does not contain permissions by @romanlutz in #1030
- FEAT AttackResultPrinter interface + console AttackResult printer by @bashirpartovi in #1028
- FEAT added a few improvements to the devcontainer by @bashirpartovi in #1032
- FEAT: Adding Adversarial and Pruned Conversations to AttackResult by @rlundeen2 in #1002
- FIX: Set max_attempts_on_failure in Compliance Context attack by @nina-msft in #1026
- FEAT Make target and scorer args in scanner configurable by @Knight-Ops in #1023
- FIX remove print statements where it should be logging by @romanlutz in #1029
- Refactoring AnecdoctorOrchestrator as AnecdoctorAttack by @bashirpartovi in #1025
- FIX: Handle negative array indices in HTTP target callback regex pattern by @ab-halfspace in #1014
- FIX Conversation Manager should properly apply converters by @bashirpartovi in #1035
- FEAT: Role Playing Orchestrator Refactor into Role Playing Attack by @jbolor21 in #987
- FIX: Stop tracking .vscode/settings.json by @rlundeen2 in #1034
- MAINT: add Anthropic integration tests by @romanlutz in #1038
- FIX: Adding related_conversations to AttackResult for single turn by @rlundeen2 in #1039
- MAINT refactored Scoring Orchestrator by @bashirpartovi in #1044
- FIX: passing in target for update_conversation_state_async by @jsong468 in #1048
- FEAT added batch execution for multi-turn and single-turn attacks by @bashirpartovi in #1051
- FIX: Minor fixes for integration tests by @jsong468 in #1055
- MAINT: Breaking Refactoring MemoryInterface Scores Functions by @rlundeen2 in #1056
- FIX for code scanning alert no. 9: Clear-text logging of sensitive information by @romanlutz in #1058
- DOC: Update Docs to call PromptSendingAttack by @nina-msft in #1050
- FIX (TEMPORARY): Temporary scoring fix to only score first piece of multi-piece PromptRequestResponse by @jsong468 in #1060
- FEAT Restructuring Attacks by @bashirpartovi in #1059
- FEAT Add FirstLetterConverter by @fdubut in #1061
- FEAT Add GitHub Copilot Instructions for PyRIT Development by @bashirpartovi in #1052
- Align SeedPrompt with PromptRequestPiece by @hannahwestra25 in #1053
- MAINT: OPENAI_RESPONSES environment variable change by @jsong468 in #1065
- FEAT Refactored XPIA orchestrator as a workflow by @bashirpartovi in #1062
- [DOC] add SeedPrompt and PromptRequestPiece images and correct alt tags by @hannahwestra25 in #1063
- FEAT Add JailbreakBench/JBB-Behaviors dataset #1008 by @ChirayuXD in #1045
- FEAT: Extend data exporter to support Markdown (#1033) by @1twodrei in #1042
- FEAT Refactored QA Benchmark Orchestrator as a Strategy by @bashirpartovi in #1066
- FIX: Updating api.rst by @jsong468 in #1067
- FIX: Fix fetch_jbb_behaviors_dataset by @jsong468 in #1069
- FEAT: added medsafetybench dataset by @nthsneha in #993
- DOC: Create new Documentation for Orchestrator Doc Pages (Pt. 1) by @nina-msft in #1068
- FIX: Fetch MedSafetyBench fix by @jsong468 in #1074
- FEAT Add separator support for FirstLetterConverter by @fdubut in #1070
- FIX: HuggingFace and SQLAlchemy pre-commit issues by @jsong468 in #1075
- [TEST] replace orchestrator references in tests by @hannahwestra25 in #1072
- FEAT Random translation converter by @fdubut in #1076
- FEAT PlagiarismScorer and demo notebook on probing for copyright violations with the FirstLetterConverter by @blakebullwinkel in #1073
- FIX: Replace expired blob storage links by @jsong468 in #1079
- DOC: Create new Documentation for Orchestrator Doc Pages (Pt. 2) by @nina-msft in #1077
- DOC Add benchmarking cookbook by @fdubut in #1078
- [FIX] fix cli integration tests by @hannahwestra25 in #1082
- DOC/MAINT: Remove all remaining references to orchestrator (except orchestrator_id) by @nina-msft in #1080
- [BREAKING] FEAT: Replacing DuckDB with SQLite by @jbolor21 in #1054
- FEAT Attack Result Markdown Printer by @bashirpartovi in #1081
- [FIX] update custom uuid for azure sql by @hannahwestra25 in #1089
- DOC Create RSS feed for blog by @fdubut in #1090
New Contributors
- @afogel made their first contribution in #857
- @dennis-rall made their first contribution in #880
- @whackswell made their first contribution in #878
- @emmanuel-ferdman made their first contribution in #885
- @devesh-2002 made their first contribution in #834
- @elisetreit made their first contribution in #883
- @0xm00n made their first contribution in #893
- @saishreyakumar made their first contribution in #966
- @amandaleesherman made their first contribution in #974
- @awksrj made their first contribution in #959
- @Sarayu-code made their first contribution in #971
- @migdaepp made their first contribution in #913
- @varshini2305 made their first contribution in #989
- @blahdeblahde made their first contribution in #1011
- @Knight-Ops made their first contribution in #1023
- @ab-halfspace made their first contribution in #1014
- @fdubut made their first contribution in #1061
- @ChirayuXD made their first contribution in #1045
- @1twodrei made their first contribution in #1042
- @nthsneha made their first contribution in #993
Full Changelog: https://github.com/Azure/PyRIT/commits/v0.10.0rc0