Commit 5c344a8

Add revision for 2025.acl-long.805 (closes #5667)
1 parent 65672da commit 5c344a8

File tree

1 file changed: +3 −1 lines
data/xml/2025.acl.xml

Lines changed: 3 additions & 1 deletion
@@ -11755,9 +11755,11 @@
       <author><first>Suhyun</first><last>Kim</last><affiliation>Kyung Hee University</affiliation></author>
       <pages>16489-16507</pages>
       <abstract>We introduce a novel framework for consolidating multi-turn adversarial “jailbreak” prompts into single-turn queries, significantly reducing the manual overhead required for adversarial testing of large language models (LLMs). While multi-turn human jailbreaks have been shown to yield high attack success rates (ASRs), they demand considerable human effort and time. Our proposed Multi-turn-to-Single-turn (M2S) methods—Hyphenize, Numberize, and Pythonize—systematically reformat multi-turn dialogues into structured single-turn prompts. Despite eliminating iterative back-and-forth interactions, these reformatted prompts preserve and often enhance adversarial potency: in extensive evaluations on the Multi-turn Human Jailbreak (MHJ) dataset, M2S methods yield ASRs ranging from 70.6 % to 95.9 % across various state-of-the-art LLMs. Remarkably, our single-turn prompts outperform the original multi-turn attacks by up to 17.5 % in absolute ASR, while reducing token usage by more than half on average. Further analyses reveal that embedding malicious requests in enumerated or code-like structures exploits “contextual blindness,” undermining both native guardrails and external input-output safeguards. By consolidating multi-turn conversations into efficient single-turn prompts, our M2S framework provides a powerful tool for large-scale red-teaming and exposes critical vulnerabilities in contemporary LLM defenses. All code, data, and conversion prompts are available for reproducibility and further investigations: https://github.com/Junuha/M2S_DATA</abstract>
-      <url hash="a01d9a46">2025.acl-long.805</url>
+      <url hash="d849707e">2025.acl-long.805</url>
       <bibkey>ha-etal-2025-one</bibkey>
       <doi>10.18653/v1/2025.acl-long.805</doi>
+      <revision id="1" href="2025.acl-long.805v1" hash="a01d9a46"/>
+      <revision id="2" href="2025.acl-long.805v2" hash="d849707e" date="2025-09-05">Title update.</revision>
     </paper>
     <paper id="806">
       <title><fixed-case>RAE</fixed-case>mo<fixed-case>LLM</fixed-case>: Retrieval Augmented <fixed-case>LLM</fixed-case>s for Cross-Domain Misinformation Detection Using In-Context Learning Based on Emotional Information</title>
