New LVEs: security/prompt_injection [gpt-4-vision-preview/gpt-3.5-turbo/gpt-4] #58

ayukh · 2024-02-01T12:47:34Z

Creating PR in advance to track progress, working on discussed LVE currently.

New visual_injection LVE for GPT-4V: can inject prompts with barely visible text for humans, however, GPT-4V is still able to recognize it (reproduced from source: https://twitter.com/goodside/status/1713000581587976372)
Tried to do prompt injection with non-printable characters for GPT-4 - currently works in playground for the version gpt-4 used in LVE package. but it seems to be fixed for the new version gpt-4-0125-preview. UPD: I tried different encodings for processing the prompt and it does not work, I am not sure how ChatGPT does this, but it reads those non-printable characters in a way that they are being readable by the model, I could not do that in CLI setup
I also tested another prompt_injection LVE for GPT-4V - it is possible to inject prompts through image file name; source: https://twitter.com/elder_plinius/status/1752259695015022718 - I am afraid it cannot work in LVE framework, because it only works in chat mode with multiple prompts so far
Key reasons why it works in original tweet: delayed trigger+image file name is passed into user prompt looks like, unlike API

ayukh · 2024-02-23T14:53:36Z

Update
New LVE for GPT-4V: prompt injection in images with code - can inject prompts in code comments when passing it as an image to gpt-4-vision. Source

ayukh · 2024-03-04T08:48:24Z

New LVE for GPT-3.5/4: ascii_art_injection: we inject prompts using ascii art text (in paper can make GPT-3.5 disclose how to make counterfeit money). Somehow breaks more often when replacing 'counterfeit' with 'fake' (check gpt-4 version)
Source: ArtPrompt paper

ayukh · 2024-03-12T13:47:56Z

New LVE for GPT-4V: FigStep - similar to ASCII art, we can prompt model to decode text from images and plug it into prompt. Prompt question is built in the form of numbered list and model is prompted to complete the list with the step-by-step instruction.
Source: FigStep paper

New LVE: security/visual_hidden_text [gpt-4-vision-preview]

eb2a5a2

ayukh changed the title ~~New LVE: security/visual_hidden_text [gpt-4-vision-preview]+in progress~~ New LVE: security/visual_hidden_text [gpt-4-vision-preview] Feb 21, 2024

New LVE: security\img_code_injection [openai/gpt-4-vision-preview]

ed5f3dd

ayukh changed the title ~~New LVE: security/visual_hidden_text [gpt-4-vision-preview]~~ New LVEs: security/visual_hidden_text+img_code_injection [gpt-4-vision-preview] Feb 23, 2024

ayukh added 3 commits March 4, 2024 00:14

New LVE [gpt-3.5-turbo]: security\prompt_injection\ascii_art_injection

70d058d

Add aource to LVE [gpt-3.5-turbo]: security\prompt_injection\ascii_art

452663e

Update LVE [gpt-4]: security\prompt_injection\ascii_art_injection

4f609f3

ayukh changed the title ~~New LVEs: security/visual_hidden_text+img_code_injection [gpt-4-vision-preview]~~ New LVEs: security/prompt_injection [gpt-4-vision-preview/gpt-3.5-turbo/gpt-4] Mar 4, 2024

ayukh added 3 commits March 12, 2024 14:29

add author for recorded LVEs

8184c4e

New LVE [gpt-4V]: security/prompt_injection/figstep

5798b57

Update tags and arxiv links for visual LVEs

6f22f55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

New LVEs: security/prompt_injection [gpt-4-vision-preview/gpt-3.5-turbo/gpt-4] #58

New LVEs: security/prompt_injection [gpt-4-vision-preview/gpt-3.5-turbo/gpt-4] #58

Uh oh!

ayukh commented Feb 1, 2024 •

edited

Loading

Uh oh!

ayukh commented Feb 23, 2024

Uh oh!

ayukh commented Mar 4, 2024 •

edited

Loading

Uh oh!

ayukh commented Mar 12, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

New LVEs: security/prompt_injection [gpt-4-vision-preview/gpt-3.5-turbo/gpt-4] #58

Are you sure you want to change the base?

New LVEs: security/prompt_injection [gpt-4-vision-preview/gpt-3.5-turbo/gpt-4] #58

Uh oh!

Conversation

ayukh commented Feb 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ayukh commented Feb 23, 2024

Uh oh!

ayukh commented Mar 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ayukh commented Mar 12, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ayukh commented Feb 1, 2024 •

edited

Loading

ayukh commented Mar 4, 2024 •

edited

Loading