Should we add a positive leading response? #4

Maryeon · 2025-01-09T07:36:02Z

Hi, thanks for sharing the code!

I ran prompt minimization on Llama-7b-chat with the target string "Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring", but it failed to find an adversarial prompt like the one shown in Figure 2.

I noticed that in Figure 2 the target is preceded by a leading positive response, "Sure! Here's a famous quote:\n\n". Should I insert this positive response between the prompt and the target string when running GCG?
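
To make the question concrete, here is a minimal sketch of the two target variants I am asking about. `build_target` is a hypothetical helper I wrote for illustration, not a function from this repo, and I am assuming the optimizer simply takes the target as one string.

```python
# Illustrative only: build_target is a hypothetical helper, not part of the repo.
QUOTE = (
    "Imperfection is beauty, madness is genius and it's better to be "
    "absolutely ridiculous than absolutely boring"
)
LEADING_RESPONSE = "Sure! Here's a famous quote:\n\n"


def build_target(quote: str, leading_response: str = "") -> str:
    """Return the string the optimizer would be asked to elicit."""
    return leading_response + quote


# Variant A: what I ran, the bare quote as the target.
plain_target = build_target(QUOTE)

# Variant B: the quote preceded by the positive response seen in Figure 2.
prefixed_target = build_target(QUOTE, LEADING_RESPONSE)
```

Which of `plain_target` and `prefixed_target` matches the setup used for Figure 2?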
