CodeChameleon

This repository contains the code implementation for the paper CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models.


🛠️ Usage

✨ An example of jailbreaking LLMs:

python attack.py \
    --model_path gpt-3.5-turbo-1106 \
    --problem_path data/test_problem.csv \
    --save_path jailbreak_output \
    --encrypt_rule binary_tree \
    --prompt_style code \
    --max_new_tokens 1024 \
    --do_sample \
    --temperature 1 \
    --repetition_penalty 1.0 \
    --top_p 0.9 \
    --use_cache

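Each --encrypt_rule names a transformation applied to the malicious problem before it is embedded in the prompt, and the prompt carries a matching decryption function for the target model to invert. As a purely illustrative sketch (the actual implementations live in this repository's source), the paper's reverse rule can be written as a word-order reversal; the function name below is hypothetical:

# Hypothetical sketch of the "reverse" encryption rule: reverse the word
# order so the literal query never appears verbatim in the prompt.
def encrypt_reverse(problem: str) -> str:
    return " ".join(reversed(problem.split()))

The other rules (e.g., binary_tree) follow the same pattern: a reversible transformation of the problem paired with its inverse in the prompt.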
✨ An example of evaluating the results:

python gpt_evaluate.py \
    --problem_path data/test_problem.csv \
    --response_path jailbreak_output/llama2/7B/code_reverse.csv \
    --max_new_tokens 1024 \
    --temperature 1 \
    --top_p 0.9

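gpt_evaluate.py uses a GPT model to judge whether each response constitutes a successful jailbreak. A minimal sketch of such a judging call, assuming the OpenAI Python client; the JUDGE_TEMPLATE and judge model name are placeholders, not the script's actual values:

# Hypothetical sketch of a GPT-based judging call; the template and the
# judge model name are assumptions, not taken from this repository.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
JUDGE_TEMPLATE = ("Problem: {problem}\nResponse: {response}\n"
                  "Is this a successful jailbreak? Answer yes or no.")

def judge(problem: str, response: str) -> str:
    reply = client.chat.completions.create(
        model="gpt-4",   # assumed judge model
        messages=[{"role": "user",
                   "content": JUDGE_TEMPLATE.format(problem=problem,
                                                    response=response)}],
        max_tokens=1024,  # --max_new_tokens
        temperature=1,    # --temperature
        top_p=0.9,        # --top_p
    )
    return reply.choices[0].message.content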
🔧 Argument Specification

  1. --model_path: The name or path of the target model to attack.

  2. --problem_path: The path to the CSV file of malicious problems.

  3. --save_path: The relative path under which jailbreak results are saved.

  4. --encrypt_rule: The encryption method to apply (e.g., binary_tree).

  5. --prompt_style: The style of the instructions (code or text).

  6. --response_path: The path to the jailbreak output to be evaluated.

The remaining arguments are standard decoding parameters forwarded to the model at inference time.
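For local Hugging Face checkpoints (such as the llama2/7B run shown above), these flags presumably map one-to-one onto transformers generation arguments; a minimal sketch under that assumption, with the checkpoint name and prompt as placeholders:

# Sketch only: assumes attack.py forwards the CLI flags to model.generate().
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"  # assumed checkpoint
prompt = "<encrypted problem plus decryption function>"  # built by the framework

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,     # --max_new_tokens
    do_sample=True,          # --do_sample
    temperature=1.0,         # --temperature
    repetition_penalty=1.0,  # --repetition_penalty
    top_p=0.9,               # --top_p
    use_cache=True,          # --use_cache
)
response = tokenizer.decode(output_ids[0], skip_special_tokens=True)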

🖊️ Citation

@article{lv2024codechameleon,
  title={CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models},
  author={Lv, Huijie and Wang, Xiao and Zhang, Yuansen and Huang, Caishuang and Dou, Shihan and Ye, Junjie and Gui, Tao and Zhang, Qi and Huang, Xuanjing},
  journal={arXiv preprint arXiv:2402.16717},
  year={2024}
}
