Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling


27 May 2024 | Yiran Zhao, Wenyue Zheng, Tianle Cai, Xuan Long Do, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh
This paper proposes a new algorithm, Probe Sampling, to accelerate the Greedy Coordinate Gradient (GCG) algorithm, which constructs adversarial prompts that break Large Language Models (LLMs). The main idea is to use a smaller draft model to filter out unpromising prompt candidates, reducing the time cost of the optimization. At each iteration, the algorithm dynamically decides how many candidates to keep based on an agreement score between the draft model and the target model. This significantly reduces GCG's running time while improving the Attack Success Rate (ASR): on Llama2-7b-Chat, Probe Sampling achieves a 3.5× speedup with an ASR of 81.0, versus 69.0 for GCG. Combined with simulated annealing, it reaches a 5.6× speedup with an ASR of 74.0. Probe Sampling also accelerates other prompt optimization and adversarial methods, yielding speedups of 1.8× for AutoPrompt, 2.4× for APE, and 2.4× for AutoDAN.

The algorithm is evaluated on the AdvBench dataset, with Llama2-7b-Chat and Vicuna-v1.3 as target models and the significantly smaller GPT-2 as the draft model. Beyond GCG, applying Probe Sampling to AutoPrompt and APE yields effective acceleration across datasets: for APE, speedups of 2.3× on GSM8K, 1.8× on MMLU, and 3.0× on BBH; for AutoDAN, 2.3× for AutoDAN-GA and 2.5× for AutoDAN-HGA.

The paper also analyzes the computational details of Probe Sampling, including memory and time allocation. Despite the extra procedures and the extra draft model, Probe Sampling uses a similar amount of memory to the original GCG, and because its stages can be parallelized, it adds relatively little overhead. Ablations cover different draft models and probe set sizes; a probe set of size B/16 gives an accurate probe agreement measurement while still achieving significant acceleration. The paper notes two limitations: relatively slow performance on large test sets, and support for open-source models only. It concludes that Probe Sampling is an effective accelerator for GCG that also transfers to other prompt optimization methods and adversarial techniques.
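The filtering step described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the loss functions, the use of Spearman rank correlation for the agreement score, and the rule mapping agreement to the number of kept candidates are all assumptions made for clarity.

```python
import random

def rank(xs):
    # Map each value to its rank (0 = smallest).
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0] * len(xs)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def spearman(a, b):
    # Spearman rank correlation: Pearson correlation of the ranks.
    ra, rb = rank(a), rank(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra) ** 0.5
    vb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

def probe_sampling_step(candidates, draft_loss, target_loss, probe_frac=1 / 16):
    """One hypothetical filtering step: the draft model scores every
    candidate, a small probe set measures draft/target agreement, and
    that agreement decides how many draft-ranked candidates the target
    model re-evaluates. `draft_loss` and `target_loss` stand in for the
    two models' loss computations."""
    B = len(candidates)
    draft_scores = [draft_loss(c) for c in candidates]

    # 1. Probe set (size ~B/16): score a random subset with both models.
    probe = random.sample(range(B), max(2, int(B * probe_frac)))
    probe_draft = [draft_scores[i] for i in probe]
    probe_target = [target_loss(candidates[i]) for i in probe]

    # 2. Agreement score in [0, 1] from the rank correlation (assumed
    #    normalization; the paper's exact formula may differ).
    alpha = (spearman(probe_draft, probe_target) + 1) / 2

    # 3. High agreement -> trust the draft model, keep few candidates.
    k = max(1, int((1 - alpha) * B))
    keep = sorted(range(B), key=lambda i: draft_scores[i])[:k]

    # 4. The target model only evaluates the filtered set.
    best = min(keep, key=lambda i: target_loss(candidates[i]))
    return candidates[best]
```

When the draft and target losses agree perfectly on the probe set, the agreement score is 1 and the target model re-evaluates only a single candidate, which is where the speedup comes from; when they disagree, the filter backs off toward evaluating the full batch, as in plain GCG.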