MAGPIE: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
12 Jun 2024 | Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin
MAGPIE is a novel method for generating large-scale alignment data by synthesizing high-quality instruction data directly from aligned large language models (LLMs). The key observation is that, owing to their auto-regressive nature, aligned LLMs generate plausible user queries when given only the left side of the chat template, up to the position reserved for the user message. Applying this method to Llama-3-8B-Instruct and Llama-3-70B-Instruct yields two instruction datasets: MAGPIE-Air and MAGPIE-Pro. The Llama-3-8B base model was then fine-tuned on these datasets, and the resulting models were evaluated against models trained on other public instruction datasets and with various preference-tuning strategies. Models fine-tuned with MAGPIE achieve superior performance, even surpassing the official Llama-3-8B-Instruct model, which was aligned with over 10 million data points. MAGPIE also outperforms previous public datasets in both data quantity and quality, demonstrating its effectiveness in enhancing the instruction-following capabilities of LLMs. The method is fully automated, cost-effective, and scalable, making it a promising approach for creating high-quality alignment datasets. A sketch of the core extraction step follows.
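To make the key observation concrete, here is a minimal sketch of the instruction-extraction step, assuming access to meta-llama/Meta-Llama-3-8B-Instruct through Hugging Face transformers. The pre-query template matches Llama-3's published chat format; the sampling settings (temperature, top-p, token budget) are illustrative assumptions, not necessarily the paper's exact configuration.

```python
# Sketch of MAGPIE-style instruction extraction: prompt an aligned LLM
# with ONLY the left side of its chat template and let it auto-regressively
# "fill in" a user query. Sampling hyperparameters here are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The "pre-query" template: everything up to the position reserved for the
# user's message, and nothing else. No seed instruction is provided.
pre_query_template = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
)

# The template already contains <|begin_of_text|>, so skip the tokenizer's
# automatic special tokens to avoid duplicating the BOS token.
inputs = tokenizer(
    pre_query_template, add_special_tokens=False, return_tensors="pt"
).to(model.device)

# Because the model is auto-regressive, its most natural continuation of
# this prefix is a plausible user query, which we harvest as a synthetic
# instruction. <|eot_id|> marks the end of the generated turn.
output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=1.0,
    top_p=1.0,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|eot_id|>"),
)
instruction = tokenizer.decode(
    output[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(instruction)
```

In the full pipeline, the harvested instruction would then be wrapped back into the complete chat template and sent to the same model to generate a response, producing an (instruction, response) pair for fine-tuning.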