27 Jun 2024 | Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Tingyang Xu, Bingzhe Wu, Zhong Zhang, Liu Liu, Yatao Bian, Xueqian Wang, Peilin Zhao
This paper introduces Step-On-Feet Tuning (SOFT), a method for improving the self-alignment of large language models (LLMs) through bootstrapping. The key idea is to leverage the model's continuously improving few-shot ability to boost its zero-shot and one-shot performance. SOFT addresses the challenge of maintaining model quality during repeated self-training, which can lead to model collapse if not managed properly. The method combines a diverse in-context learning (ICL) example pool, training in an easy-to-hard order, and a validation set for detecting potential model collapse, as sketched below.
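To make the bootstrapping loop concrete, here is a minimal Python sketch. The helper names (`generate`, `finetune`, `icl_pool`) are assumptions for illustration only, not the paper's actual API; in practice these would call an LLM and a fine-tuning routine.

```python
import random

def generate(model: str, prompt: str, demos: list[str]) -> str:
    """Answer `prompt` conditioned on few-shot ICL demonstrations (stub)."""
    return f"{model} response to {prompt}"  # placeholder

def finetune(model: str, dataset: list[tuple[str, str]]) -> str:
    """Fine-tune the model on self-generated (prompt, response) pairs (stub)."""
    return model + "+1round"  # placeholder for an updated checkpoint

icl_pool = [f"demo_{i}" for i in range(32)]    # diverse ICL example pool
prompts = [f"prompt_{i}" for i in range(100)]  # unlabeled prompts
model = "base-llm"

for round_idx in range(3):  # several bootstrapping rounds
    dataset = []
    for prompt in prompts:
        # Re-sample demonstrations each time so the self-generated
        # supervision stays diverse across rounds.
        demos = random.sample(icl_pool, k=4)
        dataset.append((prompt, generate(model, prompt, demos)))
    # The round-(t+1) model is trained on the round-t model's outputs,
    # so its improved few-shot ability feeds the next round.
    model = finetune(model, dataset)
```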
The study finds that bootstrapping self-alignment significantly outperforms single-round approaches, especially when the ICL examples are diverse and informative. Adjusting the training order and drawing from a carefully curated ICL example pool further improves performance. However, performance may decline in later rounds, which the authors attribute to the data processing inequality and an increasingly sharp output distribution. To mitigate this, a validation set is used to detect degradation and prevent further training once collapse sets in; a sketch of both safeguards follows.
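The following self-contained sketch illustrates the two safeguards together: easy-to-hard ordering and a validation check against collapse. The `difficulty`, `validate`, and `finetune` functions are hypothetical stand-ins, not the paper's actual implementation.

```python
import math

def difficulty(prompt: str) -> float:
    return float(len(prompt))  # placeholder proxy for prompt difficulty

def validate(model: str) -> float:
    return 0.8  # placeholder: alignment score on a held-out validation set

def finetune(model: str, batch: list[str]) -> str:
    return model + "+1round"  # placeholder for an updated checkpoint

prompts = [f"prompt_{i}" * (i % 5 + 1) for i in range(90)]
ordered = sorted(prompts, key=difficulty)  # easy examples first
n_rounds = 3
chunk = math.ceil(len(ordered) / n_rounds)

model = "base-llm"
best_model, best_score = model, validate(model)
for r in range(n_rounds):
    # Each round trains on a progressively harder slice of the data.
    batch = ordered[r * chunk:(r + 1) * chunk]
    model = finetune(model, batch)
    score = validate(model)
    if score < best_score:
        # A drop on the validation set signals possible collapse:
        # roll back to the last good checkpoint and stop bootstrapping.
        model = best_model
        break
    best_model, best_score = model, score
```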
SOFT is evaluated on multiple benchmarks, including HHH Eval, TruthfulQA, AlpacaEval, Vicuna Bench, and MT-Bench. The results show that SOFT outperforms existing self-aligned models across these tasks. The method improves both alignment and self-training performance while underscoring the importance of data diversity and careful training management to avoid collapse. The study contributes to the understanding of self-training loops and provides a practical approach for enhancing LLM self-alignment through bootstrapping.