Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

27 Jun 2024 | Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Tingyang Xu, Bingzhe Wu, Zhong Zhang, Liu Liu, Yatao Bian, Xueqian Wang, Peilin Zhao
This paper explores the effectiveness of bootstrapping self-alignment in large language models (LLMs) to reduce the cost of human annotation while maintaining or improving model performance. The authors investigate the impact of in-context learning (ICL) examples on the self-alignment process and find that diverse and informative ICL examples are crucial for effective bootstrapping. They propose Step-On-Feet Tuning (SOFT), a method that leverages the model's continuously improving few-shot ability to enhance its zero- and one-shot performance.

Key contributions include:

1. **Diverse ICL examples**: The authors emphasize the importance of diverse ICL examples to prevent overfitting and improve model performance.
2. **Multi-round bootstrapping**: They demonstrate that multi-round bootstrapping significantly enhances model performance compared to single-round approaches.
3. **Easy-to-hard training order**: Adjusting the training order from easy to hard tasks further improves model performance by reducing error accumulation.
4. **Early-stop mechanism**: They introduce a validation set to detect potential model collapse and implement early stopping to prevent further performance degradation.

The paper includes extensive experiments on various benchmarks, showing that SOFT outperforms single-round self-alignment and even some distilled models. The authors conclude that bootstrapping self-alignment is effective when provided with diverse and fresh ICL examples, and that SOFT significantly reduces the need for human annotation while improving model performance.
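The contributions above can be read as one training loop: each round samples fresh, diverse ICL demonstrations, runs a self-alignment step over an easy-to-hard task ordering, and halts if a validation score degrades. The following is a minimal sketch of that loop, not the authors' implementation; the function names (`sample_icl_examples`, `self_align_round`, `bootstrap_soft`) are hypothetical, and the "labeling + fine-tuning" step is a toy stand-in for prompting the current model with few-shot examples and fine-tuning on its own responses.

```python
import random

def sample_icl_examples(pool, k, rng):
    """Draw k fresh ICL demonstrations each round; the paper reports that
    reusing the same examples leads to overfitting, so we resample."""
    return rng.sample(pool, k)

def self_align_round(model_score, icl_examples, tasks):
    """Toy stand-in for one self-alignment round: in practice this would
    label `tasks` with the current model (conditioned on `icl_examples`)
    and fine-tune on the labels. Here we just nudge a scalar score up,
    scaled by how diverse the demonstrations are."""
    diversity = len(set(icl_examples)) / len(icl_examples)
    return model_score + 0.1 * diversity * len(tasks) / 10

def bootstrap_soft(tasks, icl_pool, rounds=5, k=4, seed=0):
    """Multi-round bootstrapping with easy-to-hard ordering and early stop."""
    rng = random.Random(seed)
    # Easy-to-hard curriculum: order tasks by an assumed difficulty key.
    tasks = sorted(tasks, key=lambda t: t["difficulty"])
    score, best = 0.0, 0.0
    for _ in range(rounds):
        icl = sample_icl_examples(icl_pool, k, rng)
        score = self_align_round(score, icl, tasks)
        # Early stop: a drop on the validation score is treated as
        # model collapse, so we halt and keep the best checkpoint.
        if score < best:
            break
        best = score
    return best
```

The key design point the sketch captures is that the ICL pool is resampled every round rather than fixed once, which is what the paper argues separates effective bootstrapping from a loop that collapses.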