25 May 2023 | Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
SELF-INSTRUCT is a framework that improves the instruction-following capabilities of pretrained language models by bootstrapping off the model's own generations. The pipeline prompts the model to produce instruction, input, and output samples, filters out invalid or overly similar ones, and uses the surviving data to fine-tune the original model. Applied to GPT3, the method yields a 33% absolute improvement on the SUPER-NATURALINSTRUCTIONS benchmark, putting it on par with InstructGPT 001, which was trained with private user data and human annotations.

The process is semi-automated and requires almost no human labeling: starting from a small set of seed tasks, bootstrapping generates diverse and creative tasks, yielding a synthetic dataset of about 52K instructions. Fine-tuning GPT3 on this data produces GPT3_SELF-INST, which outperforms the original model and other baselines on both standard NLP tasks and novel instruction-following tasks. In human evaluation on these novel tasks, GPT3_SELF-INST trails InstructGPT 001 by only about 5%.

The study highlights the importance of diverse instruction data and demonstrates that SELF-INSTRUCT offers an almost annotation-free way to align pretrained language models with instructions. The authors release the synthetic dataset and a set of novel tasks for future research.
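To make the bootstrapping step concrete, below is a minimal Python sketch of the instruction-generation and similarity-filtering loop. It is illustrative only: the `complete` function is a hypothetical wrapper around the pretrained model's completion API, the prompt template is simplified, and only the ROUGE-L near-duplicate filter (the paper discards new instructions whose ROUGE-L overlap with any existing instruction is 0.7 or higher) is shown; the paper's additional validity filters and the instance-generation step are omitted.

```python
# Minimal sketch of a SELF-INSTRUCT-style bootstrapping iteration.
# Assumptions: `complete(prompt)` is a hypothetical LM completion wrapper,
# and the prompt format is simplified relative to the paper's templates.
import random
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)

def generate_instructions(task_pool, complete, num_new=8, sim_threshold=0.7):
    """Sample in-context tasks from the pool, ask the model for new
    instructions, and keep only those sufficiently dissimilar (ROUGE-L)
    from every instruction already in the pool."""
    seeds = random.sample(task_pool, k=min(6, len(task_pool)))
    prompt = "Come up with a series of tasks:\n" + "\n".join(
        f"Task {i + 1}: {t}" for i, t in enumerate(seeds)
    ) + f"\nTask {len(seeds) + 1}:"
    raw = complete(prompt)  # model continues the numbered list of tasks

    candidates = [line.split(":", 1)[-1].strip()
                  for line in raw.splitlines() if line.strip()]
    accepted = []
    for cand in candidates[:num_new]:
        max_sim = max(
            scorer.score(existing, cand)["rougeL"].fmeasure
            for existing in task_pool
        )
        if max_sim < sim_threshold:  # drop near-duplicates of existing tasks
            accepted.append(cand)
    return accepted
```

In the full pipeline, the accepted instructions would be fed back into the task pool, paired with model-generated inputs and outputs, and the accumulated data used to fine-tune the original model.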