25 May 2023 | Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
SELF-INSTRUCT is a framework for improving the instruction-following capabilities of pre-trained language models by bootstrapping off the model's own generations. Starting from a small pool of manually written seed tasks, the model is prompted to generate new instructions and corresponding input-output instances; low-quality and near-duplicate generations are filtered out, and the remainder are used to fine-tune the original model. Applied to GPT3, SELF-INSTRUCT yields a 33% absolute improvement over the original model on the SUPER-NATURALINSTRUCTIONS benchmark, on par with InstructGPT, which was trained with private user data and human annotations. In human evaluation on a set of expert-written instructions for novel tasks, SELF-INSTRUCT outperforms models trained on existing public instruction datasets by a large margin, leaving only a 5% absolute gap behind InstructGPT. The framework thus offers an almost annotation-free method for aligning pre-trained language models with instructions, and the authors release their large synthetic dataset to facilitate future research.
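To make the bootstrapping loop concrete, here is a minimal sketch in Python. The 6-human + 2-model in-context mix and the ROUGE-L < 0.7 novelty filter follow the paper; everything else, including the hypothetical `complete()` wrapper around the base model's text-completion API and the prompt wording, is illustrative rather than the authors' exact implementation.

```python
import random

# Hypothetical stand-in for a text-completion call to the base model
# (e.g., GPT3); plug in any completion endpoint here.
def complete(prompt: str) -> str:
    raise NotImplementedError("wire up your model's completion API")

def rouge_l_similarity(a: str, b: str) -> float:
    """ROUGE-L F1 via longest common subsequence of tokens; a lightweight
    proxy for the similarity filter described in the paper."""
    xs, ys = a.split(), b.split()
    dp = [[0] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[len(xs)][len(ys)]
    return 2 * lcs / max(len(xs) + len(ys), 1)

def self_instruct(seed_tasks: list[str], target_size: int = 100) -> list[dict]:
    """Bootstrap an instruction-tuning dataset from a small seed pool."""
    human_pool = list(seed_tasks)   # manually written seed instructions
    model_pool: list[str] = []      # instructions generated by the model
    dataset: list[dict] = []
    while len(model_pool) < target_size:
        # Step 1: prompt with a mix of human-written and model-generated
        # instructions (the paper samples 6 human + 2 model-generated).
        demos = random.sample(human_pool, min(6, len(human_pool)))
        demos += random.sample(model_pool, min(2, len(model_pool)))
        prompt = "Come up with a new task:\n" + "\n".join(
            f"Task: {d}" for d in demos) + "\nTask:"
        new_instruction = complete(prompt).strip()
        # Step 2: keep only novel instructions (paper threshold:
        # ROUGE-L < 0.7 against every instruction already in the pool).
        if any(rouge_l_similarity(new_instruction, old) >= 0.7
               for old in human_pool + model_pool):
            continue
        # Step 3: generate an input-output instance for the instruction.
        instance = complete(f"Task: {new_instruction}\nInput:")
        model_pool.append(new_instruction)
        dataset.append({"instruction": new_instruction, "instance": instance})
    return dataset  # this synthetic data is used to fine-tune the original model
```

The loop makes the key design choice visible: each round feeds a few of the model's own accepted generations back into the prompt, so diversity compounds over time while the similarity filter keeps the pool from collapsing onto near-duplicates.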