25 May 2023 | Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark
SELF-REFINE is an innovative approach that enables large language models (LLMs) to iteratively refine their outputs through self-feedback. Inspired by human problem-solving, SELF-REFINE involves generating an initial output, using the same LLM to provide feedback on it, and then refining the output based on this feedback. This process is repeated iteratively until a stopping condition is met. The method does not require additional training data or reinforcement learning and relies solely on the LLM for generation, feedback, and refinement.
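To make the loop concrete, here is a minimal sketch of the generate–feedback–refine cycle. Everything in it is an assumption for illustration: `llm(prompt)` is a hypothetical stand-in for whichever LLM completion API is used, the prompt strings are placeholders for the paper's task-specific few-shot prompts, and the string-based stopping check is only a placeholder for a real stopping condition.

```python
# Minimal sketch of a SELF-REFINE-style loop (illustrative, not the paper's code).
# `llm(prompt)` is a hypothetical completion function standing in for any LLM API.

MAX_ITERATIONS = 4  # illustrative iteration budget


def self_refine(task_input: str, llm) -> str:
    # Step 1: generate an initial output with the base model.
    output = llm(f"Task:\n{task_input}\n\nAnswer:")

    for _ in range(MAX_ITERATIONS):
        # Step 2: the same model critiques its own output,
        # prompted to give specific, actionable feedback.
        feedback = llm(
            f"Task:\n{task_input}\n\nAnswer:\n{output}\n\n"
            "Give specific, actionable feedback on this answer."
        )

        # Stopping condition (placeholder): the feedback signals no remaining issues.
        if "no further issues" in feedback.lower():
            break

        # Step 3: refine the output conditioned on the feedback,
        # then feed the refined output back into the next iteration.
        output = llm(
            f"Task:\n{task_input}\n\nPrevious answer:\n{output}\n\n"
            f"Feedback:\n{feedback}\n\nRevised answer:"
        )

    return output
```

The key design point is that a single model, prompted differently, plays all three roles (generator, critic, refiner), so no extra training data, reward model, or fine-tuning is needed.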
The evaluation of SELF-REFINE across seven diverse tasks, including dialog response generation, code optimization, and math reasoning, shows consistent gains over conventional one-step generation from the same model: on average, an absolute improvement of roughly 20% in task performance. The approach remains effective even with state-of-the-art LLMs like GPT-4, showing that output quality can be improved without additional supervised training or reinforcement learning.
SELF-REFINE's effectiveness is attributed to the specific and actionable feedback it generates, which drives refinement more effectively than generic feedback. The method also shows resilience to sub-optimal feedback, making it robust across scenarios. However, the approach has limitations: it requires base models with sufficiently strong few-shot capabilities, and the models used for evaluation are not open-sourced, which limits reproducibility. Despite these limitations, SELF-REFINE offers a promising direction for improving LLM performance through iterative refinement.
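As a purely illustrative contrast (these are not prompts or outputs from the paper), the difference between the specific, actionable feedback SELF-REFINE elicits and a generic alternative might look like this for a code-optimization task:

```python
# Hypothetical feedback strings for a code-optimization task, illustrating
# why specific, actionable feedback gives the refiner something to act on.
SPECIFIC_FEEDBACK = (
    "The nested loop makes the solution O(n^2); replace it with a single "
    "pass that accumulates counts in a dictionary."
)
GENERIC_FEEDBACK = "The code could be more efficient."
```

Feedback of the first kind names the problem and a concrete fix, which is what allows the refinement step to improve the output rather than merely restate it.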