18 Jun 2024 | Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei Li, William Yang Wang
This paper investigates self-bias in large language models (LLMs) during self-refinement, defined as the tendency of an LLM to favor its own generated outputs. Using two statistical measures, bias and distance skewness, we analyze six LLMs (GPT-4, GPT-3.5, Gemini, LLaMA2, Mixtral, and DeepSeek) across three tasks: machine translation, constrained text generation, and mathematical reasoning. We find that self-bias is prevalent in all examined LLMs across multiple languages and tasks: while the self-refine pipeline improves the fluency and understandability of model outputs, it also amplifies self-bias, which can lead to false-positive corrections and reduced diversity in text generation. To mitigate this bias, we examine two remedies: increasing model size and incorporating external feedback with accurate assessment. Larger models are more resistant to self-bias, and external feedback significantly reduces it, yielding performance improvements on downstream tasks. The study contributes to the understanding of self-bias in LLMs and offers insights into improving their performance through self-refinement.
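As a rough illustration of the two statistical measures, below is a minimal Python sketch. It assumes bias is the mean gap between a model's self-assigned scores and reference (true) quality scores, and that distance skewness follows Székely and Móri's definition applied to those gaps; the function names and scoring setup are illustrative, not taken from the paper.

```python
import numpy as np

def bias(self_scores, true_scores):
    """Mean gap between self-assigned scores and reference quality scores
    (assumed definition). Positive values suggest the model over-rates
    its own outputs."""
    d = np.asarray(self_scores, dtype=float) - np.asarray(true_scores, dtype=float)
    return d.mean()

def distance_skewness(self_scores, true_scores):
    """Distance skewness (Székely & Móri) of the score gaps d = self - true.
    Equals 0 for gaps distributed symmetrically about the origin and
    approaches 1 as the gaps become maximally asymmetric
    (e.g., consistently positive)."""
    d = np.asarray(self_scores, dtype=float) - np.asarray(true_scores, dtype=float)
    diff = np.abs(d[:, None] - d[None, :]).sum()   # sum of pairwise |d_i - d_j|
    summ = np.abs(d[:, None] + d[None, :]).sum()   # sum of pairwise |d_i + d_j|
    return 1.0 - diff / summ if summ > 0 else 0.0

# Toy usage: a model that rates its own outputs higher than the references do.
self_eval = [0.9, 0.8, 0.85, 0.95]
reference = [0.6, 0.7, 0.65, 0.7]
print(bias(self_eval, reference))              # > 0: systematic over-rating
print(distance_skewness(self_eval, reference)) # near 1: one-sided gaps
```

In this reading, a positive bias indicates systematic over-rating of the model's own outputs, while a distance skewness near 1 indicates that the over-rating is one-sided rather than random noise.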