Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

11 Jun 2024 | Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng
This paper introduces Self-Alignment for Factuality, a framework that leverages an LLM's self-evaluation ability to mitigate hallucinations without requiring external knowledge or human intervention. The approach first uses SELF-EVAL to elicit confidence scores for the factuality of the model's own generated responses; these scores are then used as training signals that steer the model toward enhanced factuality. To further strengthen the model's self-evaluation ability, SK-TUNING is introduced to improve its confidence estimation and calibration.

The framework is evaluated on three knowledge-intensive tasks, MCQA, short-form open-ended generation, and long-form open-ended generation, using the TruthfulQA and BioGEN datasets. Across all three tasks, the proposed self-alignment approach substantially improves factual accuracy, outperforming existing methods such as representation-editing and consistency-based confidence approaches. The results highlight the importance of self-evaluation for improving factuality and the potential of the framework for broader application across domains. The paper concludes that the proposed self-alignment approach offers a promising starting point for investigating LLMs' self-alignment toward factuality.
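As a rough illustration of the pipeline summarized above, the sketch below shows how a SELF-EVAL-style prompt could elicit a factuality confidence score for a sampled answer, and how such scores might be paired into preference data for DPO-style fine-tuning. The prompt wording, the lm_choice_probs helper, and the pairing heuristic are hypothetical stand-ins for illustration only, not the authors' released implementation.

```python
"""Minimal sketch of a SELF-EVAL-style confidence step (illustrative only).

Assumptions not taken from the paper's code:
- `lm_choice_probs(prompt, choices)` is a hypothetical helper returning the
  model's normalized probability over candidate answer tokens.
- The prompt template and the preference-pair heuristic are placeholders.
"""

from typing import Callable, Dict, List, Tuple

SELF_EVAL_TEMPLATE = (
    "Question: {question}\n"
    "Proposed answer: {answer}\n"
    "Is the proposed answer factually correct?\n"
    "(A) Yes\n(B) No\nAnswer:"
)


def self_eval_confidence(
    question: str,
    answer: str,
    lm_choice_probs: Callable[[str, List[str]], Dict[str, float]],
) -> float:
    """Elicit a factuality confidence score: the model's probability of 'Yes'."""
    prompt = SELF_EVAL_TEMPLATE.format(question=question, answer=answer)
    probs = lm_choice_probs(prompt, ["A", "B"])
    return probs["A"]


def build_preference_pairs(
    question: str,
    sampled_answers: List[str],
    lm_choice_probs: Callable[[str, List[str]], Dict[str, float]],
    margin: float = 0.2,
) -> List[Tuple[str, str]]:
    """Pair high- vs. low-confidence answers as (preferred, rejected) examples
    that a DPO-style training objective could consume."""
    scored = sorted(
        ((self_eval_confidence(question, a, lm_choice_probs), a) for a in sampled_answers),
        reverse=True,
    )
    pairs: List[Tuple[str, str]] = []
    for hi_score, hi_ans in scored:
        # Pair each answer with the lowest-confidence answer at least `margin` below it.
        for lo_score, lo_ans in reversed(scored):
            if hi_score - lo_score >= margin:
                pairs.append((hi_ans, lo_ans))
                break
    return pairs


if __name__ == "__main__":
    # Toy stand-in for an LLM so the sketch runs end to end.
    def toy_lm(prompt: str, choices: List[str]) -> Dict[str, float]:
        p_yes = 0.9 if "Paris" in prompt else 0.3
        return {"A": p_yes, "B": 1.0 - p_yes}

    question = "What is the capital of France?"
    answers = ["Paris", "Lyon"]
    print(build_preference_pairs(question, answers, toy_lm))  # [('Paris', 'Lyon')]
```

In the framework described by the paper, these confidence scores come from the model being aligned itself, and SK-TUNING is intended to make such scores better calibrated before they are used as training signal; the above is only a schematic of where self-evaluation fits in that loop.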