Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

11 Jun 2024 | Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng
This paper introduces a self-alignment framework, *Self-Alignment for Factuality*, to mitigate hallucinations in large language models (LLMs). The framework leverages an LLM's self-evaluation capability to improve factual accuracy. Specifically, it incorporates a self-evaluation component, *SELF-EVAL*, which prompts the LLM to validate the factuality of its own generated responses based on its internal knowledge. Additionally, *Self-Knowledge Tuning* (SK-Tuning) augments the LLM's self-evaluation ability by improving its confidence estimation and calibration. The approach is evaluated on three knowledge-intensive tasks, namely Multi-Choice Question-Answering (MCQA), short-form open-ended generation, and long-form open-ended generation, using the TruthfulQA and BioGEN datasets. Results show that *Self-Alignment for Factuality* significantly improves the factual accuracy of LLaMA-family models, outperforming existing methods that rely on high-quality human annotations or consistency-based confidence estimation. The paper also discusses the effectiveness of SK-Tuning in enhancing confidence estimation and the potential for further applications in various domains.
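To make the self-evaluation idea concrete, the sketch below shows one way such a SELF-EVAL step could be wired up. It is an illustrative approximation, not the authors' implementation: the prompt wording and the `generate` / `token_probability` helpers are hypothetical stand-ins for whatever inference API serves the model, and the score is simply the probability the model assigns to "True" when asked whether its own answer is factually correct.

```python
# A minimal sketch of the SELF-EVAL idea described above: the model first
# answers a question, then is prompted to judge whether its own answer is
# factually correct, and the probability it assigns to "True" is taken as a
# self-evaluated confidence score. `generate` and `token_probability` are
# hypothetical helpers standing in for an actual LLM inference API; the
# prompt template is illustrative, not the paper's exact wording.

def self_eval_confidence(question: str, generate, token_probability):
    """Generate an answer, then ask the same model to validate it."""
    answer = generate(f"Question: {question}\nAnswer:")

    eval_prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {answer}\n"
        "Is the proposed answer factually correct? Answer True or False:"
    )
    # Confidence = P("True") under the model's own next-token distribution.
    confidence = token_probability(eval_prompt, " True")
    return answer, confidence
```

Confidence scores obtained this way could then, for example, be used to select factually preferred responses when constructing SK-Tuning training data; the exact selection rule would follow the paper rather than this sketch.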