Improving Attributed Text Generation of Large Language Models via Preference Learning

27 Mar 2024 | Dongfang Li, Zetian Sun, Baotian Hu, Zhenyu Liu, Xinshuo Hu, Xuebo Liu, Min Zhang
This paper introduces the Automatic Preference Optimization (APO) framework for improving attributed text generation in large language models (LLMs). To address the challenge of generating reliable, verifiable content, the authors model attribution as a preference learning problem. APO combines a post-training procedure that grounds the base model for attribution with a preference optimization procedure that targets both generation hallucination and attribution hallucination.

For post-training, the authors curate 6,330 examples from existing datasets to teach the model to generate answers with citations. To avoid the cost of manually labeling preference data, they synthesize attribution preference data automatically, producing 95,263 pairs; preference pairs are constructed by considering the relevance of retrieved passages and whether the cited passages support each statement.

Inspired by how humans cite sources, the authors further propose a progressive preference optimization method that exploits fine-grained information, addressing the sparse reward problem and improving the model's ability to generate accurate, well-supported statements. Because the automatically generated preference data induces a distribution shift, the method uses experience replay to alleviate the resulting over-fitting and text degradation.

The framework is evaluated on three datasets: ASQA, StrategyQA, and ELI5. APO achieves state-of-the-art citation F1 scores and improved response quality compared to existing baselines, and it reduces both generation and attribution hallucinations.

The contributions are threefold: the authors are, to their knowledge, the first to apply preference learning to attribution tasks; they establish a full data collection pipeline for attribution; and they propose a progressive preference optimization method that leverages fine-grained signals to mitigate sparse rewards.
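The automatic preference-pair synthesis step can be pictured with a small sketch. The paper's exact scoring models and thresholds are not reproduced here; the snippet below assumes a hypothetical NLI-style checker (`nli.entailment_prob`) that judges whether a cited passage supports a statement, and the `Response` fields and margin rule are illustrative rather than the paper's data format.

```python
# Minimal sketch of automatic attribution-preference-pair construction.
# Assumptions (not from the paper): an NLI-style checker scores whether a cited
# passage supports a statement; responses whose citations are better supported
# are preferred over responses whose citations are not.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Response:
    text: str                  # generated answer with inline citations
    statements: List[str]      # individual claims extracted from the answer
    cited_passages: List[str]  # passage cited for each claim (aligned by index)

def support_score(statement: str, passage: str, nli) -> float:
    """Probability that the cited passage entails the statement (hypothetical NLI wrapper)."""
    return nli.entailment_prob(premise=passage, hypothesis=statement)

def attribution_score(resp: Response, nli) -> float:
    """Average support over all claims; 0.0 if the response cites nothing."""
    if not resp.statements:
        return 0.0
    scores = [support_score(s, p, nli) for s, p in zip(resp.statements, resp.cited_passages)]
    return sum(scores) / len(scores)

def make_preference_pair(resp_a: Response, resp_b: Response, nli,
                         margin: float = 0.2) -> Optional[Tuple[Response, Response]]:
    """Return (chosen, rejected) if the attribution gap is large enough, else None."""
    score_a, score_b = attribution_score(resp_a, nli), attribution_score(resp_b, nli)
    if abs(score_a - score_b) < margin:
        return None  # too close to call; skip ambiguous pairs
    return (resp_a, resp_b) if score_a > score_b else (resp_b, resp_a)
```

In this kind of pipeline, the margin filter trades data volume for label reliability: widening it keeps only pairs where the preferred response is clearly better attributed.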
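The training side of the framework can likewise be sketched. The summary does not spell out the exact objective, so the code below is a minimal sketch assuming a DPO-style preference loss on the synthesized pairs, combined with a replayed supervised loss on the curated post-training data to counter distribution shift; `policy.sequence_logprob` and `policy.lm_loss` are assumed interfaces, not a real library API.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    """Standard DPO objective on (chosen, rejected) per-sequence log-probabilities."""
    logits = beta * ((policy_chosen_logps - ref_chosen_logps)
                     - (policy_rejected_logps - ref_rejected_logps))
    return -F.logsigmoid(logits).mean()

def training_step(pref_batch, replay_batch, policy, ref_model,
                  replay_weight: float = 0.5, beta: float = 0.1):
    """One optimization step: preference loss on synthesized pairs plus a replayed
    supervised (post-training) loss to limit over-fitting and text degradation.
    `policy` / `ref_model` are assumed to expose per-sequence log-probs and an LM loss."""
    with torch.no_grad():
        ref_c = ref_model.sequence_logprob(pref_batch["chosen"])
        ref_r = ref_model.sequence_logprob(pref_batch["rejected"])
    pol_c = policy.sequence_logprob(pref_batch["chosen"])
    pol_r = policy.sequence_logprob(pref_batch["rejected"])
    pref = dpo_loss(pol_c, pol_r, ref_c, ref_r, beta)
    replay = policy.lm_loss(replay_batch)  # standard next-token loss on curated data
    return pref + replay_weight * replay
```

Mixing the replayed supervised term with the preference term is one plausible way to keep the policy anchored to the post-training distribution while it learns from automatically generated preferences.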
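Finally, the citation F1 metric used in the evaluation can be sketched in a few lines. The snippet assumes per-statement support judgments (for recall) and per-citation relevance judgments (for precision) produced by an NLI-style checker, as in common attribution evaluation setups; it is not the paper's exact scorer.

```python
def citation_f1(recall_judgments, precision_judgments):
    """Citation F1 from binary (0/1) judgments: recall_judgments marks whether each
    statement is supported by its citations; precision_judgments marks whether each
    citation is relevant. A minimal sketch of NLI-based attribution evaluation."""
    recall = sum(recall_judgments) / len(recall_judgments) if recall_judgments else 0.0
    precision = sum(precision_judgments) / len(precision_judgments) if precision_judgments else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```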