27 May 2024 | Chengyu Huang, Zeqiu Wu, Yushi Hu, Wenyu Wang
This paper proposes a training framework that uses fine-grained rewards to teach large language models (LLMs) to generate text with citations, improving both the correctness of responses and the degree to which they are backed by evidence. The authors target the problem of hallucination in LLMs by requiring in-text citations that point to external documents as supporting evidence. Simply prompting LLMs to produce citations has shown limited success, especially with smaller models; the proposed method instead trains models with fine-grained rewards that encourage high-quality citations which genuinely support the generated claims.
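To make the task concrete, the snippet below sketches the expected input/output shape: a question, a few retrieved passages, and a response whose sentences carry bracketed citations to those passages. The example content is illustrative only and not taken from the paper or its datasets.

```python
# Illustrative shape of the citation-generation task (example content is
# invented for illustration, not drawn from the paper's datasets).
example = {
    "question": "When was the Eiffel Tower completed?",
    "passages": [
        "[1] The Eiffel Tower was completed in March 1889 as the entrance arch "
        "to the 1889 World's Fair.",
        "[2] At 330 metres, the Eiffel Tower is the tallest structure in Paris.",
    ],
    # Each sentence should be backed by the passages it cites.
    "response": (
        "The Eiffel Tower was completed in March 1889 [1]. "
        "It stands 330 metres tall, making it the tallest structure in Paris [2]."
    ),
}
```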
The framework combines two training algorithms: rejection sampling (RS) and reinforcement learning (RL). RS uses the rewards to select high-scoring samples from the model's own outputs as labels for supervised fine-tuning, while RL directly optimizes the model against the reward signal. The fine-grained rewards cover three aspects: information correctness, citation recall, and citation precision. They are assigned at the sentence and citation level so that the generated text is both accurate and well supported by its citations.
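To make the reward structure concrete, the sketch below shows one plausible way to compute the three reward aspects at the sentence and citation level. It is a minimal sketch, not the authors' implementation: the `entails` callable stands in for an NLI-style support checker, and the correctness check is reduced to simple string matching against gold answers.

```python
from typing import Callable, Dict, List

def fine_grained_rewards(
    sentences: List[str],                  # sentences of the generated response
    citations: List[List[str]],            # cited passages for each sentence
    gold_answers: List[str],               # reference answers used for correctness
    entails: Callable[[str, str], bool],   # hypothetical NLI-style support checker
) -> Dict[str, float]:
    """Minimal sketch of correctness, citation recall, and citation precision.

    Recall: is each sentence supported by the union of its cited passages?
    Precision: does each individual citation actually contribute support?
    Correctness: how many gold answers are covered by the response?
    """
    recall_hits, precision_hits, total_citations = 0, 0, 0

    for sent, cited in zip(sentences, citations):
        combined = " ".join(cited)
        supported = bool(cited) and entails(combined, sent)
        recall_hits += supported

        for passage in cited:
            total_citations += 1
            # A citation counts as precise if it supports the sentence on its
            # own, or if dropping it would leave the sentence unsupported.
            alone = entails(passage, sent)
            rest = " ".join(p for p in cited if p is not passage)
            needed = supported and not (rest and entails(rest, sent))
            precision_hits += alone or needed

    response = " ".join(sentences)
    return {
        "correctness": sum(a.lower() in response.lower() for a in gold_answers)
        / max(len(gold_answers), 1),
        "citation_recall": recall_hits / max(len(sentences), 1),
        "citation_precision": precision_hits / max(total_citations, 1),
    }
```

In rejection sampling, such scores can rank sampled candidates; in RL, the per-sentence components serve as localized reward signals.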
The authors conducted extensive experiments on the ALCE benchmark and the EXPERTQA dataset, demonstrating that their approach outperforms conventional training methods. On ALCE, the model trained with fine-grained rewards achieved the best results among all compared systems, surpassing even GPT-3.5-turbo. On EXPERTQA, the model generalized well, producing attributable answers to questions that demand deep domain knowledge.
The study also highlights the effectiveness of fine-grained rewards over holistic rewards, showing that they lead to better performance in terms of citation quality and response correctness. The results indicate that smaller LLMs can be trained to outperform larger models like ChatGPT when using fine-grained rewards. The framework is shown to be effective in improving the accuracy and reliability of LLM-generated text, making it a valuable approach for generating text with citations.
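As a rough illustration of how reward granularity changes the learning signal, the sketch below contrasts a holistic reward (one scalar for the whole response) with fine-grained rewards placed at the end of each sentence span, as one might do in a policy-gradient setup. The function and its arguments are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Tuple

def assign_token_rewards(
    sentence_spans: List[Tuple[int, int]],  # (start, end) token indices per sentence
    sentence_rewards: List[float],          # fine-grained reward for each sentence
    seq_len: int,
    holistic: bool = False,
) -> List[float]:
    """Map rewards onto token positions for policy-gradient training (sketch).

    Fine-grained: each sentence's reward lands on that sentence's final token,
    giving the policy a dense, localized signal. Holistic: everything is summed
    into a single scalar on the last token of the response.
    """
    rewards = [0.0] * seq_len
    if holistic:
        rewards[-1] = sum(sentence_rewards)
    else:
        for (start, end), r in zip(sentence_spans, sentence_rewards):
            rewards[end - 1] = r  # credit assigned where the sentence ends
    return rewards
```

With the fine-grained variant, an unsupported sentence lowers the reward exactly where it occurs rather than diluting a single response-level score, which is consistent with the paper's finding that localized feedback yields better citation quality.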