Training Language Models to Generate Text with Citations via Fine-grained Rewards

27 May 2024 | Chengyu Huang, Zeqiu Wu, Yushi Hu, Wenya Wang
This paper addresses the issue of hallucination and the lack of credible references in responses generated by Large Language Models (LLMs). To improve the quality and credibility of LLM outputs, the authors propose a training framework that uses *fine-grained rewards* to teach LLMs to generate highly supportive and relevant citations. The framework is evaluated on Question Answering (QA) datasets from the ALCE benchmark and further validated on the EXPERTQA dataset. The results show that the proposed method significantly improves LLM performance, even surpassing GPT-3.5-turbo. The paper also provides a systematic analysis of the effectiveness of fine-grained rewards compared with conventional training methods and discusses the generalizability of the trained models.

The main contributions of the paper are:

1. **Fine-grained Rewards**: The authors introduce fine-grained rewards to guide LLMs in generating correct and relevant citations.
2. **Training Framework**: They propose a training framework that uses rejection sampling (RS) and reinforcement learning (RL) to optimize the generation of attributable text (see the sketch after this summary).
3. **Performance Improvement**: The trained LLMs achieve better performance on various evaluation metrics, including correctness recall, citation recall, and citation precision.
4. **Generalizability**: The models remain effective on a separate dataset, EXPERTQA, which requires domain-specific knowledge.

The paper also includes a detailed analysis of the training process, evaluation metrics, and ablation studies on the impact of different reward models and retrieval strategies.
The results demonstrate that fine-grained rewards significantly enhance the performance of LLMs in generating attributable text, making them more reliable and trustworthy.
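To make the reward-guided selection step more concrete, below is a minimal Python sketch of rejection sampling driven by fine-grained citation rewards, loosely following ALCE-style definitions of citation recall and precision. The `Candidate` structure, the `entails` and `correctness` callables, and the equal reward weights are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch: rejection sampling with fine-grained citation rewards.
# The scoring callables (`entails`, `correctness`) are assumed stand-ins for the
# NLI / answer-matching models a real pipeline would use.
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Candidate:
    sentences: Sequence[str]             # sentence-segmented model response
    citations: Sequence[Sequence[int]]   # cited passage indices per sentence


def citation_recall(cand: Candidate, passages: Sequence[str],
                    entails: Callable[[str, str], bool]) -> float:
    """Fraction of sentences fully supported by the concatenation of their citations."""
    if not cand.sentences:
        return 0.0
    supported = sum(
        1
        for sent, cites in zip(cand.sentences, cand.citations)
        if cites and entails(" ".join(passages[i] for i in cites), sent)
    )
    return supported / len(cand.sentences)


def citation_precision(cand: Candidate, passages: Sequence[str],
                       entails: Callable[[str, str], bool]) -> float:
    """Fraction of individual citations that support the sentence they are attached to."""
    pairs = [(passages[i], sent)
             for sent, cites in zip(cand.sentences, cand.citations)
             for i in cites]
    if not pairs:
        return 0.0
    return sum(1 for passage, sent in pairs if entails(passage, sent)) / len(pairs)


def best_candidate(candidates: Sequence[Candidate],
                   passages: Sequence[str],
                   correctness: Callable[[Candidate], float],
                   entails: Callable[[str, str], bool],
                   weights: tuple = (1.0, 1.0, 1.0)) -> Candidate:
    """Rejection sampling: keep the sample with the highest weighted fine-grained reward."""
    w_corr, w_rec, w_prec = weights

    def reward(c: Candidate) -> float:
        return (w_corr * correctness(c)
                + w_rec * citation_recall(c, passages, entails)
                + w_prec * citation_precision(c, passages, entails))

    return max(candidates, key=reward)
```

In a rejection-sampling stage like the one described in the paper, the highest-reward sample per prompt would be retained for further supervised fine-tuning; the same per-component scores can also serve as localized reward signals during the RL stage.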