[slides] Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization

This paper introduces Stochastic RAG, a novel approach for end-to-end optimization of retrieval-augmented generation (RAG) models. Stochastic RAG relaxes the simplifying assumptions of marginalization and document independence, which are commonly made in prior work. By casting the retrieval process as a stochastic sampling without replacement process, Stochastic RAG employs straight-through Gumbel-top-k to provide a differentiable approximation for sampling without replacement, enabling effective end-to-end optimization. Extensive experiments on seven diverse datasets, including open-domain question answering, fact verification, slot-filling for relation extraction, and dialogue systems, demonstrate significant improvements over state-of-the-art methods on six out of seven datasets. The proposed optimization method is applied to FiD-Light, a recent and effective RAG model, and shows substantial advancements in performance. The paper also discusses the robustness of Stochastic RAG to the number of samples used for estimating the retrieval probabilities and highlights its potential for enhancing the diversity of generated outputs.
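
To make the sampling step in the abstract concrete, the following is a minimal sketch of Gumbel-top-k sampling without replacement, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `gumbel_top_k`, its parameters, and the example scores are hypothetical, and only the forward sampling step is covered. The straight-through part, in which the hard selection is used in the forward pass while gradients flow through a soft relaxation, requires an automatic differentiation framework and is not shown here.

```python
import math
import random


def gumbel_top_k(scores, k, rng=None):
    """Sample k distinct indices without replacement via Gumbel-top-k.

    Each score is perturbed with independent Gumbel(0, 1) noise, and the
    indices of the k largest perturbed scores are returned in descending
    order of perturbed score. Names and signature are illustrative only.
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    rng = rng or random.Random()

    perturbed = []
    for index, score in enumerate(scores):
        # random() returns a value in [0, 1); clamp away from 0 to avoid log(0).
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((score + gumbel_noise, index))

    # Keep the k indices with the largest perturbed scores.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


# Hypothetical retrieval scores for five candidate documents.
retrieval_scores = [2.0, 0.5, 1.2, -0.3, 0.9]
sampled = gumbel_top_k(retrieval_scores, k=3, rng=random.Random(0))
```

Because every document receives its own noise term and the top k are taken in a single pass, no document can be selected twice. Documents with higher scores are more likely to be chosen, but lower-scored documents can still appear in a sample.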

Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization

July 14–18, 2024, Washington, DC, USA | Hamed Zamani, Michael Bendersky