Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

12 Apr 2021 | Patrick Lewis†‡, Ethan Perez*, Aleksandra Piktus†, Fabio Petroni†, Vladimir Karpukhin†, Naman Goyal†, Heinrich Küttler†, Mike Lewis†, Wen-tau Yih†, Tim Rocktäschel†‡, Sebastian Riedel†‡, Douwe Kiela†
Retrieval-Augmented Generation (RAG) is a method that combines pre-trained parametric and non-parametric memory for language generation. RAG models use a pre-trained seq2seq model as the parametric memory and a dense vector index of Wikipedia as the non-parametric memory, accessed via a pre-trained neural retriever. Two RAG formulations are explored: RAG-Sequence, which conditions on the same retrieved passage for the entire generated sequence, and RAG-Token, which can draw on a different passage for each token. Fine-tuned on a wide range of knowledge-intensive NLP tasks, RAG achieves state-of-the-art results on open-domain QA, outperforming both parametric seq2seq models and task-specific retrieve-and-extract architectures. For language generation tasks, including Jeopardy question generation, RAG produces more specific, diverse, and factual language than a state-of-the-art parametric-only seq2seq baseline (BART). On fact verification, RAG comes within 4.3% of state-of-the-art pipeline models. Because the non-parametric memory is an external index, it can be replaced or updated as the world changes without retraining the generator. Together, these results highlight the benefits of combining parametric and non-parametric memory across a wide range of knowledge-intensive NLP applications.
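The two formulations differ only in where the marginalization over retrieved passages happens: RAG-Sequence marginalizes once per output sequence, while RAG-Token marginalizes at every decoding step. Below is a minimal Python sketch of that difference using toy retriever and generator probabilities; the variable names and numbers are illustrative assumptions, not the authors' implementation.

    # Toy illustration of the two RAG marginalizations over K retrieved passages.
    # p_retr[z] stands in for the retriever score p(z | x); p_gen[z][i] stands in
    # for the generator's probability of the i-th target token given passage z.
    # All names and numbers are made up for illustration.

    p_retr = [0.6, 0.3, 0.1]          # retrieval distribution over K = 3 passages
    p_gen = [                         # per-passage token probabilities (T = 2 tokens)
        [0.9, 0.8],                   # passage 0
        [0.5, 0.4],                   # passage 1
        [0.2, 0.1],                   # passage 2
    ]

    # RAG-Sequence: one passage is used for the whole output, so we marginalize
    # at the sequence level:  p(y|x) = sum_z p(z|x) * prod_i p(y_i | x, z, y_<i)
    def rag_sequence(p_retr, p_gen):
        total = 0.0
        for pz, toks in zip(p_retr, p_gen):
            seq_prob = 1.0
            for pt in toks:
                seq_prob *= pt
            total += pz * seq_prob
        return total

    # RAG-Token: a different passage can inform each token, so we marginalize
    # at every step:  p(y|x) = prod_i sum_z p(z|x) * p(y_i | x, z, y_<i)
    def rag_token(p_retr, p_gen):
        total = 1.0
        for i in range(len(p_gen[0])):
            total *= sum(pz * toks[i] for pz, toks in zip(p_retr, p_gen))
        return total

    print("RAG-Sequence p(y|x):", rag_sequence(p_retr, p_gen))  # ≈ 0.494
    print("RAG-Token    p(y|x):", rag_token(p_retr, p_gen))     # ≈ 0.71 * 0.61 ≈ 0.433

The same toy numbers give different likelihoods under the two schemes, which is the practical consequence of choosing where to marginalize: RAG-Token can mix evidence from several passages within one output, while RAG-Sequence commits to a single passage per candidate sequence.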