REALM: Retrieval-Augmented Language Model Pre-Training


10 Feb 2020 | Kelvin Guu*, Kenton Lee*, Zora Tung, Panupong Pasupat, Ming-Wei Chang (Google Research; * equal contribution)
The paper introduces REALM (Retrieval-Augmented Language Model Pre-Training), a framework that augments language model pre-training with a learned textual knowledge retriever. The retriever lets the model retrieve and attend to documents from a large corpus, such as Wikipedia, during pre-training, fine-tuning, and inference. The key innovation is pre-training the retriever in an unsupervised manner using a masked language modeling objective, backpropagating through a retrieval step that considers millions of documents. The paper describes the two components of the architecture, a neural knowledge retriever and a knowledge-augmented encoder, and addresses the resulting computational challenges, including efficient document selection via maximum inner product search (MIPS) and asynchronous refreshes of the document index. Experiments on Open-domain Question Answering (Open-QA) benchmarks show that REALM outperforms previous state-of-the-art models by 4-16% absolute accuracy, while also offering interpretability and modularity.
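
To make the retrieve-then-predict objective concrete, here is a minimal PyTorch sketch (not the authors' code) of the marginal likelihood REALM optimizes: p(y|x) = Σ_z p(y|z, x) p(z|x), where p(z|x) is a softmax over inner products between a query embedding and document embeddings. The names realm_log_likelihood, query_vec, doc_vecs, and reader_log_probs are hypothetical stand-ins for the paper's BERT-based components.

```python
# Sketch of REALM's marginal likelihood over retrieved documents.
# All function/variable names here are illustrative, not from the paper's code.
import torch
import torch.nn.functional as F

def realm_log_likelihood(query_vec, doc_vecs, reader_log_probs):
    """Marginal log-likelihood over the top-k retrieved documents.

    query_vec:        (d,)   embedding of the masked input x
    doc_vecs:         (k, d) embeddings of the k retrieved documents z
    reader_log_probs: (k,)   log p(y | z, x) from the knowledge-augmented
                             encoder, one entry per retrieved document
    """
    # Retrieval distribution p(z|x): softmax over inner-product scores.
    # This softmax is differentiable in the embedding parameters, which is
    # what allows backpropagation through the retrieval step.
    retrieval_log_probs = F.log_softmax(doc_vecs @ query_vec, dim=0)
    # log p(y|x) = log sum_z exp(log p(y|z,x) + log p(z|x))
    return torch.logsumexp(reader_log_probs + retrieval_log_probs, dim=0)

# Toy usage: 5 retrieved documents with 128-dim embeddings.
d, k = 128, 5
query = torch.randn(d, requires_grad=True)
docs = torch.randn(k, d, requires_grad=True)
reader = torch.log_softmax(torch.randn(k), dim=0)  # stand-in reader scores
loss = -realm_log_likelihood(query, docs, reader)
loss.backward()  # gradients flow into both retriever embeddings
```

In the full system, the sum runs only over the top-k documents found by MIPS against a precomputed index of document embeddings; because those embeddings drift as the retriever trains, REALM rebuilds the index asynchronously in the background rather than after every gradient step.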