The paper introduces REALM (Retrieval-Augmented Language Model Pre-Training), a framework that enhances language model pre-training by incorporating a learned textual knowledge retriever. The retriever lets the model retrieve and attend to documents from a large corpus, such as Wikipedia, during pre-training, fine-tuning, and inference. The key innovation is unsupervised pre-training of the retriever with a masked language modeling objective, backpropagating through a retrieval step that considers millions of documents. REALM's effectiveness is demonstrated on Open-domain Question Answering (Open-QA), where it outperforms state-of-the-art models by 4-16% absolute accuracy while also providing interpretability and modularity. The paper describes REALM's architecture, comprising a neural knowledge retriever and a knowledge-augmented encoder, and addresses computational challenges such as efficient document retrieval via Maximum Inner Product Search (MIPS) and asynchronous refreshes of the retrieval index. Experiments on several Open-QA benchmarks show that REALM significantly improves upon existing methods in both accuracy and interpretability.
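As a rough illustration of the retrieve-then-predict formulation summarized above, the sketch below scores documents by the inner product of query and document embeddings, normalizes the top-k scores into a retrieval distribution p(z|x), and marginalizes per-document answer distributions p(y|x, z) into p(y|x). This is a minimal toy under stated assumptions, not the paper's implementation: the embeddings, corpus, and answer distributions are random placeholders, and the real system trains the retriever by backpropagating through this marginalization and uses approximate MIPS over millions of documents.

```python
import numpy as np

def retrieve_topk(query_vec, doc_vecs, k=5):
    """Score documents by inner product with the query embedding
    (REALM's relevance score) and keep the top-k.
    Returns the top-k indices and a softmax distribution p(z|x) over them."""
    scores = doc_vecs @ query_vec              # inner-product relevance f(x, z)
    topk = np.argsort(-scores)[:k]             # exact top-k; a stand-in for approximate MIPS
    logits = scores[topk]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                       # p(z|x) restricted to the retrieved documents
    return topk, probs

def marginal_answer_dist(p_y_given_xz, p_z_given_x):
    """Marginalize over retrieved documents:
    p(y|x) = sum_z p(y|x, z) * p(z|x)."""
    return p_y_given_xz.T @ p_z_given_x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n_docs, n_answers, k = 128, 1000, 10, 5

    # Toy embeddings standing in for BERT-style query/document encoders (hypothetical values).
    doc_vecs = rng.normal(size=(n_docs, d))
    query_vec = rng.normal(size=d)

    topk, p_z = retrieve_topk(query_vec, doc_vecs, k=k)

    # Hypothetical per-document answer distributions p(y|x, z) that a
    # knowledge-augmented encoder would produce (random rows, normalized).
    p_y_given_xz = rng.random(size=(k, n_answers))
    p_y_given_xz /= p_y_given_xz.sum(axis=1, keepdims=True)

    p_y = marginal_answer_dist(p_y_given_xz, p_z)
    print("retrieved docs:", topk)
    print("p(y|x):", np.round(p_y, 3))
```

In the toy above, making the retrieval probabilities differentiable with respect to the embeddings is what would allow gradients from the answer loss to flow back into the retriever, which is the mechanism the summary refers to as backpropagation through the retrieval step.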