RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing

2024 | Yucheng Hu, Yuxing Lu
This survey provides a comprehensive overview of Retrieval-Augmented Language Models (RALMs), covering both Retrieval-Augmented Generation (RAG) and Retrieval-Augmented Understanding (RAU). It traces the paradigm, evolution, taxonomy, and applications of RALMs, with emphasis on their core components: retrievers, language models, and augmentations. The survey examines how these components interact to produce diverse model structures and applications across NLP tasks such as translation, dialogue systems, and knowledge-intensive applications. It also reviews evaluation methods for RALMs, highlighting robustness, accuracy, and relevance as the key assessment criteria; acknowledges current limitations, particularly in retrieval quality and computational efficiency; and outlines directions for future research. A companion GitHub repository collects the surveyed works and resources for further study.

The paper is organized into nine sections covering the definition of RALMs, retrieval methods, language models, RALM enhancements, and data sources. It distinguishes four retrieval techniques (sparse retrieval, dense retrieval, internet retrieval, and hybrid retrieval) and three families of language models (AutoEncoder, AutoRegressive, and Encoder-Decoder models). Methods for enhancing RALMs are grouped into retriever enhancement, language model enhancement, and overall enhancement, and data sources are categorized as structured or unstructured. Overall, the survey aims to provide structured insight into RALMs, their potential, and their future development in NLP.
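To make the retrieve-then-generate paradigm concrete, here is a minimal sketch of a RALM loop in Python. The toy corpus, the TF-IDF scorer (a stand-in for a sparse retriever such as BM25), and the `generate()` stub are illustrative assumptions, not an implementation from the survey.

```python
# Minimal sketch of the retrieve-then-generate (RAG) loop described in
# the survey. Corpus, scoring, and generate() are placeholders.
import math
from collections import Counter

CORPUS = [
    "Dense retrieval encodes queries and documents as vectors.",
    "Sparse retrieval scores documents by term overlap, as in BM25.",
    "Retrieval-augmented generation conditions a language model on retrieved text.",
]

def tf_idf_vectors(docs):
    """Build simple TF-IDF vectors (a stand-in for a sparse retriever)."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for doc in tokenized for term in set(doc))
    n = len(docs)
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        # Drop terms that occur in every document (zero IDF weight).
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf if df[t] < n})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, k=2):
    """Return the top-k corpus passages most similar to the query."""
    vectors = tf_idf_vectors(CORPUS + [query])
    qvec, dvecs = vectors[-1], vectors[:-1]
    ranked = sorted(range(len(CORPUS)),
                    key=lambda i: cosine(qvec, dvecs[i]), reverse=True)
    return [CORPUS[i] for i in ranked[:k]]

def generate(prompt):
    """Placeholder for any autoregressive LM; swap in a real model here."""
    return f"[LM would answer conditioned on]\n{prompt}"

query = "How does sparse retrieval work?"
context = "\n".join(retrieve(query))
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```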
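The hybrid retrieval category that the survey identifies combines scores from a sparse and a dense retriever. One common recipe (an assumption here, not prescribed by the survey) is min-max normalization of each retriever's scores followed by a weighted sum:

```python
# Hedged sketch of hybrid retrieval score fusion. The weight alpha and
# min-max normalization are common choices, not the survey's method.
def minmax(scores):
    """Rescale scores to [0, 1] so the two retrievers are comparable."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def hybrid_scores(sparse_scores, dense_scores, alpha=0.5):
    """Linearly combine per-document scores from both retrievers."""
    s, d = minmax(sparse_scores), minmax(dense_scores)
    return [alpha * si + (1 - alpha) * di for si, di in zip(s, d)]

# Example: three candidate documents scored by both retrievers.
print(hybrid_scores([12.0, 3.5, 8.1], [0.82, 0.40, 0.77], alpha=0.6))
```

Tuning `alpha` trades lexical precision (sparse) against semantic recall (dense) on a per-task basis.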