Retrieval-Augmented Generation for Large Language Models: A Survey


27 Mar 2024 | Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, Haofen Wang
The paper "Retrieval-Augmented Generation for Large Language Models: A Survey" by Yunfan Gao et al. provides a comprehensive review of Retrieval-Augmented Generation (RAG) techniques, which enhance large language models (LLMs) by incorporating external knowledge from databases. RAG addresses challenges such as hallucination, outdated knowledge, and opaque reasoning processes in LLMs. The paper outlines the progression of RAG paradigms, from Naive RAG through Advanced RAG to Modular RAG, and examines the core components: retrieval, generation, and augmentation.

- **Naive RAG**: This early approach involves indexing, retrieval, and generation. It faces issues such as poor precision and recall in retrieval, hallucination in generation, and integration challenges in augmentation.
- **Advanced RAG**: This stage improves retrieval quality through pre-retrieval and post-retrieval strategies, such as sliding-window indexing, fine-grained segmentation, and metadata integration. Post-retrieval processes include reranking and context compression.
- **Modular RAG**: This advanced paradigm offers greater flexibility with specialized modules for search, RAG-Fusion, memory, routing, and task adaptation. It supports both sequential and integrated end-to-end training, enhancing adaptability and versatility.

The paper also discusses evaluation methods, downstream tasks, datasets, and benchmarks for RAG, and highlights challenges and future directions in the field. It compares RAG with fine-tuning and prompt engineering, emphasizing the strengths and limitations of each method. Additionally, it explores optimization techniques in retrieval, such as data source selection, indexing optimization, query optimization, and embedding models, as well as methods for context curation and LLM fine-tuning in the generation phase.
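The Naive RAG pipeline summarized above (index, retrieve, generate) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the toy corpus, the bag-of-words similarity, and the `generate` stub are all assumptions standing in for a real embedding model and LLM.

```python
from collections import Counter
import math

def embed(text):
    """Toy embedding: a bag-of-words term-count vector (illustrative only)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, index, k=2):
    """Retrieval: return the top-k documents most similar to the query."""
    q = embed(query)
    return sorted(index, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def generate(query, context):
    """Stand-in for an LLM call: here we just format the augmented prompt."""
    return f"Answer '{query}' using context: {' | '.join(context)}"

# Indexing: in a real system these chunks would come from splitting documents.
corpus = [
    "RAG augments LLMs with retrieved external knowledge.",
    "Fine-tuning updates model weights on task data.",
    "Vector databases store document embeddings for retrieval.",
]
docs = retrieve("How does RAG use external knowledge?", corpus)
print(generate("How does RAG use external knowledge?", docs))
```

A production system would replace `embed` with a dense embedding model and `retrieve` with a vector-database lookup, but the index/retrieve/generate structure is the same.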
Finally, it delves into augmentation processes such as iterative retrieval, recursive retrieval, and adaptive retrieval, which enhance the robustness and efficiency of RAG systems.
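The post-retrieval steps attributed to Advanced RAG, reranking and context compression, can be illustrated with a toy scorer. The word-overlap heuristic and the token budget below are illustrative assumptions; real systems typically rerank with a cross-encoder and compress against the LLM's context window.

```python
def rerank(query, docs, score):
    """Reranking: reorder retrieved docs by a (stronger) relevance scorer."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)

def compress(docs, budget):
    """Context compression: keep top-ranked docs until a word budget is hit."""
    kept, used = [], 0
    for d in docs:
        words = len(d.split())
        if used + words > budget:
            break
        kept.append(d)
        used += words
    return kept

# Toy scorer: word overlap with the query (a cross-encoder in practice).
def overlap(query, doc):
    return len(set(query.lower().split()) & set(doc.lower().split()))

docs = [
    "Vector stores index embeddings.",
    "Reranking reorders retrieved passages by relevance.",
    "Compression trims context to fit the LLM window.",
]
ranked = rerank("how does reranking order retrieved passages", docs, overlap)
print(compress(ranked, budget=12))
```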
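Iterative retrieval, one of the augmentation processes the survey covers, alternates between retrieval and generation so that each draft answer can seed the next retrieval round. A minimal sketch follows; the knowledge base and the `toy_retrieve`/`toy_generate` stubs are illustrative assumptions, not the paper's method.

```python
def iterative_rag(query, retrieve, generate, max_rounds=3):
    """Iterative retrieval: feed each intermediate answer back as a new query."""
    context, answer = [], query
    for _ in range(max_rounds):
        docs = retrieve(answer)              # retrieve with the latest formulation
        new = [d for d in docs if d not in context]
        if not new:                          # nothing new found: stop early
            break
        context.extend(new)
        answer = generate(query, context)    # regenerate with enriched context
    return answer, context

# Toy stubs standing in for a real retriever and LLM (illustrative only).
kb = {
    "rag": ["RAG combines retrieval with generation."],
    "retrieval": ["Retrieval fetches relevant documents."],
}

def toy_retrieve(q):
    return [doc for key, docs in kb.items() if key in q.lower() for doc in docs]

def toy_generate(q, ctx):
    return f"{q} -> based on {len(ctx)} passage(s)"

answer, context = iterative_rag("What is RAG retrieval?", toy_retrieve, toy_generate)
print(answer)
```

The early-stop check when no new documents arrive is one simple termination criterion; adaptive retrieval variants instead let the model itself decide when further retrieval is needed.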