The Chronicles of RAG: The Retriever, The Chunk and The Generator

2024-01-15 | Paulo Finardi, Leonardo Avila, Rodrigo Castaldoni, Pedro Gengo, Celio Larcher
This paper presents a comprehensive study of Retrieval Augmented Generation (RAG) for Brazilian Portuguese, covering the implementation, optimization, and evaluation of RAG systems. The research explores sparse and dense retrievers and evaluates how well they support question answering on a dataset derived from the first Harry Potter book. The study also investigates chunking strategies, such as naive and sentence window, to optimize how retrieved information is integrated into the generation process. Additionally, the paper compares how well different large language models (LLMs), including GPT-4 and Gemini Pro, incorporate retrieved information and produce coherent, contextually accurate responses. The research introduces a new metric, the relative maximum score, which quantifies the gap between a given RAG approach and a perfect RAG system. The study demonstrates that a high-quality retriever can significantly improve RAG performance, achieving a 35.4% improvement in MRR@10 compared to the baseline. Furthermore, optimizing the input size in the application can enhance performance by 2.4%. The paper also presents a complete architecture of the RAG system with recommendations for implementation and optimization. The study highlights the importance of data quality, retrieval effectiveness, and evaluation metrics in achieving high performance in RAG systems. It emphasizes the need to consider input size, the position of the answer within the prompt, and the choice of retrieval strategy to ensure accurate and relevant responses. The research concludes that the best RAG configuration achieved a final accuracy of 98.61%, representing a significant improvement over the baseline.
The paper also discusses future work, including the expansion of the study to additional datasets and the exploration of techniques related to segmentation and chunk construction.
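
The retriever comparison above is reported in MRR@10 (mean reciprocal rank over the top 10 results). As a rough illustration of how that metric is computed, here is a minimal sketch in Python. It is not the authors' evaluation code: the function name `mrr_at_k`, the chunk identifiers, and the assumption of exactly one relevant chunk per question are all hypothetical choices made for this example.

```python
from typing import Sequence


def mrr_at_k(
    rankings: Sequence[Sequence[str]],
    gold_ids: Sequence[str],
    k: int = 10,
) -> float:
    """Mean reciprocal rank, counting only hits within the top k results.

    Assumes (for illustration) one relevant chunk id per question.
    """
    if len(rankings) != len(gold_ids):
        raise ValueError("rankings and gold_ids must have the same length")
    if not gold_ids:
        return 0.0

    total = 0.0
    for ranking, gold in zip(rankings, gold_ids):
        # Ranks are 1-based; a question whose gold chunk is missing
        # from the top k contributes 0 to the sum.
        for rank, chunk_id in enumerate(ranking[:k], start=1):
            if chunk_id == gold:
                total += 1.0 / rank
                break
    return total / len(gold_ids)


if __name__ == "__main__":
    # Hypothetical retriever output for three questions.
    rankings = [
        ["c1", "c2"],        # gold chunk at rank 1 -> 1.0
        ["c9", "c4", "c7"],  # gold chunk at rank 3 -> 1/3
        ["c5"],              # gold chunk not retrieved -> 0.0
    ]
    gold_ids = ["c1", "c7", "c0"]
    print(mrr_at_k(rankings, gold_ids, k=10))
```

With the toy data above, the three reciprocal ranks are 1, 1/3, and 0, so the mean is 4/9 (about 0.44). A relative improvement such as the 35.4% quoted in the summary would then be the ratio of two such scores, one per retriever, minus one.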