7 Jun 2024 | Maciej Besta, Ales Kubicek, Roman Niggli, Robert Gerstenberger, Lucas Weitzendorf, Mingyuan Chi, Patrick Iff, Joanna Gajda, Piotr Nyczyk, Jürgen Müller, Hubert Niewiadomski, Marcin Chrapek, Michal Podstawski, Torsten Hoefler
The paper introduces Multi-Head RAG (MRAG), an approach that improves retrieval for Large Language Models (LLMs) by building embeddings from the activations of the multi-head attention layer of decoder models. It targets complex queries that require retrieving multiple documents with substantially different contents. Existing RAG solutions struggle with such queries because the embeddings of these documents can lie far apart in the embedding space. MRAG uses the activations of different attention heads, which capture distinct aspects of the data, to create embeddings that better represent the multifaceted nature of data items and queries. For complex, multi-aspect queries, this improves retrieval accuracy by up to 20% compared to standard RAG baselines. The paper also presents an evaluation methodology, synthetic datasets, and real-world use cases to demonstrate MRAG's effectiveness. MRAG can be integrated with existing RAG frameworks and benchmarking tools.
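The core idea, one embedding per attention head instead of a single embedding per item, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each query and document is already represented by a flat activation vector that is the concatenation of equally sized per-head slices. The function names `split_heads` and `multi_head_retrieve`, the toy vectors, and the rank-based voting used to merge per-head results are all hypothetical choices made for illustration; the summary above does not specify which layer or token the activations come from or how per-head results are combined.

```python
import math


def split_heads(activation, num_heads):
    """Split one flat activation vector into num_heads equal slices."""
    if len(activation) % num_heads != 0:
        raise ValueError("activation length must be divisible by num_heads")
    dim = len(activation) // num_heads
    return [activation[i * dim:(i + 1) * dim] for i in range(num_heads)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def multi_head_retrieve(query_act, doc_acts, num_heads, top_k):
    """Rank documents separately in each head's embedding space, then merge.

    Merging uses simple rank-based voting (an assumption of this sketch):
    within each head, the best document gets top_k points, the next
    top_k - 1, and so on down to 1 point.
    """
    query_heads = split_heads(query_act, num_heads)
    doc_heads = {d: split_heads(a, num_heads) for d, a in doc_acts.items()}
    votes = {d: 0 for d in doc_acts}
    for h in range(num_heads):
        ranked = sorted(
            doc_acts,
            key=lambda d: cosine(query_heads[h], doc_heads[d][h]),
            reverse=True,
        )
        for rank, doc_id in enumerate(ranked[:top_k]):
            votes[doc_id] += top_k - rank
    return sorted(votes, key=lambda d: votes[d], reverse=True)[:top_k]


# Toy data: 2 heads with 2 dimensions each (made-up values).
query = [1.0, 0.0, 0.0, 1.0]
docs = {
    "car": [1.0, 0.0, 1.0, 0.0],       # matches the query only in head 0
    "battle": [0.0, 1.0, 0.0, 1.0],    # matches the query only in head 1
    "weather": [-1.0, 0.0, 0.0, -1.0], # matches the query in neither head
}
print(multi_head_retrieve(query, docs, num_heads=2, top_k=2))
```

In this toy setup each of the two relevant documents matches the query in only one head's subspace, and the per-head rankings let both of them collect votes while the unrelated document collects none.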