LongRAG is a framework that enhances retrieval-augmented generation (RAG) by exploiting long-context large language models (LLMs). Traditional RAG systems operate on short retrieval units (e.g., 100-word passages), an imbalanced design in which the retriever is overburdened while the reader is underutilized. LongRAG rebalances the two components by introducing a "long retriever" and a "long reader."

The long retriever groups the entire Wikipedia corpus into roughly 4K-token retrieval units, shrinking the corpus from 22M short passages to about 600K long units. This lighter retrieval workload improves retrieval quality, yielding 71% answer recall@1 on Natural Questions (NQ) and 72% answer recall@2 on HotpotQA (a sketch of the grouping step appears below). The long reader then concatenates the top retrieval units and extracts the answer zero-shot with an existing long-context LLM, requiring no additional training (a sketch of the reader also appears below). This reaches 62.7% exact match (EM) on NQ and 64.3% EM on HotpotQA, comparable to fully trained state-of-the-art models.

Ablation studies show that longer retrieval units improve end-to-end performance and reduce the impact of hard negatives. Taken together, the results suggest that modern RAG systems should reconsider retrieval-unit granularity to fully exploit the capabilities of long-context LLMs. Limitations include the reliance on long-context embedding models and on black-box LLMs as readers; stronger long embedding models and more general document-grouping methods could further improve LongRAG's performance.
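To make the retriever-side idea concrete, here is a minimal sketch of the grouping step: pack documents into retrieval units of at most ~4K tokens. This is an illustration only; the names (`count_tokens`, `group_into_units`) are hypothetical, the token counter is a crude stand-in for a real tokenizer, and LongRAG's own pipeline uses a more informed grouping of related Wikipedia documents (e.g., via hyperlink structure) rather than greedy packing.

```python
# Hypothetical sketch of the long-retriever grouping step: pack documents
# into retrieval units of at most ~4K tokens. The names and the greedy
# strategy are illustrative assumptions, not LongRAG's implementation.

def count_tokens(text: str) -> int:
    # Crude whitespace proxy; a real pipeline would use the LLM's tokenizer.
    return len(text.split())

def group_into_units(documents: list[str], max_tokens: int = 4096) -> list[str]:
    """Greedily pack consecutive documents into units of <= max_tokens."""
    units: list[str] = []
    current: list[str] = []
    current_len = 0
    for doc in documents:
        doc_len = count_tokens(doc)
        # Close the current unit if this document would overflow the budget.
        if current and current_len + doc_len > max_tokens:
            units.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(doc)
        current_len += doc_len
    if current:
        units.append("\n\n".join(current))
    return units
```

Packing whole documents into ~4K-token units is what collapses the corpus into far fewer, more self-contained retrieval targets, which is the source of the reduced retriever workload described above.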
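The reader side can be sketched just as simply: concatenate the top-k retrieved units into one long context and prompt a long-context LLM zero-shot. Here `llm_complete` is a hypothetical stub standing in for any long-context completion API, and the prompt wording is an assumption, not the paper's exact template.

```python
# Hypothetical sketch of the long reader: concatenate the top-k retrieved
# units and query a long-context LLM zero-shot. `llm_complete` is a
# stand-in for any long-context completion endpoint, not LongRAG's code.

def llm_complete(prompt: str) -> str:
    """Stub; in practice, call a long-context model endpoint here."""
    raise NotImplementedError

def long_reader(question: str, retrieved_units: list[str], k: int = 2) -> str:
    # A few units of ~4K tokens each fit comfortably in a long context window.
    context = "\n\n".join(retrieved_units[:k])
    prompt = (
        "Answer the question based only on the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
    return llm_complete(prompt)
```

Because the reader is an off-the-shelf long-context LLM used zero-shot, the EM figures reported above are obtained without any reader fine-tuning.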