26 May 2024 | Yuntong Hu, Zhihan Lei, Zheng Zhang, Bo Pan, Chen Ling, Liang Zhao
Graph Retrieval-Augmented Generation (GRAG) is a novel approach that enhances the accuracy and relevance of responses generated by large language models (LLMs) in graph-based contexts. Unlike traditional Retrieval-Augmented Generation (RAG) methods, which focus solely on text-based entity retrieval, GRAG emphasizes the importance of subgraph structures, maintaining awareness of both textual and topological information.
The GRAG approach consists of four main stages: indexing of $k$-hop ego-graphs, graph retrieval, soft pruning to mitigate the impact of irrelevant entities, and generation with pruned textual subgraphs. This method efficiently identifies relevant subgraph structures while avoiding exhaustive subgraph search, which is computationally infeasible (NP-hard). Additionally, a novel prompting strategy is proposed to achieve lossless conversion from textual subgraphs to hierarchical text descriptions. Extensive experiments on graph multi-hop reasoning benchmarks demonstrate that GRAG significantly outperforms current state-of-the-art RAG methods, effectively mitigating hallucinations and reducing training costs. The main contributions of this work include formulating the problem of GRAG, proposing an efficient computational framework, and providing a novel prompting method to convert textual subgraphs into hierarchical text descriptions.
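The four stages above can be illustrated with a minimal, self-contained sketch. Everything here is an assumption for illustration: the bag-of-words embedding stands in for the paper's learned encoders, and the hard similarity threshold stands in for the learned soft-pruning mask; none of these names come from the authors' implementation.

```python
# Hypothetical sketch of the four GRAG stages: (1) index k-hop ego-graphs,
# (2) retrieve the ego-graphs most similar to the query, (3) soft-prune
# irrelevant nodes, (4) hand pruned textual subgraphs to a generator.
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words embedding; a stand-in for a learned text/graph encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def ego_graph(graph, center, k):
    # Stage 1 (indexing): collect the k-hop ego-graph around `center`.
    frontier, nodes = {center}, {center}
    for _ in range(k):
        frontier = {m for n in frontier for m in graph.get(n, ())} - nodes
        nodes |= frontier
    return nodes

def grag_retrieve(graph, texts, query, k=1, top_n=2, tau=0.1):
    # Stage 2 (retrieval): rank every ego-graph by similarity to the query.
    q = embed(query)
    scored = []
    for center in graph:
        nodes = ego_graph(graph, center, k)
        g_emb = embed(" ".join(texts[n] for n in nodes))
        scored.append((cosine(q, g_emb), center, nodes))
    scored.sort(key=lambda t: t[0], reverse=True)
    results = []
    for _score, center, nodes in scored[:top_n]:
        # Stage 3 (soft pruning): drop nodes irrelevant to the query; this
        # hard threshold is a crude stand-in for a learned soft mask.
        kept = {n for n in nodes if cosine(q, embed(texts[n])) > tau}
        results.append((center, kept or {center}))
    # Stage 4 (generation) would render these pruned textual subgraphs as
    # hierarchical text and feed them to an LLM; omitted here.
    return results
```

On a toy graph such as `{"a": ["b"], "b": ["a", "c"], "c": ["b"]}` with node texts, `grag_retrieve` returns the top-ranked ego-graph centers with off-topic nodes pruned out.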
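The conversion from textual subgraphs to hierarchical text descriptions can also be sketched. The indentation-per-hop format below is purely an illustrative assumption about what such a hierarchical description might look like, not the paper's actual prompting strategy.

```python
# Hypothetical renderer: walk outward from the ego-graph center, indenting
# one level per hop so the subgraph topology survives in the prompt text.
def describe_subgraph(center, graph, texts, k=2, _seen=None, _depth=0):
    seen = _seen if _seen is not None else {center}
    lines = ["  " * _depth + f"- {center}: {texts[center]}"]
    if _depth < k:
        for nb in graph.get(center, ()):
            if nb not in seen:
                seen.add(nb)
                lines.extend(
                    describe_subgraph(nb, graph, texts, k, seen, _depth + 1)
                )
    return lines
```

For a star graph centered on `a`, this yields an indented outline with `a` at the top level and each neighbor nested one level deeper, which an LLM can read as plain text without losing the hop structure.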