August 25-29, 2024 | Fanjin Zhang, Shijie Shi, Yifan Zhu, Bo Chen, Yukuo Cen, Jifan Yu, Yelin Chen, Lulu Wang, Qingfei Zhao, Tianyi Han, Yuwei An, Dan Zhang, Weng Lam Tam, Kun Cao, Yunhe Pang, Xinyu Guan, Huihui Yuan, Jian Song, Xiaoyan Li, Yuxiao Dong, Jie Tang
OAG-Bench is a comprehensive, multi-aspect, fine-grained, human-curated benchmark for academic graph mining based on the Open Academic Graph (OAG). It covers 10 tasks, 20 datasets, 70+ baselines, and 120+ experimental results, and provides data pre-processing code, algorithm implementations, and standardized evaluation protocols. It also introduces the Open Academic Graph Challenge (OAG-Challenge) to encourage community input and sharing.

The tasks span academic entity construction, academic graph completion, academic knowledge acquisition, academic trace and prediction, and academic question answering, with datasets for author name disambiguation, scholar profiling, entity tagging, paper recommendation, reviewer recommendation, academic question answering, paper source tracing, and paper influence prediction. The benchmark evaluates graph-based methods, LLM-based methods, and traditional machine learning methods on these tasks; the results show that even advanced LLMs struggle with certain tasks, such as paper source tracing and scholar profiling. By offering a comprehensive, high-quality dataset and a common ground for evaluating and comparing algorithms, OAG-Bench aims to accelerate algorithm development and the study of foundation models for academic graph mining. OAG-Bench is accessible at https://www.aminer.cn/data/.
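As a concrete illustration of how a task like author name disambiguation is typically scored, the sketch below computes pairwise F1, a standard clustering metric for this task. This is an assumption about the evaluation protocol for illustration only; the abstract does not specify which metrics OAG-Bench uses, and the function and cluster format here are hypothetical.

```python
from itertools import combinations

def pairwise_f1(pred_clusters, true_clusters):
    """Pairwise F1 for name disambiguation (a common metric; assumed
    here, not confirmed by the OAG-Bench abstract).

    Each cluster is a collection of paper IDs attributed to one author.
    Every pair of papers placed in the same cluster counts as a
    positive pair; predictions are compared against ground truth.
    """
    def pairs(clusters):
        s = set()
        for cluster in clusters:
            # all unordered paper pairs within one author cluster
            s.update(combinations(sorted(cluster), 2))
        return s

    pred_pairs = pairs(pred_clusters)
    true_pairs = pairs(true_clusters)
    if not pred_pairs or not true_pairs:
        return 0.0
    tp = len(pred_pairs & true_pairs)
    precision = tp / len(pred_pairs)
    recall = tp / len(true_pairs)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, predicting `[["p1", "p2"], ["p3"]]` when the ground truth merges all three papers under one author yields a pairwise F1 of 0.5 (perfect precision on the one predicted pair, but only one of three true pairs recovered).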