28 Dec 2022 | Vijay Prakash Dwivedi, Chaitanya K. Joshi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio, Xavier Bresson
This paper introduces an open-source benchmarking framework for Graph Neural Networks (GNNs) that is modular, easy to use, and can be leveraged to quickly yet robustly test new GNN ideas and surface insights that direct further research. The framework includes a diverse collection of mathematical and real-world graphs, enables fair model comparison under the same parameter budget, and provides an open-source, easy-to-use, and reproducible code infrastructure. As of December 2022, the GitHub repository has reached 2,000 stars and 380 forks, reflecting wide adoption by the GNN community. The framework has been used to explore new GNN designs and insights, including the introduction of graph positional encoding (PE) based on Laplacian eigenvectors, which has spurred interest in exploring more powerful PE for Transformers and GNNs in a robust experimental setting.
The framework includes 12 graph datasets, collected from real-world sources or generated from mathematical models. They are of medium scale, suitable for academic research, cover the three fundamental learning tasks at the graph level, node level, and edge level, and span diverse end-application domains. The datasets are designed to statistically separate the performance of GNNs on specific graph properties, serving the academic mission of identifying first principles.
The benchmarking framework is built upon the PyTorch and DGL libraries and was developed with three objectives: ease of use and modularity; experimental rigour and fairness; and being future-proof and comprehensive, so that progress on graph ML tasks and new GNNs can be tracked over time. It unifies independent components for data pipelines, GNN layers and models, training and evaluation functions, network and hyperparameter configurations, and scripts for reproducibility, as sketched below.
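As a rough illustration of this modularity, the sketch below mimics the kind of JSON-driven experiment configuration and driver loop the repository uses. The key names (`params`, `net_params`, etc.) and values are reproduced from memory and should be treated as illustrative assumptions rather than the framework's exact API.

```python
# Sketch of an experiment configuration in the style of the framework's JSON configs.
# Key names and values are illustrative, not copied from the repository.
config = {
    "model": "GatedGCN",
    "dataset": "ZINC",
    "params": {"seed": 41, "epochs": 1000, "init_lr": 1e-3, "batch_size": 128},
    "net_params": {"L": 4, "hidden_dim": 70, "out_dim": 70,
                   "residual": True, "readout": "mean", "pos_enc_dim": 8},
}

def run_experiment(config, load_data, build_model, train, evaluate):
    """Hypothetical driver showing how the independent components plug together:
    data pipeline -> model -> training -> evaluation, each swappable on its own."""
    dataset = load_data(config["dataset"])
    model = build_model(config["model"], config["net_params"])
    train(model, dataset, config["params"])
    return evaluate(model, dataset)
```

Because each stage is passed in as a separate callable, a new dataset, GNN layer, or training routine can be swapped in without touching the rest of the pipeline, which is the design point the framework emphasizes.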
The framework uses two model parameter budgets for fair comparison: roughly 100k parameters per GNN across all datasets, and 500k parameters for experiments that investigate how a model scales to more parameters and deeper layers. The number of layers and the hidden dimensions are chosen to match these budgets.
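A minimal PyTorch sketch of how such a budget can be enforced in practice; `count_params`, `pick_hidden_dim`, and the toy model are illustrative helpers, not the framework's actual tuning code.

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def pick_hidden_dim(make_model, budget=100_000, candidates=range(8, 257, 8)):
    """Return the largest hidden dimension whose model fits within the budget.
    `make_model(hidden_dim)` is any constructor returning an nn.Module."""
    best = None
    for dim in candidates:
        if count_params(make_model(dim)) <= budget:
            best = dim
    return best

# Toy stand-in for a GNN with a fixed number of layers and 32-dim node features.
make_model = lambda d: nn.Sequential(nn.Linear(32, d), nn.ReLU(),
                                     nn.Linear(d, d), nn.ReLU(),
                                     nn.Linear(d, 1))
print(pick_hidden_dim(make_model))  # hidden dim keeping the model under ~100k params
```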
The framework has enabled researchers to explore new ideas at any stage of the pipeline without having to set up everything else from scratch. It has also been used to validate and quantify the improvement from using Laplacian eigenvectors as node positional encodings, which yields gains on several datasets, including the newly added AQSOL dataset.
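For concreteness, here is a minimal NumPy/SciPy sketch of Laplacian positional encodings in the spirit of the paper: each node receives the corresponding entries of the k non-trivial eigenvectors of the symmetric normalized Laplacian. The function name and implementation details are illustrative, not the repository's exact code.

```python
import numpy as np
import scipy.sparse as sp

def laplacian_pe(adj: sp.spmatrix, k: int) -> np.ndarray:
    """Laplacian positional encodings: the k non-trivial eigenvectors of the
    symmetric normalized graph Laplacian, one k-dim vector per node."""
    n = adj.shape[0]
    deg = np.asarray(adj.sum(axis=1)).flatten()
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))   # guard against isolated nodes
    D_inv_sqrt = sp.diags(d_inv_sqrt)
    L = sp.eye(n) - D_inv_sqrt @ adj @ D_inv_sqrt        # L = I - D^{-1/2} A D^{-1/2}
    eigvals, eigvecs = np.linalg.eigh(L.toarray())       # dense solve is fine for medium graphs
    order = np.argsort(eigvals)
    # Drop the trivial eigenvector (eigenvalue ~0) and keep the next k as the PE.
    return eigvecs[:, order[1:k + 1]]
```

The resulting (num_nodes, k) matrix is typically concatenated with, or added to, the input node features; since eigenvector signs are arbitrary, the paper's setup randomly flips them during training.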
The framework has also supported additional studies on different GNN categories and on edge representations for link prediction, and it has proven effective at identifying first principles and steering GNN research. Benchmarking GNNs in this setting makes it possible to identify and quantify which architectures, first principles, and mechanisms remain universal, generalizable, and scalable when moving to larger and more challenging datasets. The framework provides a strong paradigm for answering these fundamental questions and has proved beneficial for driving progress, identifying essential ideas, and solving open problems in graph machine learning.
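As a hedged illustration of the edge-representation choices studied for link prediction, the sketch below scores a candidate edge either by a dot product of the endpoint embeddings or by an MLP over their concatenation. The class and mode names are illustrative, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class EdgePredictor(nn.Module):
    """Score a candidate edge (i, j) from GNN node embeddings h_i, h_j.
    Two common edge representations for link prediction:
      - 'dot':    inner product of the two endpoint embeddings;
      - 'concat': concatenate the embeddings and pass them through an MLP."""
    def __init__(self, hidden_dim: int, mode: str = "concat"):
        super().__init__()
        self.mode = mode
        if mode == "concat":
            self.mlp = nn.Sequential(
                nn.Linear(2 * hidden_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )

    def forward(self, h, src, dst):
        h_src, h_dst = h[src], h[dst]            # embeddings of the two endpoints
        if self.mode == "dot":
            return (h_src * h_dst).sum(dim=-1)   # unnormalized link score
        return self.mlp(torch.cat([h_src, h_dst], dim=-1)).squeeze(-1)
```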