2024 | Jiabin Tang, Yuhao Yang, Wei Wei, Lei Shi, Long Xia, Dawei Yin and Chao Huang
HiGPT is a general large graph model designed to learn from arbitrary heterogeneous graphs without fine-tuning on downstream datasets. It addresses the challenge of generalizing heterogeneous graph models across diverse downstream learning tasks, where distribution shifts arise in both node token sets and relation-type heterogeneity. The model introduces an in-context heterogeneous graph tokenizer that captures semantic relationships across different graphs, enabling seamless adaptation (a sketch of this idea is given below). It is trained on a large corpus of heterogeneity-aware graph instructions that help it understand complex relation heterogeneity and distinguish between different graph tokens. A Mixture-of-Thought (MoT) instruction augmentation paradigm further mitigates data scarcity by generating diverse and informative instructions. In comprehensive evaluations across multiple datasets, HiGPT surpasses leading baselines in both supervised and zero-shot settings, demonstrating strong generalization to new heterogeneous graph learning tasks without extensive fine-tuning. Overall, the approach combines the strengths of large language models and heterogeneous graph learning to achieve superior performance on graph-related tasks.
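To make the tokenizer idea concrete, here is a minimal sketch (not the authors' code) of an in-context heterogeneous graph tokenizer: node features of arbitrary types are projected into a shared "graph token" space, with each type's projection generated from a language embedding of that type's textual description, so unseen node or relation types can be handled in context. All names here (`TypeConditionedTokenizer`, `text_dim`, `token_dim`) are illustrative assumptions, not HiGPT's actual API.

```python
import torch
import torch.nn as nn

class TypeConditionedTokenizer(nn.Module):
    def __init__(self, feat_dim: int, text_dim: int, token_dim: int):
        super().__init__()
        # Hyper-network: turns a type-description embedding into the
        # weights of a feat_dim -> token_dim projection for that type.
        self.hyper = nn.Linear(text_dim, feat_dim * token_dim)
        self.feat_dim, self.token_dim = feat_dim, token_dim

    def forward(self, x: torch.Tensor, type_text_emb: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, feat_dim] raw features of one node type
        # type_text_emb: [text_dim] embedding of the type's description,
        # e.g., produced by a frozen text encoder (assumption)
        w = self.hyper(type_text_emb).view(self.feat_dim, self.token_dim)
        return x @ w  # [num_nodes, token_dim] tokens in a shared space

# Usage: tokenize two node types of a toy academic graph.
tok = TypeConditionedTokenizer(feat_dim=16, text_dim=32, token_dim=8)
papers, authors = torch.randn(5, 16), torch.randn(3, 16)
emb_paper, emb_author = torch.randn(32), torch.randn(32)  # stand-in text embeddings
paper_tokens = tok(papers, emb_paper)     # [5, 8]
author_tokens = tok(authors, emb_author)  # [3, 8]
```

Because the projection weights are produced from the type description rather than stored per type, the same module can tokenize node types it never saw during training, which is the property the paper's in-context adaptation relies on.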