Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching

2024 | Yuchen Zhang, Tianle Zhang, Kai Wang, Ziyao Guo, Yuxuan Liang, Xavier Bresson, Wei Jin, Yang You
This paper proposes GEOM, a method for lossless graph condensation via expanding window matching. The goal is to shrink a large-scale graph dataset into a small condensed graph such that Graph Neural Networks (GNNs) trained on the condensed graph match the performance of GNNs trained on the original. Existing methods fall short of lossless condensation because they optimize the condensed graph against biased supervision signals drawn from the original graph.

To close this gap, the authors first investigate why condensation methods that succeed on small-scale graphs degrade on large-scale ones. They then (1) train expert trajectories with a curriculum learning strategy so that the trajectories carry more diverse supervision signals from the original graph, (2) transfer information from those trajectories to the condensed graph via expanding window matching, which gradually widens the range of expert checkpoints the condensed graph is matched against, and (3) design a loss function that better extracts the knowledge embedded in the expert trajectories. A theoretical analysis justifies the design.

Extensive experiments verify the method's effectiveness across datasets: GEOM achieves lossless condensation on Citeseer, Cora, Ogbn-arxiv, Flickr, and Reddit, with significant gains over existing methods. It also generalizes well across GNN architectures, achieving lossless performance in 20 out of 35 cross-architecture experiments. The paper additionally reviews related work on dataset distillation and graph condensation, highlighting the importance of accurate supervision signals and the challenges of achieving lossless condensation. The authors conclude that their method provides a promising approach to efficient graph condensation with minimal performance loss.
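To make the expanding-window mechanism concrete, below is a minimal, self-contained Python/PyTorch sketch. It is not the authors' implementation: the linear expansion schedule, the function names, and the toy demo are illustrative assumptions. The sketch shows the normalized parameter-matching objective typical of trajectory-matching distillation, plus a checkpoint sampler whose upper bound grows over the course of condensation, so early iterations match early (easy) expert segments and later iterations also cover late (hard) ones.

```python
import random
import torch

def matching_loss(student_params, expert_start, expert_target):
    """Trajectory-matching objective: distance between the student's final
    parameters and the expert's target checkpoint, normalized by how far
    the expert itself moved over the matched segment."""
    num = sum(((s - t) ** 2).sum() for s, t in zip(student_params, expert_target))
    den = sum(((e - t) ** 2).sum() for e, t in zip(expert_start, expert_target))
    return num / (den + 1e-12)

def sample_window_start(step, total_steps, init_upper, max_epoch):
    """Expanding window: the upper bound of the checkpoint-sampling window
    grows (here: linearly, an assumed schedule) from init_upper toward
    max_epoch as condensation proceeds. A fixed window, by contrast, only
    ever exposes the condensed graph to one slice of the expert trajectory."""
    upper = min(max_epoch,
                init_upper + int((max_epoch - init_upper) * step / total_steps))
    return random.randint(0, max(upper - 1, 0))

if __name__ == "__main__":
    # Toy demonstration: watch the sampling window expand over condensation.
    for step in (0, 250, 500, 750, 999):
        t = sample_window_start(step, total_steps=1000,
                                init_upper=5, max_epoch=50)
        print(f"condensation step {step:4d}: matched expert epoch {t}")

    # Toy matching loss on random "checkpoints" (shapes are arbitrary).
    shapes = [(8, 4), (4,)]
    start = [torch.randn(s) for s in shapes]
    target = [torch.randn(s) for s in shapes]
    student = [p + 0.1 * torch.randn_like(p) for p in target]
    print("loss:", matching_loss(student, start, target).item())
```

In a full condensation loop, the sampled epoch would select an expert checkpoint, a student GNN initialized from that checkpoint would be trained for a few steps on the condensed graph, and the matching loss would be backpropagated into the condensed graph's features. The widening window reflects the paper's motivation: diverse supervision signals, from easy early-training patterns to harder late-training ones, are needed to reach lossless performance.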