Graph Out-of-Distribution Generalization via Causal Intervention


May 13-17, 2024 | Qitian Wu, Fan Nie, Chenxiao Yang, Tianyi Bao, Junchi Yan
Graph neural networks (GNNs) often fail to generalize when data distributions shift: their accuracy degrades on test distributions that differ from training. This paper introduces CANET, which leverages causal analysis to train robust GNNs for out-of-distribution (OOD) generalization. The key insight is that the OOD failure of GNNs stems from latent confounding bias caused by unobserved environments, which misleads the model into relying on environment-sensitive correlations between ego-graph features and node labels.

CANET introduces a new learning objective derived from causal inference that coordinates an environment estimator with a mixture-of-expert GNN predictor. The environment estimator infers pseudo-environment labels from input ego-graphs, partitioning nodes into clusters from different distributions; the GNN predictor then dynamically selects among its propagation networks according to these pseudo-environments. This intervention mitigates the confounding bias and lets the model learn environment-insensitive predictive relations, as illustrated by the sketch below.
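To make the two-component design concrete, here is a minimal PyTorch sketch, not the paper's actual implementation. All names and design details (`EnvEstimator`, `MoEGNNPredictor`, `num_envs`, one linear propagation step per expert, a row-normalized adjacency) are illustrative assumptions. In spirit, the intervention resembles a backdoor-style adjustment: predictions are mixed over inferred environments rather than conditioned on a single observed one.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EnvEstimator(nn.Module):
    """Hypothetical environment estimator: infers a pseudo-environment
    assignment per node from its (ego-graph) features."""
    def __init__(self, in_dim: int, num_envs: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, num_envs)
        )

    def forward(self, x: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        # Gumbel-Softmax gives a differentiable, approximately one-hot
        # environment assignment for each node.
        return F.gumbel_softmax(self.mlp(x), tau=tau, hard=True)

class MoEGNNPredictor(nn.Module):
    """Hypothetical mixture-of-expert predictor: one propagation branch
    per environment; pseudo-environment weights pick branches per node."""
    def __init__(self, in_dim: int, hid_dim: int, out_dim: int, num_envs: int):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(in_dim, hid_dim) for _ in range(num_envs)
        )
        self.out = nn.Linear(hid_dim, out_dim)

    def forward(self, x, adj_norm, env_weights):
        # One propagation step per expert: H_k = A_norm @ (X W_k).
        h = torch.stack([adj_norm @ e(x) for e in self.experts], dim=1)  # [N, K, hid]
        # Mix experts per node according to inferred environments.
        h = (env_weights.unsqueeze(-1) * h).sum(dim=1)                   # [N, hid]
        return self.out(F.relu(h))

# Toy usage on a random graph (all sizes arbitrary).
N, D, K, C = 100, 16, 3, 5
x = torch.randn(N, D)
adj = (torch.rand(N, N) < 0.05).float()
adj = ((adj + adj.T) > 0).float()  # symmetrize
adj_norm = adj / adj.sum(dim=1, keepdim=True).clamp(min=1.0)

estimator, predictor = EnvEstimator(D, K), MoEGNNPredictor(D, 32, C, K)
env = estimator(x, tau=0.5)           # [N, K] pseudo-environment assignments
logits = predictor(x, adj_norm, env)  # [N, C] node-class logits
```

With `hard=True`, each node routes through essentially one expert in the forward pass while gradients still flow to the estimator, which is what allows the estimator and predictor to be trained jointly end to end.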
Extensive experiments on six graph datasets with various types of distribution shifts, including temporal graphs, subgraphs, and dynamic graph snapshots, show that CANET significantly improves generalization, achieving accuracy gains of up to 27.4% over state-of-the-art models on graph OOD benchmarks. Ablation studies show that the regularization loss and the environment inference step are both crucial for generalization, and hyperparameter analysis indicates that moderate values of the Gumbel-Softmax temperature (τ) yield the best performance. Visualization studies further confirm that different branches of the mixture-of-expert architecture learn distinct patterns from the observed data, strengthening generalization to new environments. The approach thus provides a principled way to train robust GNNs under node-level distribution shifts without prior knowledge of environment labels.
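The temperature finding is easy to see directly. In the toy snippet below (the logits are made up for illustration; `torch.nn.functional.gumbel_softmax` is standard PyTorch), a very low τ produces near one-hot samples, which means hard routing but noisy, high-variance gradients, while a very high τ washes assignments toward uniform, so experts cannot specialize. A moderate τ balances the two, consistent with the reported hyperparameter analysis.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.tensor([[2.0, 1.0, 0.1]])  # made-up routing logits

# Low tau -> near one-hot; high tau -> near uniform.
for tau in (0.1, 1.0, 10.0):
    sample = F.gumbel_softmax(logits, tau=tau)
    print(f"tau={tau:>4}:", [round(v, 3) for v in sample.squeeze().tolist()])
```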