Graph Neural Network Explanations are Fragile


2024 | Jiate Li, Meng Pang, Yun Dong, Jinyuan Jia, Binghui Wang
The paper studies the robustness of explanations produced for Graph Neural Networks (GNNs) under adversarial attack, focusing on perturbation-based explainers. It formulates an attack in which an adversary perturbs the graph structure so that the GNN's prediction remains correct while the explainer's explanation changes substantially. The attack operates under a practical threat model: limited knowledge of the explainer, a small perturbation budget, and structural-similarity constraints on the perturbed graph. Two attack methods are proposed: a loss-based attack that scores edge importance by the loss change each edge flip induces, and a deduction-based attack that simulates the explainer's learning process. Extensive evaluations across datasets and explainers show that existing GNN explainers are fragile: small structural perturbations lead to substantial changes in explanations. The paper also discusses implications for real-world applications and suggests future directions, including the design of provably robust GNN explainers.
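To make the loss-based idea concrete, below is a minimal PyTorch sketch of that attack pattern. All names here (ToyGCN, loss_based_attack, budget) are hypothetical illustrations, not the paper's implementation: candidate edge flips are scored by the change they induce in a loss, and the top-scoring flips are applied within a small budget, subject to the constraint that the GNN's prediction on the target node does not change. As a simplification, the GNN's own prediction loss stands in for the explainer's loss that the paper uses.

```python
# Minimal sketch of a loss-based structural attack: score each
# candidate edge flip by the loss change it causes while the GNN's
# prediction stays fixed, then flip the top-scoring edges within a
# budget. Illustrative only; not the authors' algorithm.
import torch
import torch.nn.functional as F

class ToyGCN(torch.nn.Module):
    """Two-layer GCN over a dense adjacency matrix (toy model)."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.w1 = torch.nn.Linear(in_dim, hid_dim)
        self.w2 = torch.nn.Linear(hid_dim, n_classes)

    def forward(self, x, adj):
        # Symmetrically normalized adjacency with self-loops.
        a = adj + torch.eye(adj.size(0))
        d = a.sum(1).clamp(min=1).rsqrt()
        a = d.unsqueeze(1) * a * d.unsqueeze(0)
        h = F.relu(self.w1(a @ x))
        return self.w2(a @ h)

def loss_based_attack(model, x, adj, target_node, budget=2):
    """Greedily flip the edges whose flip most increases the loss,
    keeping the prediction on `target_node` unchanged (the attack's
    correctness constraint)."""
    model.eval()
    with torch.no_grad():
        pred = model(x, adj)[target_node].argmax()
        base = F.cross_entropy(model(x, adj)[[target_node]],
                               pred.unsqueeze(0))
    n = adj.size(0)
    scores = []
    for i in range(n):
        for j in range(i + 1, n):
            flipped = adj.clone()
            flipped[i, j] = flipped[j, i] = 1 - flipped[i, j]
            with torch.no_grad():
                logits = model(x, flipped)[[target_node]]
            # Discard flips that change the GNN's prediction.
            if logits.argmax() != pred:
                continue
            delta = F.cross_entropy(logits, pred.unsqueeze(0)) - base
            scores.append((delta.item(), i, j))
    scores.sort(reverse=True)
    perturbed = adj.clone()
    for _, i, j in scores[:budget]:
        perturbed[i, j] = perturbed[j, i] = 1 - perturbed[i, j]
    return perturbed

# Toy usage: 6 nodes on a ring, random features, 2-class head.
torch.manual_seed(0)
x = torch.randn(6, 4)
adj = torch.zeros(6, 6)
for i in range(6):
    adj[i, (i + 1) % 6] = adj[(i + 1) % 6, i] = 1.0
model = ToyGCN(4, 8, 2)
new_adj = loss_based_attack(model, x, adj, target_node=0, budget=2)
print((new_adj != adj).nonzero())  # which entries were flipped
```

In the paper's setting, the score would measure how much a flip perturbs the explainer's output (e.g., its edge-importance mask) rather than the model's loss; the budget and the prediction-preservation check mirror the threat model described above.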