19 Jun 2019 | Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Kristina Lerman, Hrayr Harutyunyan, Greg Ver Steeg, Aram Galstyan
MixHop is a new graph convolutional architecture that learns neighborhood mixing relationships, including difference operators, by repeatedly mixing feature representations of neighbors at various distances. Unlike existing methods, MixHop requires no additional memory or computational complexity, yet it outperforms challenging baselines. The authors also introduce a sparsity regularization that makes it possible to visualize how the network prioritizes neighborhood information across different graph datasets; analysis of the learned architectures reveals that the amount of neighborhood mixing varies per dataset.
MixHop is a graph convolutional layer that mixes powers of the adjacency matrix, allowing full linear mixing of neighborhood information at every message-passing step. The paper's contributions are twofold: it formalizes Delta Operators and their generalization, Neighborhood Mixing, as tools for analyzing the expressiveness of graph convolution models, and it proposes the MixHop layer itself, proving that it can learn this wider class of representations without increasing the memory footprint or computational complexity of previous GCN models.
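Concretely, a MixHop layer propagates the input features with several powers of the normalized adjacency matrix, applies a separate weight matrix per power, and concatenates the results. Below is a minimal NumPy sketch of that computation; the function names, the default power set {0, 1, 2}, the ReLU nonlinearity, and the toy graph are illustrative assumptions rather than the reference implementation.

```python
import numpy as np

def normalize_adjacency(A):
    # A_hat = D^{-1/2} (A + I) D^{-1/2}: symmetrically normalized
    # adjacency with self-loops, as used by GCN-style models.
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def mixhop_layer(A_hat, H, weights, powers=(0, 1, 2)):
    # One MixHop layer: propagate the input features with each adjacency
    # power j in `powers`, apply a separate weight matrix per power, and
    # concatenate the per-power outputs along the feature axis.
    outputs = []
    propagated = H                            # A_hat^0 @ H
    for j in range(max(powers) + 1):
        if j > 0:
            propagated = A_hat @ propagated   # A_hat^j @ H, built iteratively
        if j in powers:
            outputs.append(np.maximum(propagated @ weights[j], 0.0))  # ReLU
    return np.concatenate(outputs, axis=1)

# Tiny usage example on a 4-node path graph with random features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(4, 8))
W = {j: 0.1 * rng.normal(size=(8, 16)) for j in (0, 1, 2)}
out = mixhop_layer(normalize_adjacency(A), H, W)   # shape (4, 48)
```

Because each power has its own weights, a later layer that linearly combines the concatenated blocks can assign opposite signs to the first- and second-power parts, which is intuitively how the model recovers difference (delta) operators between neighborhoods at different distances.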
The model is evaluated on node classification tasks, where higher-order graph convolutions with neighborhood mixing outperform existing approaches on real semi-supervised learning problems. The ability to learn delta operators is particularly useful in graphs with low homophily, where the correlation between edges and labels is low. The learned architectures also differ from dataset to dataset, indicating that no single architecture is optimal for every graph. Using L2 group lasso regularization, MixHop learns a unique architecture tuned to each dataset and achieves state-of-the-art performance on several node classification benchmarks.
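As a rough illustration of how the per-dataset architecture can be read off from this regularizer, the sketch below applies a group lasso penalty in which each group is the block of weights tied to one adjacency power; groups whose norm is driven to zero correspond to powers the model learns to ignore. The grouping, the penalty strength, and the function name are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def group_lasso_penalty(weights_per_power, lam=1e-3):
    # Sum of per-group L2 (Frobenius) norms. Unlike a plain L2 penalty,
    # this drives entire groups of weights toward zero together, so whole
    # adjacency powers can be pruned from the learned architecture.
    # `lam` is an illustrative strength, not the value used in the paper.
    return lam * sum(np.linalg.norm(W) for W in weights_per_power.values())

# Example: inspecting which powers survive after training (hypothetical weights).
weights_per_power = {0: np.zeros((16, 4)),              # pruned away
                     1: np.random.randn(16, 4),         # kept
                     2: 0.01 * np.random.randn(16, 4)}  # nearly pruned
print({j: round(float(np.linalg.norm(W)), 3) for j, W in weights_per_power.items()})
```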