2024 | Miltiadis Kofinas, Boris Knyazev, Yan Zhang, Yunlu Chen, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, David W. Zhang
This paper proposes representing neural networks as computational graphs of their parameters, enabling the use of graph neural networks (GNNs) and transformers that preserve permutation symmetry. The approach allows a single model to learn from neural graphs with diverse architectures, including varying numbers of layers, hidden dimensions, and connectivities. The key contribution is a neural graph representation that captures both the parameters and the architecture of a network while remaining equivariant to its permutation symmetries. The representation is developed for multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs) and extended to heterogeneous architectures; it also admits probe features, which record neuron activations during a forward pass, and positional embeddings that maintain permutation symmetry. Implemented with GNNs and transformers, the method is evaluated on classification and style editing of implicit neural representations (INRs), predicting CNN generalization from parameters, and learning to optimize, consistently outperforming state-of-the-art methods and highlighting the effectiveness of leveraging graph structure for learning over neural network parameters.
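
To make the neural-graph idea concrete, below is a minimal sketch of how an MLP's parameters could be laid out as a graph, assuming the common convention of one node per neuron (with biases as node features) and one edge per weight (with the weight as the edge feature). The function name `mlp_to_neural_graph` and the exact featurization are illustrative assumptions, not the authors' implementation.

```python
import torch

def mlp_to_neural_graph(weights, biases):
    """Lay out MLP parameters as a graph (sketch; feature conventions assumed).

    weights: list of tensors, W_l with shape (d_out, d_in)
    biases:  list of tensors, b_l with shape (d_out,)
    Returns (node_features, edge_index, edge_features) in a PyG-style layout.
    """
    # One node per neuron, including the input layer; node features hold the
    # biases (input neurons have no bias, so they get zeros).
    layer_sizes = [weights[0].shape[1]] + [W.shape[0] for W in weights]
    offsets = [0]
    for size in layer_sizes:
        offsets.append(offsets[-1] + size)

    x = torch.zeros(offsets[-1], 1)
    for l, b in enumerate(biases):
        x[offsets[l + 1]:offsets[l + 2], 0] = b

    # One directed edge per weight, from a neuron in layer l to a neuron in
    # layer l + 1, carrying the scalar weight as its edge feature.
    src, dst, e = [], [], []
    for l, W in enumerate(weights):
        d_out, d_in = W.shape
        src.append(offsets[l] + torch.arange(d_in).repeat(d_out))
        dst.append(offsets[l + 1] + torch.arange(d_out).repeat_interleave(d_in))
        e.append(W.reshape(-1, 1))
    edge_index = torch.stack([torch.cat(src), torch.cat(dst)])
    edge_attr = torch.cat(e)
    return x, edge_index, edge_attr


# Example: a toy 2-3-1 MLP becomes a graph with 6 nodes and 9 weighted edges.
Ws = [torch.randn(3, 2), torch.randn(1, 3)]
bs = [torch.randn(3), torch.randn(1)]
x, edge_index, edge_attr = mlp_to_neural_graph(Ws, bs)
```

A permutation-equivariant GNN or transformer operating on such a graph then respects relabelings of hidden neurons by construction, which is the symmetry the paper exploits.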