23 Jul 2024 | Miltiadis Kofinas, Boris Knyazev, Yan Zhang, Yunlu Chen, Gertjan J. Burghouts, Efstratios Gavves, Cees G. M. Snoek, David W. Zhang
This paper introduces an approach to processing neural networks with graph neural networks (GNNs) and transformers that preserves the permutation symmetry of neural network parameters. The authors represent neural networks as computational graphs, where nodes correspond to neurons and edges to the connections between them. This representation lets a single model handle diverse network architectures, with varying numbers of layers, hidden dimensions, non-linearities, and connectivity patterns. The method is applied to tasks such as classifying implicit neural representations, generating neural network weights, and predicting generalization errors, and extensive experiments show significant improvements over state-of-the-art methods. The paper also discusses limitations and future directions, noting that further research is needed to extend the method to more complex architectures and tasks.
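To make the "neural network as computational graph" idea concrete, below is a minimal sketch of how an MLP's parameters could be turned into node and edge features that a GNN or graph transformer can consume. It assumes a plain fully connected MLP and a simple featurization (biases as node features, weights as edge features); the helper `mlp_to_graph` and its exact conventions are illustrative and not the paper's implementation.

```python
import torch
import torch.nn as nn

def mlp_to_graph(mlp: nn.Sequential):
    """Turn an MLP into a graph: one node per neuron (bias as node feature),
    one directed edge per connection (weight as edge feature).
    Illustrative sketch; the paper's actual featurization may differ."""
    linears = [m for m in mlp if isinstance(m, nn.Linear)]

    # Node features: input neurons get a zero "bias", every other neuron keeps its bias.
    node_feats = [torch.zeros(linears[0].in_features)]
    node_feats += [layer.bias.detach() for layer in linears]
    node_feats = torch.cat(node_feats).unsqueeze(-1)            # [num_nodes, 1]

    # Node-index offset of each layer (input neurons first, then each layer's outputs).
    sizes = [linears[0].in_features] + [l.out_features for l in linears]
    offsets = torch.tensor([0] + sizes).cumsum(0)

    src, dst, edge_feats = [], [], []
    for i, layer in enumerate(linears):
        w = layer.weight.detach()                               # shape [out, in]
        out_idx, in_idx = torch.meshgrid(
            torch.arange(w.shape[0]), torch.arange(w.shape[1]), indexing="ij")
        src.append(in_idx.flatten() + offsets[i])               # source neuron indices
        dst.append(out_idx.flatten() + offsets[i + 1])          # target neuron indices
        edge_feats.append(w.flatten())

    edge_index = torch.stack([torch.cat(src), torch.cat(dst)])  # [2, num_edges]
    edge_feats = torch.cat(edge_feats).unsqueeze(-1)            # [num_edges, 1]
    return node_feats, edge_index, edge_feats

# Example: a 2-layer MLP becomes a graph with 3 + 8 + 1 = 12 neuron nodes.
mlp = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
x, edge_index, e = mlp_to_graph(mlp)
```

Because permuting the hidden neurons of the MLP only permutes the node ordering of this graph, any permutation-equivariant model applied to it (a GNN or a transformer over the nodes) inherits the parameter symmetry the paper aims to preserve.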