Improved Generalization of Weight Space Networks via Augmentations

2024 | Aviv Shamsian, Aviv Navon, David W. Zhang, Yan Zhang, Ethan Fetaya, Gal Chechik, Haggai Maron
This paper introduces data augmentation techniques for deep weight space (DWS) networks, models that process the weights of other neural networks and are applied to tasks such as 3D shape classification and neural field (INR) inference. DWS models often overfit because their training data has limited diversity: typically only a single trained weight configuration represents each underlying object.

The authors first analyze generalization in DWS and show that training with multiple "neural views" of each object, i.e., several independently trained weight configurations, significantly improves performance on unseen objects. They then categorize existing weight space augmentation methods and propose new ones, including weight space-specific augmentations that exploit the symmetries of neural architectures (such as hidden-neuron permutations). Building on this analysis, they introduce a novel weight space MixUp that accounts for these symmetries, for instance by aligning two networks before interpolating their weights, leading to better generalization than naive interpolation.

Experiments on three INR datasets (FMNIST, CIFAR10, ModelNet40) show that the proposed weight space MixUp methods improve the accuracy of DWS models by up to 18%, a gain comparable to training on up to 10 times more data. In self-supervised contrastive learning setups, the augmentations yield 5-10% gains in downstream classification. The paper concludes that weight space augmentations are effective for improving generalization and that further research is needed to address the limitations of current approaches.
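To make the symmetry-based augmentations concrete, below is a minimal NumPy sketch (illustrative, not the authors' code) for a two-layer MLP: permuting the hidden neurons changes the weight vector but not the function the network computes, so it yields a "free" augmented sample with the same label. The shapes and the `permute_hidden` helper are assumptions for illustration.

```python
# Minimal sketch of a symmetry-based weight space augmentation for a
# two-layer MLP f(x) = W2 @ relu(W1 @ x + b1) + b2. Permuting the hidden
# neurons gives a different weight vector that computes the same function.
import numpy as np

def permute_hidden(W1, b1, W2, rng):
    """Randomly permute hidden neurons; the network's function is unchanged."""
    perm = rng.permutation(W1.shape[0])      # new order of hidden units
    return W1[perm], b1[perm], W2[:, perm]   # permute rows of W1/b1, columns of W2

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)   # hypothetical shapes
W2, b2 = rng.normal(size=(4, 16)), rng.normal(size=4)

x = rng.normal(size=8)
f = lambda W1, b1, W2: W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

W1p, b1p, W2p = permute_hidden(W1, b1, W2, rng)
assert np.allclose(f(W1, b1, W2), f(W1p, b1p, W2p))  # same outputs, new "view" of the weights
```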
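Similarly, here is a hedged sketch of a symmetry-aware weight space MixUp. Naive MixUp interpolates raw weights, but two functionally similar networks can sit in different permutation orbits, so one plausible approach, shown here using weight matching via the Hungarian algorithm, first aligns one network's hidden neurons to the other's and only then interpolates. The `align_hidden` and `weight_space_mixup` helpers are hypothetical and not necessarily the paper's exact procedure.

```python
# Hedged sketch of a symmetry-aware weight space MixUp for two-layer MLPs.
# The alignment step (weight matching via the Hungarian algorithm) is one
# plausible choice, not necessarily the paper's exact method.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_hidden(W1_a, W1_b):
    """Permutation of network b's hidden units that best matches network a's."""
    cost = -W1_a @ W1_b.T                  # negative similarity between hidden units
    _, perm = linear_sum_assignment(cost)  # min-cost matching = max similarity
    return perm

def weight_space_mixup(net_a, net_b, lam):
    """Align net_b to net_a, then interpolate: lam * a + (1 - lam) * b."""
    (W1a, b1a, W2a), (W1b, b1b, W2b) = net_a, net_b
    perm = align_hidden(W1a, W1b)
    W1b, b1b, W2b = W1b[perm], b1b[perm], W2b[:, perm]   # remove the symmetry mismatch
    mix = lambda a, b: lam * a + (1.0 - lam) * b
    return mix(W1a, W1b), mix(b1a, b1b), mix(W2a, W2b)

rng = np.random.default_rng(1)
make_net = lambda: (rng.normal(size=(16, 8)), rng.normal(size=16), rng.normal(size=(4, 16)))
lam = rng.beta(0.5, 0.5)                   # MixUp coefficient, e.g. from Beta(0.5, 0.5)
mixed_net = weight_space_mixup(make_net(), make_net(), lam)
# As in standard MixUp, labels would be mixed the same way: y = lam * y_a + (1 - lam) * y_b.
```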