Improved Generalization of Weight Space Networks via Augmentations

2024 | Aviv Shamsian, Aviv Navon, David W. Zhang, Yan Zhang, Ethan Fetaya, Gal Chechik, Haggai Maron
This paper introduces data augmentation techniques for deep weight space (DWS) networks, models that process the weights of other neural networks and are applied to tasks such as 3D shape classification and neural field (INR) inference. DWS models often overfit because their training data has limited diversity: typically only a single trained weight configuration represents each underlying object.

The authors first analyze generalization in DWS and show that training with multiple "neural views" of each object, i.e., several independently trained weight configurations, significantly improves performance on unseen objects. They then categorize existing weight space augmentation methods and propose new ones, including weight space-specific augmentations that exploit the symmetries of neural architectures (such as hidden-neuron permutations). Building on this analysis, they introduce a novel weight space MixUp that accounts for these symmetries, for instance by aligning two networks before interpolating their weights, leading to better generalization than naive interpolation.

Experiments on three INR datasets (FMNIST, CIFAR10, ModelNet40) show that the proposed weight space MixUp methods improve the accuracy of DWS models by up to 18%, a gain comparable to training on up to 10 times more data. In self-supervised contrastive learning setups, the augmentations yield 5-10% gains in downstream classification. The paper concludes that weight space augmentations are effective for improving generalization and that further research is needed to address the limitations of current approaches.
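To make the symmetry-based augmentations concrete, below is a minimal NumPy sketch (illustrative, not the authors' code) for a two-layer MLP: permuting the hidden neurons changes the weight vector but not the function the network computes, so it yields a "free" augmented sample with the same label. The shapes and the `permute_hidden` helper are assumptions for illustration.

```python
# Minimal sketch of a symmetry-based weight space augmentation for a
# two-layer MLP f(x) = W2 @ relu(W1 @ x + b1) + b2. Permuting the hidden
# neurons gives a different weight vector that computes the same function.
import numpy as np

def permute_hidden(W1, b1, W2, rng):
    """Randomly permute hidden neurons; the network's function is unchanged."""
    perm = rng.permutation(W1.shape[0])      # new order of hidden units
    return W1[perm], b1[perm], W2[:, perm]   # permute rows of W1/b1, columns of W2

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)   # hypothetical shapes
W2, b2 = rng.normal(size=(4, 16)), rng.normal(size=4)

x = rng.normal(size=8)
f = lambda W1, b1, W2: W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

W1p, b1p, W2p = permute_hidden(W1, b1, W2, rng)
assert np.allclose(f(W1, b1, W2), f(W1p, b1p, W2p))  # same outputs, new "view" of the weights
```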
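Similarly, here is a hedged sketch of a symmetry-aware weight space MixUp. Naive MixUp interpolates raw weights, but two functionally similar networks can sit in different permutation orbits, so one plausible approach, shown here using weight matching via the Hungarian algorithm, first aligns one network's hidden neurons to the other's and only then interpolates. The `align_hidden` and `weight_space_mixup` helpers are hypothetical and not necessarily the paper's exact procedure.

```python
# Hedged sketch of a symmetry-aware weight space MixUp for two-layer MLPs.
# The alignment step (weight matching via the Hungarian algorithm) is one
# plausible choice, not necessarily the paper's exact method.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_hidden(W1_a, W1_b):
    """Permutation of network b's hidden units that best matches network a's."""
    cost = -W1_a @ W1_b.T                  # negative similarity between hidden units
    _, perm = linear_sum_assignment(cost)  # min-cost matching = max similarity
    return perm

def weight_space_mixup(net_a, net_b, lam):
    """Align net_b to net_a, then interpolate: lam * a + (1 - lam) * b."""
    (W1a, b1a, W2a), (W1b, b1b, W2b) = net_a, net_b
    perm = align_hidden(W1a, W1b)
    W1b, b1b, W2b = W1b[perm], b1b[perm], W2b[:, perm]   # remove the symmetry mismatch
    mix = lambda a, b: lam * a + (1.0 - lam) * b
    return mix(W1a, W1b), mix(b1a, b1b), mix(W2a, W2b)

rng = np.random.default_rng(1)
make_net = lambda: (rng.normal(size=(16, 8)), rng.normal(size=16), rng.normal(size=(4, 16)))
lam = rng.beta(0.5, 0.5)                   # MixUp coefficient, e.g. from Beta(0.5, 0.5)
mixed_net = weight_space_mixup(make_net(), make_net(), lam)
# As in standard MixUp, labels would be mixed the same way: y = lam * y_a + (1 - lam) * y_b.
```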