28 Jan 2022 | Adrien Bardes, Jean Ponce, Yann LeCun
VICReg is a self-supervised learning method designed to prevent representation collapse in joint embedding architectures. It introduces three regularization terms: variance, invariance, and covariance. The variance term ensures that each embedding dimension maintains a minimum variance, while the covariance term decorrelates pairs of embedding variables to prevent redundancy; the invariance term keeps the embeddings of two views of the same image close to each other. Unlike many other methods, VICReg does not require techniques such as weight sharing between branches, batch normalization, or stop-gradient operations, and it achieves results on par with the state of the art on several downstream tasks. The method is applicable to a wide range of architectures and input modalities, making it suitable for multi-modal signals. VICReg's effectiveness is demonstrated through experiments on image recognition tasks, including linear classification and semi-supervised evaluation on ImageNet. It also performs well on transfer learning tasks and on multi-modal pretraining on the MS-COCO dataset. The method's simplicity and effectiveness in preventing collapse make it a valuable contribution to self-supervised learning.
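To make the three terms concrete, here is a minimal PyTorch-style sketch of a VICReg-style loss over two batches of embeddings. The function name, the epsilon value, and the default weights (25 for invariance and variance, 1 for covariance, as reported in the paper) are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, sim_coeff=25.0, std_coeff=25.0, cov_coeff=1.0, eps=1e-4):
    """Sketch of a VICReg-style loss for two views' embeddings, each of shape (N, D)."""
    n, d = z_a.shape

    # Invariance: mean-squared distance between the embeddings of the two views.
    inv_loss = F.mse_loss(z_a, z_b)

    # Variance: hinge loss pushing the std of each embedding dimension above 1.
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var_loss = torch.mean(F.relu(1.0 - std_a)) + torch.mean(F.relu(1.0 - std_b))

    # Covariance: penalize squared off-diagonal entries of each view's covariance matrix,
    # decorrelating the embedding dimensions.
    z_a = z_a - z_a.mean(dim=0)
    z_b = z_b - z_b.mean(dim=0)
    cov_a = (z_a.T @ z_a) / (n - 1)
    cov_b = (z_b.T @ z_b) / (n - 1)
    off_diag = lambda m: m - torch.diag(torch.diag(m))
    cov_loss = off_diag(cov_a).pow(2).sum() / d + off_diag(cov_b).pow(2).sum() / d

    return sim_coeff * inv_loss + std_coeff * var_loss + cov_coeff * cov_loss
```

In practice the two embeddings come from an encoder followed by an expander head applied to two augmented views of the same image; the variance and covariance terms alone are what keep the representations from collapsing to a constant or redundant solution.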