Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

10 Sep 2020 | Jean-Bastien Grill^*1, Florian Strub^*1, Florent Altché^*1, Corentin Tallec^*1, Pierre H. Richemond^*1,2, Elena Buchatskaya^1, Carl Doersch^1, Bernardo Avila Pires^1, Zhaohan Daniel Guo^1, Mohammad Gheshlaghi Azar^1, Bilal Piot^1, Koray Kavukcuoglu^1, Rémi Munos^1, Michal Valko^1
**Bootstrap Your Own Latent (BYOL)** is a novel approach to self-supervised image representation learning. Unlike state-of-the-art contrastive methods that rely on negative pairs, BYOL achieves superior performance without them. BYOL uses two neural networks, an *online* network and a *target* network, which interact and learn from each other. The online network is trained to predict the target network's representation of an augmented image, while the target network is updated with a slow-moving average of the online network. This method avoids the need for careful handling of negative pairs and is more robust to changes in image augmentations. BYOL achieves 74.3% top-1 classification accuracy on ImageNet using a linear evaluation with a ResNet-50 architecture and 79.6% with a larger ResNet. It outperforms current state-of-the-art methods on both transfer and semi-supervised benchmarks. The implementation and pre-trained models are available on GitHub.
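The training loop described above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the tiny linear maps `W_online`, `W_pred`, and `W_target` are hypothetical stand-ins for the ResNet encoder, MLP projector, and MLP predictor, and the noisy views stand in for real image augmentations. It shows only the two mechanisms the abstract names: the online branch predicting the target branch's projection, and the exponential-moving-average (EMA) target update.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    # Normalize each row to unit length before the cosine-style loss.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical tiny linear "networks" (stand-ins for encoder/projector/predictor).
dim_in, dim_out = 8, 4
W_online = rng.normal(size=(dim_in, dim_out))  # online encoder + projector
W_pred = rng.normal(size=(dim_out, dim_out))   # online predictor head
W_target = W_online.copy()                     # target starts as a copy of online

def byol_loss(view1, view2):
    """Mean squared error between L2-normalized vectors, equivalently
    2 - 2 * cosine similarity: the online prediction of one augmented view
    should match the target projection of the other."""
    p = l2_normalize(view1 @ W_online @ W_pred)  # online branch + predictor
    z = l2_normalize(view2 @ W_target)           # target branch (no gradient flows here)
    return float(np.mean(2.0 - 2.0 * np.sum(p * z, axis=-1)))

def ema_update(tau=0.99):
    """Target weights are a slow-moving average of the online weights."""
    global W_target
    W_target = tau * W_target + (1.0 - tau) * W_online

# A batch of "images" and two noisy "augmented" views of each.
x = rng.normal(size=(16, dim_in))
v1 = x + 0.1 * rng.normal(size=x.shape)
v2 = x + 0.1 * rng.normal(size=x.shape)

loss = byol_loss(v1, v2)  # in a real setup, gradients of this update W_online and W_pred
ema_update()
print(f"BYOL loss on this batch: {loss:.3f}")
```

In the paper the gradient step is applied only to the online network; the target network receives no gradient and changes only through the EMA update, which is what removes the need for negative pairs.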