26 Oct 2020 | Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton
This paper presents a semi-supervised learning approach that leverages large, self-supervised models to achieve high accuracy on ImageNet with very few labeled examples. The method has three main steps: unsupervised pretraining of a large ResNet using SimCLRv2, supervised fine-tuning on the few labeled examples, and distillation with unlabeled examples to refine the model and transfer its task-specific knowledge. The key observation is that larger models benefit more from the task-agnostic use of unlabeled data, especially when labels are scarce. After fine-tuning, the large model can be distilled into a much smaller one with little loss in classification accuracy, because the unlabeled examples are then used in a task-specific way.

This approach reaches 73.9% ImageNet top-1 accuracy with just 1% of the labels, a roughly 10× improvement in label efficiency over the previous state of the art. With 10% of the labels, a ResNet-50 trained with this method reaches 77.5% top-1 accuracy, outperforming standard supervised training on the full label set. The experiments also show that bigger models are more label-efficient, that a deeper projection head improves representation learning, and that distillation with unlabeled examples further improves semi-supervised performance. Overall, the method outperforms previous state-of-the-art approaches on ImageNet, demonstrating the effectiveness of large self-supervised models for semi-supervised learning.
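As a rough illustration of the distillation step, the sketch below shows one way the task-specific loss on unlabeled images could be written in PyTorch: the fine-tuned teacher produces a softened class distribution for each unlabeled image, and the student is trained to match it. This is a minimal sketch, not the authors' code; the function name, the default temperature, and the `teacher`, `student`, and `x_u` identifiers in the usage comment are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Soft-label distillation loss on a batch of unlabeled images.

    Cross-entropy between the teacher's softened class distribution and the
    student's predicted distribution, averaged over the batch. The temperature
    value is an assumption; the paper's setup may differ.
    """
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return -(teacher_probs * student_log_probs).sum(dim=-1).mean()

# Hypothetical training step over a batch of unlabeled images `x_u`:
#   with torch.no_grad():
#       t_logits = teacher(x_u)   # large fine-tuned SimCLRv2 model (frozen)
#   s_logits = student(x_u)       # smaller ResNet being distilled
#   loss = distillation_loss(t_logits, s_logits)
```

Because only the teacher's predictions are needed, no ground-truth labels are used in this step, which is what lets the unlabeled data be exploited in a task-specific way.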