Self-training with Noisy Student improves ImageNet classification

19 Jun 2020 | Qizhe Xie*¹, Minh-Thang Luong¹, Eduard Hovy², Quoc V. Le¹
Noisy Student Training is a semi-supervised learning method that improves both the accuracy and the robustness of ImageNet classifiers. It uses a large amount of unlabeled data, 300 million images from the JFT dataset, to train a student model that is equal in size to or larger than its teacher. The teacher is first trained on the labeled data and then generates pseudo labels for the unlabeled images. The student is trained on the combination of labeled and pseudo-labeled data, with noise injected during its training to improve generalization. The process is then iterated, with the trained student serving as the teacher for the next round.

The method achieves 88.4% top-1 accuracy on ImageNet, 2.0% better than the previous state-of-the-art model, which required 3.5 billion weakly labeled Instagram images. It also delivers large robustness gains: on ImageNet-A, top-1 accuracy rises from 61.0% to 83.7%; on ImageNet-C, mean corruption error falls from 45.7 to 28.3; and on ImageNet-P, mean flip rate falls from 27.8 to 12.2. Adversarial robustness improves as well, with accuracy under an FGSM attack increasing from 1.1% to 4.4%, even though the method is not optimized for it.

The key ingredient is the noise applied to the student during training: data augmentation, dropout, and stochastic depth. Because the teacher generates pseudo labels without noise while the student must reproduce them under noise, the student is forced to learn representations that are more robust than its teacher's.
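To make the recipe concrete, here is a minimal sketch of the loop in PyTorch. It is an illustration under stated assumptions, not the paper's implementation: the tiny convolutional network, the synthetic tensors, and the random-flip augmentation are stand-ins for the EfficientNet architectures, the ImageNet/JFT data, and the RandAugment, dropout, and stochastic-depth noise used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_model(width=32):
    # Dropout provides model-side noise; the paper also uses stochastic depth.
    return nn.Sequential(
        nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Dropout(0.5), nn.Linear(width, 10),
    )

def fit(model, images, labels, epochs=3, lr=1e-3, noisy=False):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()  # train mode keeps dropout active during student training
    for _ in range(epochs):
        x = images
        if noisy and torch.rand(1).item() < 0.5:
            x = torch.flip(x, dims=[3])  # input-side noise: stand-in for RandAugment
        loss = F.cross_entropy(model(x), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

@torch.no_grad()
def pseudo_label(teacher, images):
    teacher.eval()  # the teacher is NOT noised when generating pseudo labels
    return teacher(images).argmax(dim=1)  # hard pseudo labels (soft labels also work)

# Synthetic stand-ins for the labeled ImageNet set and the 300M unlabeled JFT images.
labeled_x = torch.randn(64, 3, 32, 32)
labeled_y = torch.randint(0, 10, (64,))
unlabeled_x = torch.randn(256, 3, 32, 32)

teacher = fit(make_model(), labeled_x, labeled_y)        # 1. train teacher on labeled data
for _ in range(3):                                       # 4. iterate the process
    pl = pseudo_label(teacher, unlabeled_x)              # 2. pseudo-label unlabeled data
    all_x = torch.cat([labeled_x, unlabeled_x])
    all_y = torch.cat([labeled_y, pl])
    student = fit(make_model(width=64), all_x, all_y,
                  noisy=True)                            # 3. train a larger, noised student
    teacher = student                                    # the student becomes the next teacher
```

The asymmetry that drives the method is visible in the sketch: `pseudo_label` runs the teacher in `eval()` mode on clean inputs, while `fit` trains the student with dropout active and noisy inputs.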
Two practical details support this: unlabeled images are filtered out when the teacher's confidence in its pseudo label is low, and the remaining pseudo-labeled set is balanced so that its class distribution matches that of the training set (sketched below). The method is effective for both large and small models, with consistent improvements across model sizes. Compared with other semi-supervised approaches, including noise-free self-training, consistency training, and plain pseudo-labeling, it is more effective at improving accuracy and robustness, particularly when the student is larger than the teacher and is trained with noise. The results demonstrate that Noisy Student Training is a powerful approach for improving both the accuracy and the robustness of ImageNet models.
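As a rough illustration of that filtering-and-balancing step, the hedged sketch below assumes the teacher's softmax probabilities over the unlabeled images are available as a single tensor; the 0.3 confidence threshold matches the one reported in the paper, while the function name, shapes, and per-class quota are illustrative.

```python
import torch

def filter_and_balance(probs, per_class, thresh=0.3):
    """probs: (N, C) teacher softmax outputs for the unlabeled images.
    Returns indices of a roughly class-balanced, high-confidence subset."""
    conf, labels = probs.max(dim=1)
    keep = conf >= thresh                       # drop low-confidence images
    selected = []
    for c in range(probs.size(1)):
        idx = torch.nonzero(keep & (labels == c)).flatten()
        if idx.numel() == 0:
            continue                            # no confident images for this class
        idx = idx[conf[idx].argsort(descending=True)]  # most confident first
        if idx.numel() >= per_class:
            idx = idx[:per_class]               # too many: keep the most confident
        else:
            reps = -(-per_class // idx.numel()) # ceil division
            idx = idx.repeat(reps)[:per_class]  # too few: duplicate to fill the quota
        selected.append(idx)
    return torch.cat(selected)

# Example: 1,000 unlabeled images, 10 classes, a quota of 50 images per class.
probs = torch.softmax(torch.randn(1000, 10), dim=1)
chosen = filter_and_balance(probs, per_class=50)
```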