UNSUPERVISED REPRESENTATION LEARNING BY PREDICTING IMAGE ROTATIONS

21 Mar 2018 | Spyros Gidaris, Praveer Singh, Nikos Komodakis
This paper proposes a self-supervised method for learning image representations by training convolutional neural networks (ConvNets) to recognize the 2D rotation applied to an input image. Each image is rotated by one of four discrete angles (0°, 90°, 180°, or 270°), and the network is trained on the resulting four-way classification task. The key intuition is that predicting the applied rotation requires understanding the location, type, and pose of the objects in the image, so solving this pretext task forces the ConvNet to learn semantic features that transfer to visual perception tasks such as object recognition, detection, and segmentation.

The method is evaluated on CIFAR-10, ImageNet, PASCAL VOC, and Places, where it achieves state-of-the-art results among unsupervised methods. For example, on the PASCAL VOC 2007 detection task, an AlexNet model pre-trained with rotation prediction reaches 54.4% mAP, only 2.4 points below its supervised counterpart. Strong results are also reported on ImageNet, PASCAL, and CIFAR-10 classification.

The approach is simple to implement and has the same computational cost as supervised training. It requires no special preprocessing to avoid learning trivial shortcut features, and it adapts easily to parallel training schemes. It is also effective in semi-supervised settings, outperforming fully supervised models when few labeled examples per class are available. Overall, the rotation prediction task provides a powerful supervisory signal for feature learning and significantly narrows the gap between unsupervised and supervised representation learning. Code and models are available at https://github.com/gidariss/FeatureLearningRotNet.
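To make the pretext task concrete, below is a minimal sketch of one rotation-prediction training step in PyTorch. The `TinyRotNet` backbone, the hyperparameters, and the random stand-in batch are illustrative assumptions for this sketch, not the authors' released implementation; only the task structure (four rotated copies per image, a 4-way cross-entropy objective) follows the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rotate_batch(images):
    """Create the four rotated copies (0, 90, 180, 270 degrees) of each
    image in a (B, C, H, W) batch, plus rotation labels in {0, 1, 2, 3}."""
    rotations = [torch.rot90(images, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4).repeat_interleave(images.size(0))
    return torch.cat(rotations, dim=0), labels

class TinyRotNet(nn.Module):
    """Toy stand-in for the paper's ConvNets (illustrative, not AlexNet):
    a feature backbone followed by a 4-way rotation-classification head."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 4)  # predicts which rotation was applied

    def forward(self, x):
        return self.head(self.backbone(x))

model = TinyRotNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One self-supervised step on a batch of unlabeled images (random
# tensors here as a stand-in for a real image loader).
images = torch.randn(8, 3, 32, 32)
inputs, targets = rotate_batch(images)
loss = F.cross_entropy(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After pretraining, the rotation head is discarded and the backbone features are reused for the downstream tasks the summary mentions, either frozen under a linear classifier or fine-tuned end to end.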