Data-Efficient Image Recognition with Contrastive Predictive Coding

1 Jul 2020 | Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord
This paper presents a data-efficient image recognition method based on Contrastive Predictive Coding (CPC), which improves image classification performance with significantly fewer labeled examples. The authors hypothesize that data-efficient recognition is enabled by representations that make the variability in natural signals more predictable. They revisit and improve CPC, an unsupervised learning objective for learning such representations, and demonstrate that the resulting features support state-of-the-art linear classification accuracy on the ImageNet dataset. When used as input for non-linear classification with deep neural networks, these representations allow the use of 2-5 times fewer labels than classifiers trained directly on image pixels. The unsupervised representation also substantially improves transfer learning to object detection on the PASCAL VOC dataset, surpassing fully supervised pre-trained ImageNet classifiers.

The CPC architecture learns representations by training neural networks to predict the representations of future observations from those of past ones. When applied to images, CPC predicts the representations of patches below a certain position from those above it. These predictions are evaluated with a contrastive loss that maximizes the mutual information between the context and the predicted features, avoiding trivial solutions such as representing all patches with a constant vector.

The authors evaluate CPC representations on linear classification, efficient (low-label) classification, and transfer learning, and find that they outperform other self-supervised learning methods in both accuracy and data efficiency. In linear classification, the improved CPC achieves a Top-1 accuracy of 71.5%, surpassing the original CPC model's 48.7%. In efficient classification, the model achieves a Top-5 accuracy of 78.3% with only 1% of the labels, a 34% improvement over purely supervised methods. In transfer learning to object detection on the PASCAL VOC dataset, the model achieves a mAP of 76.6%, surpassing the performance of supervised pre-training. These gains in data efficiency were previously unseen from representation learning methods and rival the performance of more elaborate label propagation algorithms. The authors conclude that CPC is a promising approach for data-efficient image recognition and transfer learning, with potential applications in domains where labeled data is naturally limited, such as medical imaging or robotics.
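To make the patch-prediction setup concrete, the sketch below shows one way to split an image into an overlapping grid of patches of the kind CPC encodes, where patches in lower rows are predicted from those in upper rows. The 64-pixel patch size and 32-pixel stride are illustrative assumptions, not values stated in this summary.

```python
import numpy as np

def extract_patch_grid(image, patch_size=64, stride=32):
    """Split an image into a grid of overlapping patches.

    image: (H, W, C) array.
    Returns an array of shape (rows, cols, patch_size, patch_size, C),
    where entry (r, c) is the patch whose top-left corner sits at
    (r * stride, c * stride).
    """
    H, W, C = image.shape
    rows = (H - patch_size) // stride + 1
    cols = (W - patch_size) // stride + 1
    grid = np.empty((rows, cols, patch_size, patch_size, C), dtype=image.dtype)
    for r in range(rows):
        for c in range(cols):
            y, x = r * stride, c * stride
            grid[r, c] = image[y:y + patch_size, x:x + patch_size]
    return grid
```

With these defaults, a 256x256 image yields a 7x7 grid of patches; in CPC each patch would then be encoded independently before the prediction step.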
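The contrastive loss described above can be sketched as an InfoNCE objective: each predicted representation must identify its true target among the other patches in the batch, which serve as negatives. This is a minimal NumPy illustration of that idea, not the authors' implementation (which operates on encoder features with learned prediction weights).

```python
import numpy as np

def info_nce_loss(predictions, targets):
    """Contrastive (InfoNCE) loss over a batch of patch representations.

    predictions, targets: arrays of shape (N, D). Row i of `predictions`
    should match row i of `targets`; the other N-1 rows act as negatives.
    Returns the mean negative log-likelihood of the correct match.
    """
    # Similarity of every prediction to every candidate target.
    logits = predictions @ targets.T                     # (N, N)
    # Log-softmax over candidates, computed stably.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives lie on the diagonal (prediction i vs. target i).
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss pushes each prediction toward its own target and away from all others, which is what rules out the trivial constant-vector solution: a constant representation makes positives and negatives indistinguishable and leaves the loss at chance level.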