20 Mar 2021 | Xiao Liu, Fanjin Zhang, Zhenyu Hou, Li Mian, Zhaoyu Wang, Jing Zhang, Jie Tang*
This paper provides a comprehensive review of self-supervised learning (SSL) methods, focusing on their applications in computer vision, natural language processing, and graph learning. SSL uses the input data itself as supervision, aiming to improve representation learning with fewer labels and better generalization. The survey groups SSL methods into three main categories: generative, contrastive, and generative-contrastive (adversarial). For each category, it details specific models and their pros and cons. The paper also discusses the theoretical underpinnings of SSL, including GANs and information maximization, and identifies open problems and future directions. It explains the motivation behind SSL: its ability to exploit unlabeled data and its potential to address limitations of supervised learning, such as heavy reliance on manual labels and vulnerability to attacks. The paper concludes with a discussion of the current state and future prospects of SSL.