SimCSE: Simple Contrastive Learning of Sentence Embeddings

18 May 2022 | Tianyu Gao, Xingcheng Yao, Danqi Chen
SimCSE is a simple and effective contrastive learning framework for sentence embeddings that significantly improves state-of-the-art performance on semantic textual similarity (STS) tasks. The framework consists of two approaches: an unsupervised method and a supervised method. Unsupervised SimCSE predicts the input sentence itself, using only dropout noise as minimal data augmentation; this outperforms various discrete data augmentation techniques and previous supervised methods. Supervised SimCSE leverages natural language inference (NLI) datasets, using entailment pairs as positives and contradiction pairs as hard negatives. Evaluations on seven STS tasks show that SimCSE achieves an average Spearman's correlation of 76.3% for the unsupervised model and 81.6% for the supervised model, improvements of 4.2 and 2.2 points over previous best results. The paper also analyzes the theoretical and empirical benefits of contrastive learning, including improved alignment and uniformity of sentence embeddings, and provides insights into the anisotropy problem in pre-trained language models.
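To make the dropout-as-augmentation idea concrete, below is a minimal sketch of the unsupervised SimCSE training objective. It is not the authors' implementation: the toy encoder, hidden sizes, and batch shapes are illustrative stand-ins for a dropout-regularized Transformer encoder, but the loss follows the described recipe of encoding the same batch twice and treating the two dropout views of each sentence as the positive pair, with other in-batch sentences as negatives.

```python
# Minimal sketch of the unsupervised SimCSE objective (illustrative, not the official code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyEncoder(nn.Module):
    """Stand-in for a dropout-regularized sentence encoder (e.g., BERT with [CLS] pooling)."""
    def __init__(self, vocab_size=1000, dim=128, p_drop=0.1):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)   # crude bag-of-tokens pooling
        self.proj = nn.Linear(dim, dim)
        self.dropout = nn.Dropout(p_drop)               # dropout is the only "augmentation"

    def forward(self, token_ids):
        h = self.embed(token_ids)                       # (batch, dim)
        return self.proj(self.dropout(h))

def simcse_loss(z1, z2, temperature=0.05):
    """InfoNCE with in-batch negatives: z1[i] and z2[i] are two dropout views of sentence i."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature                     # (batch, batch) cosine similarities
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)                 # diagonal entries are the positives

# Usage: feed the SAME batch through the encoder twice; different dropout masks
# produce the two views that form each positive pair.
encoder = ToyEncoder()
batch = torch.randint(0, 1000, (8, 16))                 # 8 toy "sentences" of 16 token ids
loss = simcse_loss(encoder(batch), encoder(batch))
loss.backward()
```

The supervised variant described above keeps the same loss form but replaces the second dropout view with the embedding of the entailment hypothesis, and appends the contradiction hypothesis embeddings as extra columns of the similarity matrix so they act as hard negatives.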