Sentence-BERT (SBERT) is a modification of the BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings. These embeddings can be compared with cosine similarity, which drastically reduces the computational effort of finding the most similar pair of sentences in a collection: for 10,000 sentences, the search drops from about 65 hours with BERT to about 5 seconds with SBERT, while maintaining BERT's accuracy.
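A minimal sketch of this workflow using the sentence-transformers library; the checkpoint name and the example sentences are illustrative assumptions, not prescribed by the paper:

```python
# Sketch: encode sentences once with an SBERT model, then rank pairs by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bert-base-nli-mean-tokens")  # assumed checkpoint name

sentences = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
]

# One forward pass per sentence yields fixed-size embeddings.
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise cosine similarity is a cheap matrix operation on the embeddings.
scores = util.cos_sim(embeddings, embeddings)

# Report the most similar distinct pair.
best = None
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        score = scores[i][j].item()
        if best is None or score > best[0]:
            best = (score, i, j)
print(f"Most similar pair ({best[0]:.3f}): '{sentences[best[1]]}' / '{sentences[best[2]]}'")
```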
SBERT is evaluated on common semantic textual similarity (STS) tasks and on transfer learning tasks, where it outperforms other state-of-the-art sentence embedding methods. On seven STS tasks it outperforms InferSent by 11.7 points and Universal Sentence Encoder by 5.5 points; on SentEval it improves over them by 2.1 and 2.6 points, respectively.
SBERT can be adapted to specific tasks and has set new state-of-the-art performance on challenging datasets such as the Argument Facet Similarity (AFS) corpus and a triplet dataset for distinguishing sentences from different sections of a Wikipedia article.
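A hedged sketch of such task-specific adaptation, assuming the sentence-transformers training API and a regression objective in which the cosine similarity of the two sentence embeddings is fit to a gold score (an AFS-style setup); the starting checkpoint, training pairs, and labels below are placeholders, not the actual AFS data:

```python
# Sketch: fine-tune an SBERT model on sentence pairs with gold similarity scores in [0, 1].
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("bert-base-nli-mean-tokens")  # assumed starting checkpoint

# Placeholder pairs; real AFS scores (0-5) would be rescaled to [0, 1].
train_examples = [
    InputExample(texts=["Gun control reduces crime.", "Fewer guns mean fewer homicides."], label=0.8),
    InputExample(texts=["Gun control reduces crime.", "The death penalty deters murder."], label=0.1),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)

# Regression objective: cosine(u, v) is pushed toward the labeled similarity.
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=4, warmup_steps=100)
```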
SBERT is also computationally efficient: on a GPU it computes embeddings about 9% faster than InferSent and about 55% faster than Universal Sentence Encoder. This makes tasks that are computationally infeasible with BERT practical; clustering 10,000 sentences, for example, takes about 5 seconds with SBERT instead of roughly 65 hours with BERT.
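A sketch of the clustering use case, assuming embeddings are computed once and handed to an off-the-shelf clustering algorithm (k-means here is a stand-in choice, and the corpus is a tiny placeholder rather than 10,000 sentences):

```python
# Sketch: cluster sentences by clustering their fixed-size SBERT embeddings.
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bert-base-nli-mean-tokens")  # assumed checkpoint name

corpus = [
    "A man is eating food.",
    "A man is eating pasta.",
    "A cheetah is running behind its prey.",
    "A cheetah chases prey across a field.",
]

embeddings = model.encode(corpus)  # shape: (len(corpus), embedding_dim)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
for sentence, label in zip(corpus, labels):
    print(label, sentence)
```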
SBERT is trained with siamese and triplet network structures so that the resulting sentence embeddings are semantically meaningful, and it derives fixed-size sentence embeddings using different pooling strategies (MEAN, MAX, and CLS). Fine-tuning on NLI data yields sentence embeddings that significantly outperform other state-of-the-art methods.
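A plain-PyTorch sketch of the siamese classification objective used for NLI fine-tuning, assuming MEAN pooling and the (u, v, |u - v|) feature combination fed to a softmax classifier; the checkpoint name and the example pair are illustrative:

```python
# Sketch: both sentences pass through the same BERT encoder (shared weights),
# token embeddings are MEAN-pooled into u and v, and the concatenation
# (u, v, |u - v|) is classified into the three NLI labels.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
classifier = nn.Linear(3 * encoder.config.hidden_size, 3)  # entailment / neutral / contradiction

def mean_pool(texts):
    """Encode a batch of sentences and MEAN-pool token embeddings, ignoring padding."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    token_emb = encoder(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()      # (B, T, 1)
    return (token_emb * mask).sum(dim=1) / mask.sum(dim=1)    # (B, H)

u = mean_pool(["A soccer game with multiple males playing."])
v = mean_pool(["Some men are playing a sport."])

logits = classifier(torch.cat([u, v, torch.abs(u - v)], dim=-1))
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0]))       # gold NLI label for the pair
loss.backward()  # in training, gradients update both the shared encoder and the classifier
```

At inference time the classifier is discarded and the pooled embeddings u and v are compared directly, for example with cosine similarity.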
SBERT is evaluated on various tasks, including STS, AFS, and SentEval, demonstrating its effectiveness in capturing semantic similarity and performing well in transfer learning tasks. The model's efficiency and effectiveness make it a valuable tool for semantic similarity search, clustering, and information retrieval.