Word Embeddings Revisited: Do LLMs Offer Something New?

2 Mar 2024 | Matthew Freestone, Shubhra Kanti Karmaker Santu
This paper investigates whether Large Language Models (LLMs) offer something new in terms of word embeddings compared to classical models like Sentence-BERT (SBERT) and the Universal Sentence Encoder (USE). The study compares the latent vector semantics of LLM-based and classical word embedding techniques using two main analyses: word-pair similarity and word analogy tasks. The results show that LLMs, particularly ADA and PaLM, tend to cluster semantically related words more tightly and achieve higher average accuracy on the Bigger Analogy Test Set (BATS) than the classical models. Additionally, some LLMs, such as ADA and PaLM, produce word embeddings similar to those of SBERT, a relatively lightweight classical model. The study concludes that while LLMs can capture meaningful semantics and yield high accuracy, SBERT can be an efficient alternative when resources are limited. However, the research is limited by the small number of models tested and by the use of cosine similarity as the metric for semantic similarity.
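The summary does not spell out the exact procedures, but both analyses are conventionally grounded in cosine similarity over embedding vectors. Below is a minimal, hypothetical Python sketch of the two measurements as commonly defined: pairwise cosine similarity for the word-pair analysis, and the vector-offset (a : b :: c : ?) method for the analogy task. The placeholder vocabulary and 3-dimensional vectors are illustrative only; in the paper the embeddings would come from models such as ADA, PaLM, SBERT, or USE.

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def solve_analogy(a: str, b: str, c: str, vocab_embeddings: dict):
    """Vector-offset analogy: return the vocabulary word whose embedding
    is closest (by cosine) to emb(b) - emb(a) + emb(c), excluding the
    three query words themselves."""
    target = vocab_embeddings[b] - vocab_embeddings[a] + vocab_embeddings[c]
    best_word, best_score = None, -np.inf
    for word, vec in vocab_embeddings.items():
        if word in (a, b, c):
            continue
        score = cosine_similarity(target, vec)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score

# Placeholder "embeddings" for illustration only.
vocab_embeddings = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "queen": np.array([0.7, 0.9, 0.1]),
    "man":   np.array([0.9, 0.2, 0.1]),
    "woman": np.array([0.8, 0.5, 0.1]),
}

# Word-pair similarity: how close two related words sit in embedding space.
print(cosine_similarity(vocab_embeddings["king"], vocab_embeddings["queen"]))

# Analogy task: man : king :: woman : ?
print(solve_analogy("man", "king", "woman", vocab_embeddings))
```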