23 Feb 2024 | Jacob Mitchell Springer, Suhas Kotha, Daniel Fried, Graham Neubig, Aditi Raghunathan
This paper addresses a limitation of autoregressive language models (LLMs): under the causal attention mask, a token's embedding cannot contain information from tokens that appear later in the input. The authors propose a simple approach called "echo embeddings," which repeats the input twice and extracts embeddings from the second occurrence. Because the second copy can attend to the full first copy, early tokens can encode information about later tokens, improving the quality of the resulting text embeddings. The authors show that echo embeddings outperform classical embeddings by over 9% on zero-shot tasks and by around 0.7% when fine-tuned. Using a Mistral-7B model, echo embeddings achieve state-of-the-art performance on the Massive Text Embedding Benchmark (MTEB), surpassing prior open-source models that do not rely on synthetic fine-tuning data. The paper also discusses the limitations of last-token pooling and compares the effectiveness of different pooling strategies. Overall, echo embeddings provide a powerful and simple way to improve embeddings extracted from autoregressive language models across a range of tasks.
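A minimal sketch of the idea, assuming a HuggingFace Mistral checkpoint and mean pooling over the echoed copy; the prompt template and the way the second occurrence's token span is located are illustrative assumptions, not the paper's exact recipe:

```python
# Sketch of echo embeddings: feed the input twice to a causal LM and pool hidden
# states only over the second occurrence, whose tokens can attend to the full text.
# Model name, prompt wording, and mean pooling are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "mistralai/Mistral-7B-v0.1"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

def echo_embed(text: str) -> torch.Tensor:
    # Repeat the input; only the second copy contributes to the embedding.
    prompt = f"Rewrite the sentence: {text}\nRewritten sentence: {text}"
    # Approximate where the second occurrence begins, in token space
    # (a simplification: tokenization at the boundary may merge tokens slightly).
    prefix = f"Rewrite the sentence: {text}\nRewritten sentence: "
    prefix_len = tokenizer(prefix, return_tensors="pt")["input_ids"].shape[1]
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_dim)
    # Mean-pool the hidden states of the echoed (second) occurrence only.
    return hidden[0, prefix_len:].mean(dim=0)

emb_a = echo_embed("A cat sat on the mat.")
emb_b = echo_embed("A feline rested on the rug.")
print(float(torch.nn.functional.cosine_similarity(emb_a, emb_b, dim=0)))
```

Pooling only over the echoed copy is the key design choice: every pooled token has already seen the entire input through the first copy, sidestepping the causal-mask restriction without modifying the model.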