Learned in Translation: Contextualized Word Vectors

20 Jun 2018 | Bryan McCann, James Bradbury, Caiming Xiong, Richard Socher
This paper introduces a method to contextualize word vectors using the deep LSTM encoder of an attentional sequence-to-sequence model trained for machine translation (MT). The authors hypothesize that MT data can serve as a valuable resource for transfer learning in natural language processing (NLP), much as ImageNet has in computer vision. They train an LSTM encoder (the MT-LSTM) on MT datasets and use the context vectors (CoVe) it produces to improve performance on a range of NLP tasks, including sentiment analysis, question classification, entailment, and question answering. The experiments show that appending CoVe to word and character vectors significantly improves model performance, in some cases reaching state-of-the-art levels. The paper also reports a positive correlation between the amount of MT training data and downstream task performance, suggesting that MT-LSTMs trained on more data carry more useful information.
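The transfer recipe described above — running pretrained word embeddings through the MT encoder and concatenating the resulting context vectors with the original embeddings — can be sketched in PyTorch. This is a minimal illustration, not the authors' released code: the two-layer bidirectional LSTM below is randomly initialized as a stand-in for the pretrained MT-LSTM, and the class name and random inputs are hypothetical. The dimensions (300-d GloVe inputs, 300 hidden units per direction) follow the paper's setup.

```python
import torch
import torch.nn as nn

class MTLSTMEncoder(nn.Module):
    """Stand-in for the pretrained MT-LSTM encoder (randomly initialized here)."""

    def __init__(self, embed_dim=300, hidden_dim=300, num_layers=2):
        super().__init__()
        # Two-layer bidirectional LSTM, as in the paper's MT encoder.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                            bidirectional=True, batch_first=True)

    def forward(self, glove_embeddings):
        # CoVe(w) = MT-LSTM(GloVe(w)): the encoder's hidden states
        # serve as the context vectors.
        cove, _ = self.lstm(glove_embeddings)
        return cove

batch, seq_len, embed_dim = 4, 10, 300
glove = torch.randn(batch, seq_len, embed_dim)   # stand-in for GloVe(w)
encoder = MTLSTMEncoder()
cove = encoder(glove)                            # shape (4, 10, 600)

# Downstream models consume the concatenation [GloVe(w); CoVe(w)].
features = torch.cat([glove, cove], dim=-1)      # shape (4, 10, 900)
```

The key design point is that CoVe is not used in isolation: each token's input to the task-specific model is the concatenation of its fixed GloVe vector and its context-dependent CoVe vector, so downstream models retain the distributional signal while gaining contextual information.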