August 20-26, 2018 | Alan Akbik, Duncan Blythe, Roland Vollgraf
This paper introduces *contextual string embeddings*, a novel type of word embedding derived from the internal states of a trained character-level language model. Unlike traditional word embeddings, they are produced without any explicit notion of words, fundamentally modeling words as sequences of characters, and they are contextualized: the same word receives different embeddings depending on its surrounding text. The underlying language models are trained on large unlabeled corpora. The authors evaluate their approach on four classic sequence labeling tasks: named entity recognition (NER) in English and German, chunking, and part-of-speech (PoS) tagging. They find that their embeddings consistently outperform previous state-of-the-art methods, achieving new best F1-scores on the CoNLL-2003 NER task for both English and German. The paper also discusses the benefits of combining different types of embeddings and provides a detailed experimental setup and results. Additionally, the authors release all code and pre-trained models to the research community to facilitate reproducibility and further exploration of their proposed embeddings.
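To make the core idea concrete, here is a minimal PyTorch sketch of how such embeddings are assembled: a forward and a backward character-level LSTM run over the raw character stream, and each word's embedding is the concatenation of the forward hidden state after the word's last character with the backward hidden state at the word's first character. This is an illustrative toy, not the authors' released implementation (that is the Flair framework): the LSTMs here are untrained with random weights, and the names `char2idx` and `contextual_string_embeddings` are hypothetical, so the output shows only the shape and wiring of the technique.

```python
import torch
import torch.nn as nn

# Toy character vocabulary; the real models use much larger alphabets
# and language models pre-trained on large unlabeled corpora.
chars = "abcdefghijklmnopqrstuvwxyz "
char2idx = {c: i for i, c in enumerate(chars)}

char_dim, hidden_dim = 16, 64
embed = nn.Embedding(len(chars), char_dim)
forward_lm = nn.LSTM(char_dim, hidden_dim, batch_first=True)
backward_lm = nn.LSTM(char_dim, hidden_dim, batch_first=True)

def contextual_string_embeddings(sentence: str) -> torch.Tensor:
    """Return one contextual embedding per whitespace-separated word."""
    ids = torch.tensor([[char2idx[c] for c in sentence]])
    x = embed(ids)
    # Forward LM reads left-to-right; backward LM reads the reversed sentence.
    h_fwd, _ = forward_lm(x)                  # (1, T, H), T = number of chars
    h_bwd, _ = backward_lm(x.flip(dims=[1]))  # states over reversed characters
    h_bwd = h_bwd.flip(dims=[1])              # re-align to the original order
    vectors = []
    start = 0
    for word in sentence.split(" "):
        end = start + len(word) - 1           # index of the word's last char
        # Forward state after the last character carries the left context;
        # backward state at the first character carries the right context.
        vectors.append(torch.cat([h_fwd[0, end], h_bwd[0, start]], dim=-1))
        start = end + 2                       # skip the following space
    return torch.stack(vectors)               # (num_words, 2 * H)

emb = contextual_string_embeddings("washington is a place")
print(emb.shape)  # torch.Size([4, 128])
```

Because the embedding of "washington" is read off hidden states that have consumed the entire sentence, a polysemous word naturally receives different vectors in different sentences, which is what distinguishes these embeddings from static lookup-table embeddings.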