Named Entity Recognition with Bidirectional LSTM-CNNs

2016 | Jason P.C. Chiu, Eric Nichols
This paper presents a neural network architecture for Named Entity Recognition (NER) that combines a bidirectional LSTM with a convolutional neural network (CNN) to detect word- and character-level features automatically, eliminating the need for most feature engineering. A CNN extracts character-level features for each word; these are combined with word-level representations and fed to a bidirectional LSTM, which handles variable-length input and captures long-range dependencies. The model also draws on two publicly available lexicons through a new encoding scheme that permits partial matches, outperforming the simpler lexicon features of Collobert et al. (2011b). Training uses mini-batch stochastic gradient descent with dropout to reduce overfitting.

On CoNLL-2003, the model achieves an F1 score of 91.62, surpassing previous results by 2.13 F1 points; on OntoNotes 5.0, it achieves 86.28, outperforming systems that rely on heavy feature engineering, proprietary lexicons, and rich entity-linking information. The model is thus competitive on CoNLL-2003 and establishes a new state of the art on OntoNotes 5.0, demonstrating that neural networks can learn complex relationships, and the features relevant for NER, from large amounts of data without extensive feature engineering.
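The character-level CNN described above can be illustrated with a minimal NumPy sketch: each character is mapped to an embedding, a window of embeddings is convolved with a filter bank, and max-over-time pooling produces a fixed-size feature vector per word. All sizes and names here (`CHAR_DIM`, `N_FILTERS`, `WIN`, the hash-based character indexing) are illustrative assumptions, not the paper's actual hyperparameters, and the real model learns these weights jointly with the rest of the network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not the paper's hyperparameters.
CHAR_VOCAB = 64   # size of the character inventory
CHAR_DIM = 8      # character embedding dimension
N_FILTERS = 16    # number of convolution filters
WIN = 3           # convolution window width (characters per window)

char_emb = rng.normal(size=(CHAR_VOCAB, CHAR_DIM))          # lookup table
conv_w = rng.normal(size=(N_FILTERS, WIN * CHAR_DIM))       # filter bank

def char_features(word: str) -> np.ndarray:
    """Character CNN feature: convolve over character embeddings,
    then max-pool over time to get a fixed-size vector."""
    ids = [ord(c) % CHAR_VOCAB for c in word]  # toy char indexing
    while len(ids) < WIN:                      # pad short words
        ids.append(0)
    emb = char_emb[ids]                        # (len, CHAR_DIM)
    windows = np.stack([emb[i:i + WIN].ravel()
                        for i in range(len(ids) - WIN + 1)])  # (T, WIN*CHAR_DIM)
    conv = windows @ conv_w.T                  # (T, N_FILTERS)
    return conv.max(axis=0)                    # max-over-time -> (N_FILTERS,)

feat = char_features("Obama")
print(feat.shape)  # (16,)
```

In the full model, this per-word vector is concatenated with the word embedding (and any additional per-word features) before entering the bidirectional LSTM.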
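The lexicon features can be sketched as a BIOES tagging over lexicon matches: for each category, token spans that match lexicon entries are marked Begin/Inside/End (or Single for one-token matches), and everything else is Outside. The sketch below handles only exact longest-span matches; the paper's scheme additionally credits partial (e.g. prefix) matches of sufficient length, which is omitted here for brevity.

```python
def bioes_lexicon_tags(tokens, lexicon):
    """Tag tokens with B/I/E/S/O by greedy longest-match against a lexicon.

    `lexicon` is a set of tuples of lowercased tokens (one entity category).
    Simplified sketch: exact matches only, no partial-match credit.
    """
    low = [t.lower() for t in tokens]
    n = len(tokens)
    tags = ["O"] * n
    i = 0
    while i < n:
        # Find the longest lexicon entry matching at position i.
        best = 0
        for j in range(n, i, -1):
            if tuple(low[i:j]) in lexicon:
                best = j - i
                break
        if best == 0:
            i += 1                       # no match: leave "O"
        elif best == 1:
            tags[i] = "S"                # single-token match
            i += 1
        else:
            tags[i] = "B"                # multi-token match
            for k in range(i + 1, i + best - 1):
                tags[k] = "I"
            tags[i + best - 1] = "E"
            i += best
    return tags

places = {("new", "york"), ("new", "york", "city"), ("london",)}
print(bioes_lexicon_tags("I love New York City and London".split(), places))
# ['O', 'O', 'B', 'I', 'E', 'O', 'S']
```

In the model, one such BIOES annotation per lexicon category is encoded as an additional per-token input feature alongside the word and character features.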