This paper analyzes the design challenges and misconceptions in Named Entity Recognition (NER). It discusses issues such as text chunk representation, inference approaches for combining local NER decisions, sources of prior knowledge, and how to use them in NER systems. The authors compare several solutions to these challenges and reach surprising conclusions. They develop an NER system that achieves a 90.8 F1 score on the CoNLL-2003 NER shared task, the best reported result for this dataset.
The paper highlights the importance of external knowledge resources and non-local features in NER. It finds that the BILOU representation of text chunks significantly outperforms the widely adopted BIO representation. Surprisingly, naive greedy inference performs comparably to beam search or Viterbi, while being considerably more computationally efficient. The paper also analyzes several approaches for modeling non-local dependencies and finds that none of them clearly outperforms the others across several datasets. However, these approaches are independent and can be combined to yield better results.
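To make the difference between the two chunk representations concrete, here is a minimal sketch that converts BIO tags to BILOU (Begin, Inside, Last, Outside, Unit-length). It is an illustration, not the authors' code: the function name `bio_to_bilou` is hypothetical, and it assumes the input uses the BIO variant in which every chunk starts with a `B-` tag.

```python
from typing import List


def bio_to_bilou(tags: List[str]) -> List[str]:
    """Convert BIO tags to BILOU tags.

    Assumption: every chunk in the input starts with a B- tag.
    """
    bilou = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append("O")
            continue
        prefix, label = tag.split("-", 1)
        # Look one tag ahead; past the end of the sentence counts as "O".
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The chunk continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            # A chunk that does not continue is a single-token (Unit) chunk.
            bilou.append(f"B-{label}" if continues else f"U-{label}")
        else:
            # An inside token that does not continue is the Last token.
            bilou.append(f"I-{label}" if continues else f"L-{label}")
    return bilou


example = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG", "I-ORG"]
converted = bio_to_bilou(example)
```

BILOU marks chunk ends and single-token chunks explicitly, so the classifier receives boundary information that BIO leaves implicit.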
The paper also discusses the use of unlabeled text and gazetteers in NER. It finds that word class models learned from unlabeled text significantly improve system performance and offer an alternative to the traditional semi-supervised learning paradigm. The authors also show that gazetteers improve performance, and that combining word class models with non-local features improves it further.
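As a rough illustration of how these two knowledge sources might be turned into per-token features, consider the sketch below. Everything in it is an assumption made for the example: the function name `token_features`, the feature string formats, the lowercase phrase matching, the representation of word classes as bit strings (as in hierarchical clustering schemes), and the prefix lengths. It is not the feature set used in the paper.

```python
from typing import Dict, List, Sequence, Set, Tuple


def token_features(
    tokens: List[str],
    gazetteer: Dict[str, Set[Tuple[str, ...]]],
    clusters: Dict[str, str],
    prefix_lengths: Sequence[int] = (4, 6),  # illustrative values only
) -> List[Set[str]]:
    """Build gazetteer-match and word-class features for each token."""
    feats: List[Set[str]] = [set() for _ in tokens]
    lowered = [t.lower() for t in tokens]

    # Gazetteer features: mark every token covered by a matching phrase.
    max_len = max(
        (len(p) for phrases in gazetteer.values() for p in phrases), default=0
    )
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            span = tuple(lowered[start:end])
            for name, phrases in gazetteer.items():
                if span in phrases:
                    for i in range(start, end):
                        feats[i].add(f"gaz={name}")

    # Word-class features: prefixes of the word's cluster bit string,
    # giving coarse-to-fine class membership.
    for i, word in enumerate(lowered):
        bits = clusters.get(word)
        if bits is not None:
            for n in prefix_lengths:
                feats[i].add(f"cluster{n}={bits[:n]}")
    return feats


# Toy resources, invented for this example.
toy_gazetteer = {"LOC": {("new", "york")}}
toy_clusters = {"york": "110100", "today": "0111"}
features = token_features(
    ["Visit", "New", "York", "today"], toy_gazetteer, toy_clusters
)
```

Both feature types generalize beyond the labeled training data: a word never seen with an entity label can still share a gazetteer entry or a cluster prefix with words that were.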
The paper concludes that NER is a knowledge-intensive task and that knowledge-driven techniques adapt well across several domains. The authors' system significantly outperforms the current state of the art and is available to download under a research license.