End-to-end Neural Coreference Resolution

15 Dec 2017 | Kenton Lee†, Luheng He†, Mike Lewis‡, and Luke Zettlemoyer†*
This paper introduces the first end-to-end coreference resolution model, which significantly outperforms previous methods without using a syntactic parser or a hand-engineered mention detector. The model considers all spans in a document as potential mentions and learns a distribution over possible antecedents for each. A bidirectional LSTM encodes lexical information in context, and each span embedding combines the context-dependent representations of the span's boundary words with a head-finding attention mechanism over the words inside the span.

The coreference score is factored into unary mention scores and pairwise antecedent scores, which keeps computation efficient and allows aggressive pruning of candidate spans. The model is trained end to end, with no external resources, to maximize the marginal likelihood of antecedent spans that are consistent with the gold coreference clusters.
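The span representation can be sketched roughly as follows. This is a minimal illustration, assuming PyTorch; the class name, dimensions, and hyperparameters are invented for the example, and the paper's richer inputs (character CNNs, multiple pretrained embeddings, additional span features) are reduced to a single span-width embedding.

```python
# Sketch of the span representation: BiLSTM boundary states plus a
# head-finding attention over the words inside each span. Illustrative only;
# names and sizes are assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class SpanRepresentation(nn.Module):
    def __init__(self, word_dim=300, hidden_dim=200, width_dim=20, max_width=10):
        super().__init__()
        # Bidirectional LSTM gives a context-dependent vector for every word.
        self.lstm = nn.LSTM(word_dim, hidden_dim, bidirectional=True, batch_first=True)
        # One scalar attention logit per word, used to find a soft head.
        self.head_attn = nn.Linear(2 * hidden_dim, 1)
        # Learned embedding of the span width.
        self.width_embed = nn.Embedding(max_width, width_dim)

    def forward(self, word_embs, span_starts, span_ends):
        """word_embs: (seq_len, word_dim); span_starts, span_ends: lists of indices."""
        x, _ = self.lstm(word_embs.unsqueeze(0))      # (1, seq_len, 2 * hidden_dim)
        x = x.squeeze(0)
        attn_logits = self.head_attn(x).squeeze(-1)   # (seq_len,)

        span_vecs = []
        for start, end in zip(span_starts, span_ends):
            inside = x[start:end + 1]                                   # (width, 2h)
            # Head-finding attention: a soft, learned head word for the span.
            weights = torch.softmax(attn_logits[start:end + 1], dim=0)
            head = weights @ inside                                     # (2h,)
            width_idx = min(end - start, self.width_embed.num_embeddings - 1)
            width = self.width_embed(torch.tensor(width_idx))
            # Boundary states + attended head + width feature.
            span_vecs.append(torch.cat([x[start], x[end], head, width]))
        return torch.stack(span_vecs)                                   # (num_spans, span_dim)
```

The attention weights act as a soft head finder, which is what later lets the model show which mention-internal words drive a coreference decision.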
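The factored scoring and the marginal-likelihood objective can be sketched in the same spirit. Again this is an assumption-laden illustration: the feed-forward scorers, the dummy "no antecedent" column, and the loss are simplified, and the sketch scores every pair of spans rather than first pruning candidates by mention score as the full model does.

```python
# Sketch of the factored coreference score s(i, j) = s_m(i) + s_m(j) + s_a(i, j)
# and the marginal log-likelihood over gold antecedents. Illustrative only.
import torch
import torch.nn as nn


class CoreferenceScorer(nn.Module):
    def __init__(self, span_dim, hidden_dim=150):
        super().__init__()
        # Unary mention score: how likely a span is to be a mention at all.
        self.mention_ffnn = nn.Sequential(
            nn.Linear(span_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1))
        # Pairwise antecedent score over [span_i, span_j, span_i * span_j].
        self.antecedent_ffnn = nn.Sequential(
            nn.Linear(3 * span_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1))

    def forward(self, spans):
        """spans: (n, span_dim), in document order. Returns (n, 1 + n) scores,
        where column 0 is a dummy "no antecedent" option with fixed score 0."""
        n = spans.size(0)
        s_m = self.mention_ffnn(spans).squeeze(-1)                # (n,)
        pair = torch.cat([
            spans.unsqueeze(1).expand(n, n, -1),
            spans.unsqueeze(0).expand(n, n, -1),
            spans.unsqueeze(1) * spans.unsqueeze(0),
        ], dim=-1)
        s_a = self.antecedent_ffnn(pair).squeeze(-1)              # (n, n)
        scores = s_m.unsqueeze(1) + s_m.unsqueeze(0) + s_a
        # Only earlier spans can be antecedents: mask out j >= i.
        allowed = torch.tril(torch.ones(n, n), diagonal=-1).bool()
        scores = scores.masked_fill(~allowed, float("-inf"))
        dummy = torch.zeros(n, 1)
        return torch.cat([dummy, scores], dim=1)


def marginal_nll(scores, gold_mask):
    """scores: (n, 1 + n). gold_mask: boolean (n, 1 + n) marking, for every span,
    all gold antecedents, or column 0 if the span starts a cluster or is not a
    mention. Returns the negative marginal log-likelihood."""
    log_probs = torch.log_softmax(scores, dim=1)
    gold_log_probs = log_probs.masked_fill(~gold_mask, float("-inf"))
    # Marginalize over all correct antecedents for each span, then sum over spans.
    return -torch.logsumexp(gold_log_probs, dim=1).sum()
```

Because the pairwise term is quadratic in the number of candidate spans, the unary mention score is what makes aggressive pruning practical: low-scoring spans can be discarded before any antecedent scores are computed.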
On the OntoNotes benchmark, the model achieves state-of-the-art results, outperforming previous systems on all metrics with a 1.5 F1 improvement for a single model and a 3.1 F1 improvement for a 5-model ensemble, while recovering gold mentions with high recall. Ablation studies attribute the gains to improvements in both mention scoring and coreference decisions and show the contribution of the individual components. The factored scores also make the model interpretable: they indicate whether a missing coreference link is due to a low mention score or a low score from the mention-ranking component, and the head-finding attention reveals which mention-internal words contribute most to each decision, even though it is learned without explicit supervision of syntactic heads.

The model's main weaknesses are false-positive links and coreference decisions that require world knowledge. Further improvements will likely require word or span representations that better distinguish equivalence, entailment, and alternation.