End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures

8 Jun 2016 | Makoto Miwa, Mohit Bansal
This paper presents a novel end-to-end neural model for extracting entities and the relations between them using bidirectional sequential and bidirectional tree-structured LSTM-RNNs. The model captures both word-sequence and dependency-tree substructure information by stacking bidirectional tree-structured LSTM-RNNs on top of bidirectional sequential LSTM-RNNs, which lets it represent entities and relations jointly with shared parameters in a single model.

The model is designed to extract relations between entities from both word-sequence and dependency-tree structures. Bidirectional sequential (left-to-right and right-to-left) and bidirectional tree-structured (bottom-up and top-down) LSTM-RNNs jointly capture linear and dependency context for end-to-end relation extraction. The model first detects entities and then extracts relations between the detected entities using a single incrementally decoded neural network, with the network parameters jointly updated from both entity and relation labels.

Two enhancements are incorporated into training: entity pretraining, which pretrains the entity model, and scheduled sampling, which replaces (unreliable) predicted entity labels with gold labels with a certain probability. These enhancements alleviate the problem of low-accuracy entity detection in the early stages of training and allow entity information to further help downstream relation classification.
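To make the scheduled-sampling idea concrete, here is a minimal sketch of an entity tagger that feeds the previous entity label back into the next prediction, using the gold label with some probability during training. This is PyTorch-style illustration code: the module names, dimensions, and the single linear classifier are assumptions for readability, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ScheduledSamplingTagger(nn.Module):
    """Sketch of an entity tagger with scheduled sampling.

    At each step the previous label fed to the classifier is the gold
    label with probability `gold_prob`, otherwise the model's own
    (possibly unreliable) prediction. Names and sizes are illustrative.
    """

    def __init__(self, hidden_dim=128, label_dim=25, num_labels=9):
        super().__init__()
        self.label_embed = nn.Embedding(num_labels, label_dim)
        self.classifier = nn.Linear(hidden_dim + label_dim, num_labels)

    def forward(self, seq_states, gold_labels, gold_prob=0.5):
        # seq_states: (seq_len, hidden_dim) outputs of the sequence BiLSTM
        # gold_labels: (seq_len,) gold entity tag ids
        prev_label = torch.zeros(1, dtype=torch.long)  # start / "O" label
        logits = []
        for t in range(seq_states.size(0)):
            inp = torch.cat([seq_states[t], self.label_embed(prev_label)[0]])
            step_logits = self.classifier(inp)
            logits.append(step_logits)
            pred = step_logits.argmax().unsqueeze(0)
            # Scheduled sampling: feed the gold label with probability
            # gold_prob, otherwise feed the predicted label.
            use_gold = torch.rand(1).item() < gold_prob
            prev_label = gold_labels[t].unsqueeze(0) if use_gold else pred
        return torch.stack(logits)
```

At evaluation time the gold probability is set to zero, so the tagger relies entirely on its own predictions, matching the test-time decoding setup.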
The model improves over the state-of-the-art feature-based model on end-to-end relation extraction, achieving 12.1% (ACE2005) and 5.7% (ACE2004) relative error reductions in F1-score, and it compares favorably to the state-of-the-art CNN-based model on nominal relation classification (SemEval-2010 Task 8). The paper also presents an extensive ablation analysis, with key findings about the contribution and effectiveness of different RNN structures, input dependency relation structures, parsing models, external resources, and joint learning settings.

Architecturally, the model uses a sequence layer and a dependency layer. The sequence layer represents words in a linear sequence using the representations from the embedding layer. The dependency layer represents the relation between a pair of target words in the dependency tree and is in charge of relation-specific representations: bidirectional tree-structured LSTM-RNNs represent a relation candidate by capturing the dependency structure around the target word pair.
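The dependency layer builds on tree-structured LSTM units of the Child-Sum kind (Tai et al., 2015). Below is a minimal sketch of such a cell, applied bottom-up over a dependency subtree; the input vector for each node would come from the sequence layer (e.g. its BiLSTM hidden state, possibly concatenated with a dependency-type embedding). The dimensions and the exact gating layout here are a simplified assumption, not a reproduction of the paper's implementation.

```python
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    """Minimal Child-Sum Tree-LSTM cell, the kind of unit the
    dependency layer builds on. Sizes are illustrative."""

    def __init__(self, in_dim, mem_dim):
        super().__init__()
        self.iou = nn.Linear(in_dim, 3 * mem_dim)            # input/output/update
        self.iou_h = nn.Linear(mem_dim, 3 * mem_dim, bias=False)
        self.f = nn.Linear(in_dim, mem_dim)                   # forget gate
        self.f_h = nn.Linear(mem_dim, mem_dim, bias=False)    # per-child term

    def forward(self, x, child_h, child_c):
        # x: (in_dim,) input for this node (e.g. sequence-layer state)
        # child_h, child_c: (num_children, mem_dim) child states
        h_sum = child_h.sum(dim=0)
        i, o, u = torch.chunk(self.iou(x) + self.iou_h(h_sum), 3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        # One forget gate per child, then sum the gated child memories.
        f = torch.sigmoid(self.f(x).unsqueeze(0) + self.f_h(child_h))
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c
```

In the paper, the tree-structured LSTM is run both bottom-up and top-down, mainly over the shortest-path subtree between the two target words, and the relation candidate is represented by combining the bottom-up state at the lowest common ancestor with the top-down states at the two target words before classification.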