This paper presents a method for relation extraction using dependency tree kernels. The authors extend previous work on tree kernels to estimate the similarity between dependency trees of sentences. They use this kernel within a Support Vector Machine (SVM) to detect and classify relations between entities in the Automatic Content Extraction (ACE) corpus of news articles. They examine the utility of different features such as WordNet hypernyms, parts of speech, and entity types, and find that the dependency tree kernel achieves a 20% F1 improvement over a "bag-of-words" kernel.
The goal of Information Extraction (IE) is to discover relevant segments of information in a data stream that will be useful for structuring the data. In the case of text, this usually amounts to finding mentions of interesting entities and the relations that join them, transforming a large corpus of unstructured text into a relational database with entries such as those in Table 1.
The authors describe a relation extraction technique based on kernel methods. Kernel methods are non-parametric density estimation techniques that compute a kernel function between data instances, where a kernel function can be thought of as a similarity measure. Given a set of labeled instances, kernel methods determine the label of a novel instance by comparing it to the labeled training instances using this kernel function. Nearest neighbor classification and support-vector machines (SVMs) are two popular examples of kernel methods.
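The idea of a kernel as a similarity measure driving classification can be sketched with a toy example. The kernel below is a simple bag-of-words similarity (a dot product of word-count vectors), and the classifier is 1-nearest-neighbor; all names and the tiny training set are illustrative, not from the paper:

```python
from collections import Counter

def bow_kernel(s1, s2):
    """Toy bag-of-words kernel: dot product of word-count vectors."""
    c1, c2 = Counter(s1.lower().split()), Counter(s2.lower().split())
    return sum(c1[w] * c2[w] for w in c1)

def nn_classify(instance, labeled, kernel):
    """1-nearest-neighbor: return the label of the most similar training instance."""
    return max(labeled, key=lambda pair: kernel(instance, pair[0]))[1]

train = [("the CEO of Acme founded the firm", "ROLE"),
         ("the plant is located in Ohio", "AT")]
print(nn_classify("the chairman of Acme runs the firm", train, bow_kernel))  # → ROLE
```

The novel sentence shares more words with the first training sentence than the second, so the kernel-based nearest-neighbor rule assigns it the ROLE label.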
An advantage of kernel methods is that they can search a feature space much larger than could be represented by a feature extraction-based approach. This is possible because the kernel function can explore an implicit feature space when calculating the similarity between two instances.
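The standard worked example of this "implicit feature space" is the quadratic kernel: computing (x · y)² in the original space gives exactly the dot product in the much larger space of all degree-2 monomials, without ever materializing that space. A minimal numeric check (not tied to the paper's kernel):

```python
import itertools

def quad_kernel(x, y):
    """Quadratic kernel (x . y)^2, computed entirely in the original space."""
    return sum(a * b for a, b in zip(x, y)) ** 2

def phi(x):
    """Explicit degree-2 feature map: all pairwise products x_i * x_j."""
    return [a * b for a, b in itertools.product(x, x)]

x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))  # dot product in expanded space
assert abs(quad_kernel(x, y) - explicit) < 1e-9       # same value either way
```

For d-dimensional inputs the explicit map has d² coordinates, while the kernel stays O(d); for structured inputs such as trees the implicit space can be exponentially large, which is what makes the kernel approach attractive here.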
Working in such a large feature space can lead to over-fitting in many machine learning algorithms. To address this problem, the authors apply SVMs to the task of relation extraction. SVMs find a boundary between instances of different classes such that the distance between the boundary and the nearest instances is maximized. This characteristic, in addition to empirical validation, indicates that SVMs are particularly robust to over-fitting.
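The way a kernel plugs into an SVM can be illustrated with scikit-learn's precomputed-kernel interface, which accepts a Gram matrix of pairwise kernel values rather than raw feature vectors. This is a stand-in for the authors' SVM setup, not their actual toolchain; the linearly separable toy data is invented for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D instances from two linearly separable classes.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [4.0, 3.0]])
y = np.array([0, 0, 1, 1])

gram = X @ X.T                       # linear kernel, supplied as a Gram matrix
clf = SVC(kernel="precomputed", C=1.0)
clf.fit(gram, y)

X_test = np.array([[0.5, 0.5], [3.5, 3.5]])
pred = clf.predict(X_test @ X.T)     # kernel values between test and training instances
print(pred)  # expected [0 1] for this separable toy data
```

Because the SVM only ever consults kernel values, any similarity function over any structured object, including the dependency trees used in this paper, can be dropped in the same way, provided it is a valid (positive semi-definite) kernel.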
The authors are interested in detecting and classifying instances of relations, where a relation is some meaningful connection between two entities. They represent each relation instance as an augmented dependency tree. A dependency tree represents the grammatical dependencies in a sentence; they augment this tree with features for each node. The task of the kernel function is then to estimate the similarity between two such augmented trees.
They define a tree kernel over dependency trees and incorporate this kernel within an SVM to extract relations from newswire documents. The tree kernel approach consistently outperforms the bag-of-words kernel, suggesting that this highly-structured representation of sentences is more informative for detecting and distinguishing relations.
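A drastically simplified sketch of such a tree kernel is shown below. Each node carries augmented features (word, part of speech, entity type), node similarity counts shared feature values, and tree similarity recurses over children with a decay factor. This is only loosely modeled on the paper's kernel, which additionally separates a hard matching function from a soft similarity function and sums over subsequences of children rather than comparing them pairwise; the example trees and feature names are invented:

```python
def node_sim(a, b):
    """Count shared feature values (word, POS, entity type) between two nodes."""
    return sum(1 for f in ("word", "pos", "entity")
               if f in a and f in b and a[f] == b[f])

def tree_kernel(a, b, decay=0.5):
    """Similarity of two trees: node similarity plus decayed child similarities.
    Children are compared pairwise here; the paper matches child subsequences."""
    if node_sim(a, b) == 0:          # mismatched roots contribute nothing
        return 0.0
    score = float(node_sim(a, b))
    for ca in a.get("children", []):
        for cb in b.get("children", []):
            score += decay * tree_kernel(ca, cb, decay)
    return score

t1 = {"word": "heads", "pos": "VBZ",
      "children": [{"word": "Smith", "pos": "NNP", "entity": "PERSON"},
                   {"word": "Acme", "pos": "NNP", "entity": "ORG"}]}
t2 = {"word": "heads", "pos": "VBZ",
      "children": [{"word": "Jones", "pos": "NNP", "entity": "PERSON"},
                   {"word": "Acme", "pos": "NNP", "entity": "ORG"}]}
print(tree_kernel(t1, t2))  # → 5.5
```

Even though the subject words differ (Smith vs. Jones), the trees score high because their structure, parts of speech, and entity types align, which is exactly the kind of generalization beyond surface words that the bag-of-words kernel cannot provide.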