Understanding Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks

This paper proposes Piecewise Convolutional Neural Networks (PCNNs) with multi-instance learning to address two problems in distantly supervised relation extraction. First, heuristically aligning an existing knowledge base to text can produce incorrect labels. Second, traditional approaches rely on hand-engineered features, so errors made during feature extraction propagate into the relation classifier and degrade performance. To tackle the first problem, the paper treats distantly supervised relation extraction as a multi-instance problem that accounts for the uncertainty of instance labels: training examples are grouped into bags, the label of each bag is known, and the labels of the individual instances in it are not. Learning at the bag level alleviates the wrong label problem. For the second problem, the paper introduces a convolutional architecture with piecewise max pooling that learns relevant features automatically, without complex NLP preprocessing. Instead of applying a single max pooling over the whole sentence, the model divides the convolution output into segments according to the positions of the two entities and takes the maximum of each segment, which preserves structural information between the entities that is crucial for accurate relation extraction. The proposed method is evaluated on a benchmark dataset and compared with several traditional approaches. The results show that PCNNs with multi-instance learning outperform these methods in terms of precision and recall. Automatic feature learning without manual feature engineering, combined with effective handling of the wrong label problem, makes the model a promising approach for distantly supervised relation extraction.
The paper also demonstrates that the piecewise max pooling technique is beneficial for capturing structural information and that incorporating multi-instance learning into a convolutional neural network is effective in addressing the wrong label problem.
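
The two mechanisms summarized above can be illustrated with a small sketch in plain Python. It is not the authors' implementation. The function names (`piecewise_max_pool`, `select_bag_instance`, `bag_loss`) are hypothetical, and so are the segment boundary convention and the fallback value of 0.0 for an empty segment. Training each bag on its single most confident instance is one common way to realize multi-instance learning and is assumed here; the summary does not spell out this detail.

```python
import math
from typing import List, Sequence


def piecewise_max_pool(feature_map: Sequence[float], e1_pos: int, e2_pos: int) -> List[float]:
    """Max-pool one filter's convolution output over three segments.

    Assumed convention: segment 1 runs up to and including the first entity,
    segment 2 runs from just after the first entity up to and including the
    second entity, and segment 3 is the remainder of the sentence.
    """
    if not 0 <= e1_pos < e2_pos < len(feature_map):
        raise ValueError("expected 0 <= e1_pos < e2_pos < len(feature_map)")
    segments = [
        feature_map[: e1_pos + 1],
        feature_map[e1_pos + 1 : e2_pos + 1],
        feature_map[e2_pos + 1 :],
    ]
    # Assumption: an empty segment (second entity at the sentence end) pools to 0.0.
    return [max(segment, default=0.0) for segment in segments]


def select_bag_instance(instance_probs: Sequence[Sequence[float]], bag_label: int) -> int:
    """Return the index of the instance most confident in the bag's label."""
    return max(range(len(instance_probs)), key=lambda j: instance_probs[j][bag_label])


def bag_loss(instance_probs: Sequence[Sequence[float]], bag_label: int) -> float:
    """Negative log-likelihood of the bag label under the selected instance only."""
    j = select_bag_instance(instance_probs, bag_label)
    return -math.log(instance_probs[j][bag_label])


if __name__ == "__main__":
    # One filter's output over a six-token sentence; entities at positions 1 and 3.
    conv_out = [0.1, 0.7, 0.3, 0.9, 0.2, 0.4]
    print(piecewise_max_pool(conv_out, e1_pos=1, e2_pos=3))  # [0.7, 0.9, 0.4]

    # A bag of three sentences mentioning the same entity pair, two relation classes.
    bag = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
    print(select_bag_instance(bag, bag_label=1))  # 1
    print(bag_loss(bag, bag_label=1))
```

A single max pooling would reduce `conv_out` to one value (0.9) and discard where it occurred relative to the entities, while the piecewise version keeps one value per segment. In the bag example, only the second sentence contributes to the loss, so sentences that mention the entity pair without expressing the relation do not push the model toward the wrong label.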

Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks

21 September 2015 | Daojian Zeng, Kang Liu, Yubo Chen and Jun Zhao