June 19-24, 2011 | Raphael Hoffmann, Congle Zhang, Xiao Ling, Luke Zettlemoyer, Daniel S. Weld
This paper presents a novel approach to multi-instance learning with overlapping relations, combining a sentence-level extraction model with a corpus-level component that aggregates individual facts. The authors introduce MULTIR, a probabilistic graphical model that handles overlapping relations and produces accurate sentence-level predictions. The model is computationally efficient, reducing inference to a weighted edge-cover problem, and achieves significant improvements in accuracy at both the aggregate (corpus) and sentence levels. Experiments pairing New York Times text with Freebase facts show that MULTIR outperforms previous approaches, demonstrating its effectiveness at learning from noisy, distantly supervised training data and at extracting overlapping relations. The paper also discusses the advantages of sentence-level features and the impact of overlapping relations on extraction performance.
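The edge-cover reduction is the paper's key efficiency claim: each sentence mentioning an entity pair gets exactly one relation label, while every known fact about the pair must be expressed by at least one sentence (the "at least once" semantics of the deterministic-OR aggregation). The sketch below is a rough illustration of a greedy variant of this constrained inference, not the authors' implementation; the function name `infer_sentence_labels`, the dense score matrix, and the pinning heuristic are all assumptions made for the example.

```python
import numpy as np

def infer_sentence_labels(scores, observed_facts):
    """Greedy sketch of MULTIR-style constrained inference.

    scores: (n_sentences, n_relations) float array of per-sentence
            relation scores; a NONE relation can simply be one column.
    observed_facts: set of relation indices (known facts) that must each
            be assigned to at least one sentence.

    Returns one relation label per sentence such that every observed
    fact is covered. Assumes there are at least as many sentences as
    uncovered facts.
    """
    n_sentences, _ = scores.shape
    # Start from each sentence's highest-scoring label.
    labels = scores.argmax(axis=1)
    covered = {int(l) for l in labels} & observed_facts
    pinned = set()  # sentences already repurposed to cover a fact
    for r in sorted(observed_facts - covered):
        # Cost of relabeling each sentence from its current label to r.
        losses = scores[np.arange(n_sentences), labels] - scores[:, r]
        for i in pinned:
            losses[i] = np.inf  # don't steal a sentence pinned to another fact
        best = int(losses.argmin())
        labels[best] = r
        pinned.add(best)
    return labels

# Toy example: 3 sentences, 4 candidate relations, 2 known facts.
rng = np.random.default_rng(0)
scores = rng.normal(size=(3, 4))
print(infer_sentence_labels(scores, observed_facts={1, 3}))
```

Starting from each sentence's argmax label and then repairing uncovered facts at minimum score loss mirrors the covering constraint; the paper's actual inference procedure over the edge-cover formulation differs in its details.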