Understanding LexRank: Graph-based Lexical Centrality as Salience in Text Summarization

This paper introduces LexRank, a graph-based method for computing sentence importance in text summarization. Extractive text summarization (TS) relies on sentence salience to identify the most important sentences, and salience is typically defined in terms of the presence of important words or similarity to a centroid pseudo-sentence. LexRank instead scores each sentence by its eigenvector centrality in a graph representation of sentences, where a connectivity matrix based on the cosine similarity between sentences is used as the adjacency matrix.

The results show that degree-based methods, including LexRank, outperform both centroid-based methods and other systems in most cases. Furthermore, the thresholded variant of LexRank outperforms the other degree-based techniques. The method is also shown to be quite insensitive to noise in the data. The paper presents a detailed analysis of the approach, applies it to a larger dataset, and reports that it performs well in comparison to other systems and to human performance. It also discusses related work and concludes that graph-based representations of relations between natural language constructs provide new ways of processing information, with applications to several problems.
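
The idea summarized above (a sentence graph built from cosine similarity, then eigenvector centrality as the salience score) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made for illustration: sentences are whitespace-tokenized into raw term-frequency vectors (no IDF weighting), the `threshold` of 0.1 and `damping` of 0.85 are illustrative values, and the function names `cosine` and `lexrank` are hypothetical. Centrality is approximated with a PageRank-style power iteration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank(sentences, threshold=0.1, damping=0.85, tol=1e-8, max_iter=200):
    """Score sentences by centrality on a thresholded similarity graph.

    Assumptions (not taken from the summary): raw term frequencies,
    whitespace tokenization, illustrative threshold and damping values.
    """
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(s.lower().split()) for s in sentences]

    # Adjacency matrix: an edge wherever similarity exceeds the threshold.
    # A non-empty sentence is always linked to itself (similarity 1).
    adj = [
        [1.0 if cosine(bags[i], bags[j]) > threshold else 0.0 for j in range(n)]
        for i in range(n)
    ]

    # Row-normalize so each row is a probability distribution;
    # rows with no edges (empty sentences) fall back to uniform.
    for row in adj:
        total = sum(row)
        for j in range(n):
            row[j] = row[j] / total if total else 1.0 / n

    # Power iteration towards the stationary distribution.
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * adj[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(x - y) for x, y in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


if __name__ == "__main__":
    docs = [
        "graph methods rank sentences",
        "graph methods use centrality",
        "summaries rank sentences",
    ]
    for sentence, score in zip(docs, lexrank(docs)):
        print(f"{score:.3f}  {sentence}")
```

In this toy input the first sentence shares words with both of the others, while those two share nothing with each other, so the first sentence ends up with the highest score. An extractive summarizer built on this sketch would then select the top-scoring sentences.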

LexRank: Graph-based Lexical Centrality as Salience in Text Summarization

2004 | Güneş Erkan, Dragomir R. Radev