A Neural Attention Model for Abstractive Sentence Summarization

3 Sep 2015 | Alexander M. Rush, Sumit Chopra, Jason Weston
This paper presents a neural attention-based model for abstractive sentence summarization. The model is fully data-driven: it generates each word of the summary conditioned on the input sentence, using a local attention mechanism to align the input with the summary being generated, which lets it capture the core meaning of the original text. It is trained end-to-end and scales to large amounts of training data.

The architecture combines a neural network language model with a contextual input encoder. The encoder is inspired by the attention-based encoder of Bahdanau et al. (2014) and learns a latent soft alignment between the input and the summary. The two components are trained jointly on the sentence summarization task, using a corpus of roughly 4 million article-headline pairs drawn from Gigaword.
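To make the encoder concrete, here is a minimal NumPy sketch in the spirit of the paper's attention-based encoder: the embedded summary context induces a soft alignment over input positions, and the encoder output is the alignment-weighted average of locally smoothed input embeddings. The dimensions, the parameter names F, G, and P, and the smoothing window Q are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's settings).
V = 1000   # vocabulary size
D = 64     # embedding dimension
C = 5      # summary context window (previously generated words)
M = 20     # input sentence length

F = rng.normal(scale=0.1, size=(V, D))      # input-word embeddings
G = rng.normal(scale=0.1, size=(V, D))      # context-word embeddings
P = rng.normal(scale=0.1, size=(D, C * D))  # alignment parameters

def attention_encoder(x_ids, yc_ids, Q=2):
    """Soft-alignment encoder: a weighted average of input embeddings,
    with weights conditioned on the current summary context."""
    x_tilde = F[x_ids]                   # (M, D) embedded input
    yc_tilde = G[yc_ids].reshape(-1)     # (C*D,) flattened embedded context
    scores = x_tilde @ P @ yc_tilde      # (M,) alignment scores
    p = np.exp(scores - scores.max())
    p /= p.sum()                         # soft alignment over input positions
    # Local smoothing: average each embedding over a window of size <= 2Q+1.
    x_bar = np.stack([x_tilde[max(0, i - Q):i + Q + 1].mean(axis=0)
                      for i in range(len(x_ids))])
    return p @ x_bar                     # (D,) context-sensitive encoding

x_ids = rng.integers(0, V, size=M)   # toy input sentence (word ids)
yc_ids = rng.integers(0, V, size=C)  # toy summary context (word ids)
print(attention_encoder(x_ids, yc_ids).shape)  # (64,)
```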
To produce a summary, the system uses a beam-search decoder under a log-linear scoring function that estimates the probability of a summary and can incorporate additional features to model extractive elements; a generic beam-search sketch is given below. Evaluation uses several ROUGE metrics (also sketched below). On the DUC-2004 shared task the model outperforms several strong baselines, including a sentence-compression baseline, an information retrieval baseline, and a phrase-based statistical machine translation system trained on the same large-scale dataset; it also performs well on Gigaword headline generation. Compared with extractive methods, the attention-based model shows significant performance gains and can generalize and paraphrase the input, producing abstractive summaries that are more accurate and concise than extractive ones.
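As a rough illustration of decoding, the following is a generic left-to-right beam-search sketch. The function score_next is a hypothetical stand-in for the trained conditional model (language model plus attention encoder); the paper's decoder additionally folds extractive features into the log-linear score, which this sketch omits.

```python
import numpy as np

def beam_search(score_next, x_ids, beam_size=5, max_len=15, eos=0):
    """Left-to-right beam search. `score_next(x_ids, prefix)` is assumed
    to return a vector of next-word log-probabilities given the input
    sentence and the summary prefix."""
    beams = [([], 0.0)]  # (summary prefix, cumulative log score)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            logp = score_next(x_ids, prefix)         # (V,) log-probs
            for w in np.argsort(logp)[-beam_size:]:  # top-K extensions
                candidates.append((prefix + [int(w)], score + float(logp[w])))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            (finished if prefix[-1] == eos else beams).append((prefix, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses if needed
    return max(finished, key=lambda c: c[1])[0]

# Toy scorer: uniform distribution over a 100-word vocabulary.
toy = lambda x, prefix: np.log(np.full(100, 1.0 / 100))
print(beam_search(toy, x_ids=None, beam_size=3, max_len=4))
```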
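Finally, since results are reported in ROUGE, here is a minimal sketch of ROUGE-N recall, the recall-oriented n-gram overlap used in the DUC-2004 evaluation; the example tokens are invented.

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: the fraction of the reference's n-grams that also
    appear in the candidate, with overlap counts clipped per n-gram."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum(min(count, ref[g]) for g, count in cand.items() if g in ref)
    total = sum(ref.values())
    return overlap / total if total else 0.0

# Invented example: every reference unigram appears in the candidate.
print(rouge_n_recall("police arrest suspect in mall robbery".split(),
                     "police arrest robbery suspect".split()))  # 1.0
```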