Abstractive Sentence Summarization with Attentive Recurrent Neural Networks


June 12-17, 2016 | Sumit Chopra, Michael Auli, Alexander M. Rush
This paper introduces a convolutional attention-based conditional recurrent neural network (RNN) for abstractive sentence summarization. A conditional RNN decoder generates the summary word by word, guided by a convolutional attention-based encoder that computes scores over the input words to determine which ones to focus on at each step. The model is trained end-to-end on sentence-summary pairs, outperforms previous state-of-the-art methods on the Gigaword corpus, and is competitive on the DUC-2004 task.

The architecture follows the RNN encoder-decoder framework. The encoder applies convolutional networks to the input word embeddings, augmented with position information, and produces a context vector that conditions the decoder; the decoder is either an Elman RNN or an LSTM. Training minimizes the negative conditional log-likelihood of the summaries in the training data.

The model is evaluated with ROUGE metrics on the Gigaword and DUC-2004 datasets. It outperforms the state-of-the-art ABS and ABS+ systems on Gigaword, is competitive on DUC-2004, and compares favorably with a neural machine translation system. The authors attribute this performance to the position features and convolutional attention, which allow the model to generate more abstractive summaries and to handle difficult cases where the input contains figurative language or complex sentence structures. The paper concludes that the proposed model is a significant improvement over previous methods for abstractive sentence summarization.
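The encoder-attention step described above can be sketched numerically: convolutional features over the input embeddings are scored against the previous decoder state, softmax-normalized into attention weights, and used to form the context vector passed to the decoder. This is a minimal NumPy sketch, not the paper's implementation; the toy sizes, the uniform width-3 kernel standing in for learned convolution filters, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, emb_dim = 6, 8                      # toy sizes (illustrative)
x = rng.normal(size=(seq_len, emb_dim))      # input word embeddings (the paper also adds position information)
h_prev = rng.normal(size=emb_dim)            # previous decoder hidden state

# Convolutional aggregate feature z_j for each word: here a width-3 uniform
# kernel averages each embedding with its neighbours (zero-padded at the ends);
# the paper learns these convolution filters instead.
pad = np.vstack([np.zeros(emb_dim), x, np.zeros(emb_dim)])
z = (pad[:-2] + pad[1:-1] + pad[2:]) / 3.0   # shape (seq_len, emb_dim)

# Attention weights: alpha_j proportional to exp(z_j . h_prev)
scores = z @ h_prev
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()

# Context vector fed to the decoder: attention-weighted sum of the inputs
c = alpha @ x                                # shape (emb_dim,)
```

At each decoding step the decoder consumes `c` together with the previously generated word, so the weights `alpha` shift over the input as the summary is produced.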
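The training objective, minimizing the negative conditional log-likelihood of the reference summaries, can be written out as a small worked example. This is a hedged sketch under assumed toy inputs: the vocabulary, the per-step probabilities, and the function name are illustrative, not taken from the paper.

```python
import numpy as np

def neg_log_likelihood(stepwise_log_probs, target_ids):
    """Negative conditional log-likelihood of one summary.

    stepwise_log_probs: one log-probability vector over the vocabulary per
    decoding step (as the decoder's softmax would produce);
    target_ids: the gold summary word ids, one per step.
    """
    return -sum(lp[t] for lp, t in zip(stepwise_log_probs, target_ids))

# Toy check: a 3-word vocabulary and a 2-word reference summary.
log_probs = [np.log(np.array([0.7, 0.2, 0.1])),
             np.log(np.array([0.1, 0.8, 0.1]))]
loss = neg_log_likelihood(log_probs, [0, 1])   # -(log 0.7 + log 0.8)
```

Summing this loss over all sentence-summary pairs and minimizing it with gradient descent is what "trained end-to-end" amounts to here.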