Incorporating Copying Mechanism in Sequence-to-Sequence Learning


8 Jun 2016 | Jiatao Gu, Zhengdong Lu, Hang Li, Victor O.K. Li
The paper addresses copying in sequence-to-sequence (Seq2Seq) learning, where certain segments of the input sequence are selectively replicated in the output sequence, a pattern common in human language communication (for example, echoing an entity name from a question in the reply). The authors propose COPYNET, a neural network with an encoder-decoder structure that integrates regular word generation with a copying mechanism, allowing the model to select appropriate subsequences of the input and place them in the output. Because the two modes are combined in a single differentiable probability model, COPYNET is trained end-to-end with gradient descent. Empirical studies on synthetic and real-world datasets, including text summarization and single-turn dialogue, show that COPYNET reproduces long, consecutive subsequences, handles out-of-vocabulary words well, and outperforms traditional RNN-based encoder-decoder models. The paper also discusses related work and future directions, including extending the copying mechanism to heterogeneous source and target sequences.
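To make the mixed generate/copy probability model concrete, here is a minimal NumPy sketch of COPYNET's output distribution at a single decoding step. It assumes an encoder has already produced one hidden state per source token (`enc_h`) and the decoder has a current state `s_t`; the weight names (`W_gen`, `W_copy`), dimensions, and toy inputs are illustrative assumptions, not the paper's released code. The key idea it demonstrates is the paper's joint normalization: generate-mode scores over the vocabulary and copy-mode scores over source positions share one softmax, and a word's final probability sums both modes.

```python
# Minimal sketch of CopyNet's per-step output distribution, assuming
# precomputed encoder states and a decoder state. Names and sizes are
# illustrative, not from the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["<unk>", "hello", "my", "name", "is"]   # decoder vocabulary
src_tokens = ["my", "name", "is", "tony"]        # "tony" is out-of-vocabulary

d_hid = 8
enc_h = rng.normal(size=(len(src_tokens), d_hid))  # encoder states h_j
s_t = rng.normal(size=d_hid)                       # decoder state s_t

W_gen = rng.normal(size=(len(vocab), d_hid))   # generate-mode projection
W_copy = rng.normal(size=(d_hid, d_hid))       # copy-mode projection

# Generate-mode scores: one per vocabulary word.
score_gen = W_gen @ s_t                        # psi_g(v)
# Copy-mode scores: one per source position, tanh(h_j W_copy) . s_t.
score_copy = np.tanh(enc_h @ W_copy) @ s_t     # psi_c(x_j)

# Joint softmax: a single normalizer couples the two modes.
all_scores = np.concatenate([score_gen, score_copy])
probs = np.exp(all_scores - all_scores.max())
probs /= probs.sum()
p_gen, p_copy = probs[:len(vocab)], probs[len(vocab):]

# Final distribution over the extended vocabulary (vocab + source words):
# a word's probability is its generate-mode mass plus the copy-mode mass
# of every source position holding that word. OOV source words such as
# "tony" become reachable through the copy mode alone.
p_final = {w: p for w, p in zip(vocab, p_gen)}
for j, w in enumerate(src_tokens):
    p_final[w] = p_final.get(w, 0.0) + p_copy[j]

for w, p in sorted(p_final.items(), key=lambda kv: -kv[1]):
    print(f"{w:>8s}: {p:.3f}")
```

Since the whole mixture is just a softmax over concatenated scores, gradients flow through both modes, which is what allows the end-to-end training the paper describes.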