Long Short-Term Memory-Networks for Machine Reading

20 Sep 2016 | Jianpeng Cheng, Li Dong and Mirella Lapata
This paper addresses the challenge of enhancing sequence-level networks to better handle structured input. The authors propose a machine reading simulator that processes text incrementally from left to right and performs shallow reasoning with memory and attention. The system extends the Long Short-Term Memory (LSTM) architecture with a memory network, enabling adaptive memory usage during recurrence via neural attention. This allows the model to weakly induce relations among tokens. The model is initially designed for processing a single sequence but can also be integrated with an encoder-decoder architecture. Experiments on language modeling, sentiment analysis, and natural language inference tasks show that the proposed model matches or outperforms state-of-the-art methods. The key contributions are the integration of memory and attention mechanisms, which enable the model to reason over shallow structures and handle structured input more effectively.
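To make the mechanism concrete, below is a minimal sketch of one step of the proposed LSTMN in PyTorch: instead of conditioning only on the last hidden state, the cell attends over all previously stored hidden and cell states (the "memory tapes") to form adaptive summaries, then applies standard LSTM gating. The class name `LSTMNCell` and all parameter names are illustrative assumptions for exposition, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMNCell(nn.Module):
    """One LSTMN step (sketch): intra-attention over the hidden/cell
    tapes replaces the single previous state of a vanilla LSTM."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        # Attention parameters (roughly W_h, W_x, W_h~, v in the paper).
        self.attn_h = nn.Linear(hidden_size, hidden_size, bias=False)
        self.attn_x = nn.Linear(input_size, hidden_size, bias=False)
        self.attn_ht = nn.Linear(hidden_size, hidden_size, bias=False)
        self.attn_v = nn.Linear(hidden_size, 1, bias=False)
        # Gate parameters acting on the concatenation [x_t; h~_t].
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x_t, h_tape, c_tape, h_tilde_prev):
        # h_tape, c_tape: (batch, t, hidden) -- all stored states so far.
        scores = self.attn_v(torch.tanh(
            self.attn_h(h_tape)
            + self.attn_x(x_t).unsqueeze(1)
            + self.attn_ht(h_tilde_prev).unsqueeze(1)
        )).squeeze(-1)                                # (batch, t)
        s = F.softmax(scores, dim=-1).unsqueeze(-1)   # attention weights
        h_tilde = (s * h_tape).sum(dim=1)             # adaptive hidden summary
        c_tilde = (s * c_tape).sum(dim=1)             # adaptive memory summary
        # Standard LSTM gating, but driven by the attention summaries.
        i, f, o, g = self.gates(
            torch.cat([x_t, h_tilde], dim=-1)).chunk(4, dim=-1)
        c_t = torch.sigmoid(f) * c_tilde + torch.sigmoid(i) * torch.tanh(g)
        h_t = torch.sigmoid(o) * torch.tanh(c_t)
        return h_t, c_t, h_tilde

# Usage: process a sequence left to right, growing the tapes each step
# (tapes start with a single zero slot so the first step has something
# to attend to -- an assumption of this sketch).
cell = LSTMNCell(input_size=100, hidden_size=64)
x = torch.randn(8, 5, 100)                 # (batch, seq_len, input)
h_tape = torch.zeros(8, 1, 64)
c_tape = torch.zeros(8, 1, 64)
h_tilde = torch.zeros(8, 64)
for t in range(x.size(1)):
    h_t, c_t, h_tilde = cell(x[:, t], h_tape, c_tape, h_tilde)
    h_tape = torch.cat([h_tape, h_t.unsqueeze(1)], dim=1)
    c_tape = torch.cat([c_tape, c_t.unsqueeze(1)], dim=1)
```

The growing tapes are what let the model weakly induce token-to-token relations: the softmax weights at each step can be read as soft links from the current token back to earlier ones.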