Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies


2016 | Tal Linzen, Emmanuel Dupoux, Yoav Goldberg
This paper investigates the ability of Long Short-Term Memory (LSTM) neural networks to learn syntax-sensitive dependencies, specifically English subject-verb number agreement. The authors explore whether LSTMs can capture such dependencies without explicit structural representations, using a corpus of natural language without syntactic annotations. They conduct experiments under different training objectives, including number prediction, grammaticality judgments, and language modeling. The results show that LSTMs achieve high overall accuracy in strongly supervised settings but make significant errors when sequential and structural information conflict. The language modeling objective alone is insufficient for capturing syntax-sensitive dependencies, suggesting that more direct supervision is needed. The study concludes that while LSTMs can learn a non-trivial amount of grammatical structure with targeted supervision, stronger architectures may be required to further reduce errors.
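The paper's core setup, the number-prediction objective, is easy to sketch: the model reads a sentence up to (but not including) a present-tense verb and classifies whether that verb should be singular or plural. Below is a minimal PyTorch sketch of that objective; the `NumberPredictor` class, the hyperparameters, and the toy vocabulary are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class NumberPredictor(nn.Module):
    """Illustrative LSTM classifier for the number-prediction
    objective: read the words preceding a verb, predict whether
    the verb should be singular or plural. Dimensions are
    placeholder choices, not the paper's exact setup."""
    def __init__(self, vocab_size, embed_dim=50, hidden_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, 2)  # singular vs. plural

    def forward(self, token_ids):
        # token_ids: (batch, seq_len), the sentence prefix up to,
        # but not including, the target verb.
        embedded = self.embed(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        # hidden[-1]: final hidden state summarizing the prefix.
        return self.classifier(hidden[-1])

# Toy example: "the keys to the cabinet ___" -> expect plural ("are").
vocab = {"the": 0, "keys": 1, "to": 2, "cabinet": 3}
model = NumberPredictor(vocab_size=len(vocab))
prefix = torch.tensor([[vocab["the"], vocab["keys"], vocab["to"],
                        vocab["the"], vocab["cabinet"]]])
logits = model(prefix)                                # shape (1, 2)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1]))  # 1 = plural
```

The toy example is chosen to match the cases the paper highlights: "cabinet" is an intervening noun whose number conflicts with the subject "keys", and it is precisely such sentences, where sequential proximity and syntactic structure disagree, that account for most of the LSTM's errors.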