1 Oct 2019 | Alex Warstadt, Amanpreet Singh, Samuel R. Bowman
This paper investigates the ability of artificial neural networks (ANNs) to judge the grammatical acceptability of sentences, aiming to test their linguistic competence. The authors introduce the Corpus of Linguistic Acceptability (CoLA), a dataset of 10,657 English sentences drawn from the published linguistics literature and labeled as acceptable or unacceptable. They train several recurrent neural network models on acceptability classification and find that these models outperform the unsupervised models of Lau et al. (2016) on CoLA. Error analysis reveals that the trained models learn systematic generalizations such as subject-verb-object word order, but all models perform far below human level on various grammatical constructions.
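The paper's own encoders are trained in a more elaborate pipeline, but a minimal sketch of a supervised acceptability classifier in this spirit, assuming a PyTorch bidirectional LSTM encoder over token ids (the class name, dimensions, and vocabulary size below are illustrative assumptions, not the paper's settings), might look like this:

```python
# Hedged sketch: a sentence encoder plus binary classifier for acceptability.
import torch
import torch.nn as nn

class AcceptabilityClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, 2)  # acceptable vs. unacceptable

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)        # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.encoder(embedded)     # hidden: (2, batch, hidden_dim)
        sentence_vec = torch.cat([hidden[0], hidden[1]], dim=-1)
        return self.classifier(sentence_vec)        # logits over the two labels

# Example: score a batch of two padded sentences of length 6.
model = AcceptabilityClassifier(vocab_size=10_000)
logits = model(torch.randint(1, 10_000, (2, 6)))
print(logits.shape)  # torch.Size([2, 2])
```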
The paper introduces CoLA as a large-scale corpus of acceptability judgments, available online along with source code and a leaderboard. It discusses the role of minimal pairs in acceptability judgments and defines (un)acceptability, noting that not all examples from the linguistics literature are suitable for binary classification. The authors also analyze the impact of supervised training on acceptability classifiers by varying the domain and quantity of training data.
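For concreteness, a small loader for a CoLA-style split might look as follows, assuming the tab-separated layout of the public release (source code, label, original author's mark, sentence); the filename used here is illustrative:

```python
# Hedged sketch of reading a CoLA-style TSV file into label/sentence records.
import csv

def load_cola(path):
    examples = []
    with open(path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            source, label, _original_mark, sentence = row
            examples.append({"source": source,
                             "label": int(label),     # 1 = acceptable, 0 = unacceptable
                             "sentence": sentence})
    return examples

train = load_cola("in_domain_train.tsv")  # assumed filename
print(len(train), train[0]["sentence"])
```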
Experiments show that neural networks trained on CoLA perform better than unsupervised models but still fall short of human performance. The results indicate that the models acquire knowledge of basic subject-verb-object word order and verbal argument structure, but show no clear evidence of learning the non-local dependencies involved in agreement and questions. The paper also discusses the design of CoLA, including the size of the training set and its split into in-domain and out-of-domain evaluation sets.
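Comparing performance on the two evaluation sets can be done with a few lines of scikit-learn, as in the sketch below; the toy label lists stand in for real model output, and the Matthews correlation coefficient shown alongside accuracy is just one common choice of metric for imbalanced binary acceptability data:

```python
# Illustrative evaluation of in-domain vs. out-of-domain predictions.
from sklearn.metrics import accuracy_score, matthews_corrcoef

def report(name, gold, predicted):
    print(f"{name}: acc={accuracy_score(gold, predicted):.3f} "
          f"mcc={matthews_corrcoef(gold, predicted):.3f}")

# Toy labels standing in for real dev-set predictions.
report("in-domain", [1, 1, 0, 1, 0, 1], [1, 1, 0, 0, 0, 1])
report("out-of-domain", [1, 0, 0, 1, 1, 0], [1, 1, 0, 1, 0, 0])
```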
The results show that models trained on CoLA outperform unsupervised models on certain grammatical constructions but struggle with others, such as agreement and reflexive pronouns. The paper concludes that while ANNs can acquire substantial grammatical knowledge, their linguistic competence is far from rivaling that of humans. The study contributes to the growing effort to evaluate the ability of ANNs to make fine-grained grammatical distinctions and addresses foundational questions in theoretical linguistics. The authors also discuss the Poverty of the Stimulus argument, suggesting that the success of supervised acceptability classifiers does not falsify it, since unacceptable examples play no apparent role in child language acquisition. The paper highlights the potential of CoLA as a tool for evaluating linguistic competence and the need for further research into unsupervised methods and restricted training resources.