A Theory of Syntactic Recognition for Natural Language

February, 1978 | Mitchell Philip Marcus
# A Theory of Syntactic Recognition for Natural Language

by Mitchell Philip Marcus

This dissertation investigates the hypothesis that the syntax of natural language can be parsed by a left-to-right deterministic mechanism without facilities for parallelism or backup. This determinism hypothesis, explored in the context of the grammar of English, is shown to lead to a simple mechanism, a grammar interpreter, having the following properties:

- Simple rules of grammar can be written for this interpreter which capture the generalizations behind passives, yes/no questions, and other linguistic phenomena, despite the intrinsic difficulty of capturing such generalizations in the framework of a processing model for recognition, as opposed to a competence model.
- The structure of the grammar interpreter constrains its operation in such a way that, by and large, grammar rules cannot parse sentences which violate either of two constraints on rules of grammar currently proposed by Chomsky. This result depends in part upon the computational use of the notion of traces and the related notion of Annotated Surface Structure, which derive from Chomsky's work.
- The grammar interpreter provides a simple explanation for the difficulty caused by "garden path" sentences, such as "The cotton clothing is made of grows in Mississippi". Rules can be written for this interpreter to resolve local structural ambiguities which might seem to require nondeterministic parsing; however, the power of such rules depends upon a parameter of the mechanism. Most structural ambiguities can be resolved, given an appropriate setting of this parameter, but those which typically cause garden paths cannot.

To the extent that these properties, all of which reflect deep properties of natural language, follow from the determinism hypothesis, this paper provides indirect evidence for the truth of this assumption.
This paper also demonstrates that the determinism hypothesis necessitates semantic/syntactic interaction to test the comparative semantic goodness of differing structural possibilities for an input.

Thesis Supervisor: Jonathan Allen

Title: Professor of Electrical Engineering

To My Parents

# Acknowledgements

I would like to express my gratitude to everyone who contributed to this work, and provided support and encouragement to its author:

- to Jonathan Allen, my advisor, for much good advice, and especially for his constant focus on what is central and what is not;
- to my readers: to Ira Goldstein, for his careful attention during the course of this research, and his careful reading of several iterations of this document, and to Seymour Papert, for helping to narrow the range of the research, and for creating, with Marvin Minsky, an environment of great intellectual intensity and freedom;
- to Bill Martin, who will never let me forget how hard the problem really is;
- to Bob Moore and Chuck Rieger, for long talks spent trying to talk me out of various bad ideas;
- to Mike Genesereth, Gerry Sussman, and Mike Brady, for valuable suggestions at crucial phases of this research;
- to Bill Woods and Susumu Kuno, my undergraduate teachers, who taught me how