An Efficient Context-Free Parsing Algorithm

An Efficient Context-Free Parsing Algorithm

February, 1970 | Jay Earley
This paper presents an efficient context-free parsing algorithm, which is similar to both Knuth's LR(k) algorithm and the familiar top-down algorithm. The algorithm has a time bound proportional to $ n^3 $ for general context-free grammars, $ n^2 $ for unambiguous grammars, and linear time for a large class of grammars, including most practical context-free programming language grammars. It is superior to top-down and bottom-up algorithms in empirical comparisons. The algorithm is based on a formal investigation and empirical comparison. It uses a set of states to represent the condition of the recognition process at each point in the scan. Each state represents a production with a dot indicating the current position in the production, a pointer to the position in the input string, and a k-symbol string representing the allowed successor to that instance of the production. The algorithm processes the states in a state set, performing three operations: predictor, completer, and scanner. The predictor adds new states for each alternative of a nonterminal, the completer adds states when a production is completed, and the scanner adds states when a terminal symbol is matched. The algorithm is described in detail, including its implementation and time and space bounds. It is shown to be efficient for general context-free grammars, unambiguous grammars, and linear time grammars. The algorithm is also compared with other parsing algorithms, such as the top-down and bottom-up parsers, and is found to be superior in terms of efficiency. The algorithm is also discussed in terms of its practical use, including its ability to handle extended context-free grammars with Kleene star notation. It is recommended for use in natural language processing systems, compiler writing systems, and extendible languages. The algorithm is also compared with other algorithms, such as Knuth's and Kasami's, and is found to be competitive in terms of efficiency.This paper presents an efficient context-free parsing algorithm, which is similar to both Knuth's LR(k) algorithm and the familiar top-down algorithm. The algorithm has a time bound proportional to $ n^3 $ for general context-free grammars, $ n^2 $ for unambiguous grammars, and linear time for a large class of grammars, including most practical context-free programming language grammars. It is superior to top-down and bottom-up algorithms in empirical comparisons. The algorithm is based on a formal investigation and empirical comparison. It uses a set of states to represent the condition of the recognition process at each point in the scan. Each state represents a production with a dot indicating the current position in the production, a pointer to the position in the input string, and a k-symbol string representing the allowed successor to that instance of the production. The algorithm processes the states in a state set, performing three operations: predictor, completer, and scanner. The predictor adds new states for each alternative of a nonterminal, the completer adds states when a production is completed, and the scanner adds states when a terminal symbol is matched. The algorithm is described in detail, including its implementation and time and space bounds. It is shown to be efficient for general context-free grammars, unambiguous grammars, and linear time grammars. The algorithm is also compared with other parsing algorithms, such as the top-down and bottom-up parsers, and is found to be superior in terms of efficiency. The algorithm is also discussed in terms of its practical use, including its ability to handle extended context-free grammars with Kleene star notation. It is recommended for use in natural language processing systems, compiler writing systems, and extendible languages. The algorithm is also compared with other algorithms, such as Knuth's and Kasami's, and is found to be competitive in terms of efficiency.
Reach us at info@study.space