This paper presents a method for generating high-quality n-best parses with a coarse-to-fine generative parser, followed by a discriminative MaxEnt reranker that selects the best parse from each n-best list. The method produces 50-best lists of significantly higher quality than was previously achievable. These parses are then passed to the MaxEnt reranker, which achieves an f-score of 91.0% on sentences of length 100 or less.
The reranking parser uses a regularized MaxEnt reranker to select the best parse from the 50-best parses generated by a generative parsing model. The 50-best parser is a probabilistic parser that, on its own, produces high-quality parses: its maximum probability parse trees have an f-score of 89.7% on section 23 of the Penn Treebank, which is state-of-the-art. However, the 50-best lists often contain parses with higher f-scores, and the paper describes a 50-best parsing algorithm with an oracle f-score of 96.8% on the same data.
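To make the notion of an oracle f-score concrete, here is a minimal sketch (all names are illustrative, not from the paper's implementation) that treats a parse as a set of labeled brackets and computes the best f-score achievable by picking from an n-best list:

```python
# A parse is represented here as a set of labeled brackets (label, start, end).

def f_score(gold, guess):
    """Per-sentence labeled-bracket f-score between two bracket sets."""
    matched = len(gold & guess)
    if matched == 0:
        return 0.0
    precision = matched / len(guess)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)

def oracle_f_score(gold, n_best):
    """F-score of the best parse in the n-best list, i.e. what a
    perfect reranker could achieve on this sentence."""
    return max(f_score(gold, parse) for parse in n_best)
```

A corpus-level oracle figure such as the 96.8% above is obtained by picking the best candidate for every sentence and then evaluating the resulting set of trees as a whole, rather than averaging per-sentence scores.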
The reranker selects a single parse from each sentence's 50-best list. Because it only needs to consider a relatively small number of parses per sentence, no dynamic programming is required, so features can be arbitrary functions of the parse trees. The reranker does not reach the oracle f-score, but the parses it selects have an f-score of 91.0%, considerably better than the maximum probability parses of the n-best parser.
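The selection step itself reduces to an argmax over candidate scores. A minimal sketch, assuming a hypothetical feature_extractor and a learned weight vector (the paper's actual feature set and training procedure are far richer):

```python
# Reranking as a linear model: score each candidate and keep the best.
# `feature_extractor` and `weights` are hypothetical stand-ins.

def rerank(n_best, feature_extractor, weights):
    """Return the candidate parse with the highest linear score.

    With only ~50 candidates per sentence, each feature may inspect
    the whole tree; no chart or dynamic program is involved.
    """
    def score(parse):
        return sum(w * f for w, f in zip(weights, feature_extractor(parse)))
    return max(n_best, key=score)
```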
The n-best parsing algorithm generates the n highest probability parses for each string, along with their probabilities. For the experiments described here, n is set to 50, although some short sentences receive fewer than 50 parses. Each yield (terminal string) in the training, development, and test data sets is mapped to such an n-best list of parse/probability pairs. The cross-validation scheme described in Collins (2000) was used to avoid training the n-best parser on the sentences it was used to parse.
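As a rough sketch of that cross-validation protocol (the fold count, train_parser, and parse_n_best are hypothetical; the paper follows Collins's setup rather than this exact code):

```python
# Each sentence's n-best list comes from a parser trained on the other
# folds, so no sentence is parsed by a model that saw it in training.
# `train_parser` and `parse_n_best` are hypothetical stand-ins.

def cross_validated_n_best(sentences, k, train_parser, parse_n_best, n=50):
    folds = [sentences[i::k] for i in range(k)]
    n_best_lists = []
    for i, held_out in enumerate(folds):
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train_parser(training)
        n_best_lists.extend((s, parse_n_best(model, s, n)) for s in held_out)
    return n_best_lists
```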
A feature extractor is a vector of m functions, each of which maps a parse to a real number. The reranker assigns each parse a score that is a linear function of its feature values, with the feature weight vector estimated from the labeled training corpus. Since the correct parse for each training sentence is known, the reranker is trained to prefer the candidate parses with the highest f-scores.
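In symbols, following the standard MaxEnt reranking setup (the notation here is chosen for exposition; the paper may write it differently):

```latex
% Feature vector of a parse y, and the linear score used for selection.
% \theta is the weight vector estimated from the labeled training corpus.
f(y) = \bigl(f_1(y), \ldots, f_m(y)\bigr) \in \mathbb{R}^m,
\qquad
v_\theta(y) = \theta \cdot f(y) = \sum_{j=1}^{m} \theta_j \, f_j(y),
\qquad
\hat{y} = \arg\max_{y \in \mathcal{Y}(x)} v_\theta(y)
```

Here Y(x) denotes the 50-best list produced for sentence x.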
The paper describes a method for recovering the n-best parses using a coarse-to-fine parsing approach. This approach first produces a crude version of the parse using coarse-grained dynamic programming states, then builds fine-grained analyses by splitting the most promising coarse-grained states. The method is efficient and effective, as demonstrated by the results on the Penn Treebank.
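Schematically, the two-pass idea looks like the following sketch (the interface and the pruning criterion are assumptions for illustration; the paper's algorithm prunes on probability estimates computed from the coarse chart rather than this exact test):

```python
# Two-pass coarse-to-fine parsing, schematically. A cheap coarse pass
# builds a chart over merged states; only edges judged promising are
# split into the fine-grained states of the full model.
# `coarse_grammar`, `fine_grammar`, and `split_states` are hypothetical.

def coarse_to_fine(sentence, coarse_grammar, fine_grammar,
                   split_states, threshold=1e-4):
    coarse_chart = coarse_grammar.parse(sentence)            # cheap first pass
    promising = [e for e in coarse_chart.edges()
                 if e.estimated_probability() >= threshold]  # prune the rest
    fine_edges = [fe for e in promising for fe in split_states(e)]
    return fine_grammar.parse_with_edges(sentence, fine_edges)
```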