Three Generative, Lexicalised Models for Statistical Parsing

This paper introduces three new generative, lexicalised models for statistical parsing. The first model is a generative version of the statistical parsing model proposed by Collins (1996). The second model extends the parser to include probabilities over subcategorisation frames for head-words, allowing for a distinction between complements and adjuncts. The third model incorporates a probabilistic treatment of wh-movement, derived from Generalized Phrase Structure Grammar (Gazdar et al., 1995). These models improve parsing performance, achieving 88.1/87.5% constituent precision/recall on Wall Street Journal text, a 2.3% improvement over Collins' (1996) model. The paper also discusses practical issues such as smoothing and unknown words, and compares the models' performance using PARSEVAL measures. The results show that the new models not only enhance parsing accuracy but also provide more informative output, making them valuable for NLP applications.

17 Jun 1997 | Michael Collins