2002 | Ivan A. Sag, Timothy Baldwin, Francis Bond, Ann Copestake, Dan Flickinger
The paper "Multiword Expressions: A Pain in the Neck for NLP" by Ivan A. Sag, Timothy Baldwin, Francis Bond, Ann Copestake, and Dan Flickinger discusses the challenges that multiword expressions (MWEs) pose for natural language processing (NLP). MWEs are defined, roughly, as "idiosyncratic interpretations that cross word boundaries," and they matter because their number in a speaker's lexicon is estimated to be of the same order of magnitude as the number of single words. The authors identify two key problems for linguistically precise NLP: disambiguation and MWEs. Disambiguation has received considerable attention and is commonly approached with stochastic methods. MWEs, by contrast, are underappreciated and cause significant difficulties for NLP systems, including overgeneration and idiomaticity problems. The paper argues that a comprehensive treatment of MWEs must combine symbolic and statistical techniques.