[slides] Learning Question Classifiers

This paper presents a machine learning approach to question classification (QC). The goal is to categorize questions into semantic classes that impose constraints on potential answers, enabling more efficient question answering. The authors develop a hierarchical classifier guided by a layered semantic hierarchy of answer types, allowing questions to be classified into fine-grained classes. The classifier is based on the SNoW learning architecture and uses a two-stage process: first, classifying questions into coarse classes, then into fine classes. The classifier uses a combination of syntactic and semantic features, including words, part-of-speech tags, named entities, and semantically related words. The features are extracted automatically and combined using a set of operators to form more complex features. The classifier is evaluated on a large collection of free-form questions from TREC 10, achieving high accuracy. The results show that the hierarchical classifier outperforms a flat classifier in terms of accuracy and efficiency. The study also highlights the importance of semantic features in achieving accurate classification. The paper concludes that machine learning approaches can effectively solve the question classification problem, and that further research is needed to improve the performance of classifiers in handling complex semantic tasks.
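The two-stage process described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the paper's classifier is built on SNoW, whereas this example substitutes a simple count-based scorer, and the names `HIERARCHY`, `CountScorer`, `features`, and `classify`, the toy class labels, and the training questions are all assumptions made for illustration. The sketch also keeps only the single best label at each stage.

```python
from collections import Counter, defaultdict

# Toy two-layer hierarchy (hypothetical labels, not the paper's taxonomy).
HIERARCHY = {
    "LOCATION": ["city", "country"],
    "NUMERIC": ["date", "count"],
}

def features(question):
    """Word features plus adjacent-word pairs as a simple combined feature."""
    words = question.lower().rstrip("?").split()
    pairs = [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words + pairs

class CountScorer:
    """Stand-in for a learned classifier: scores labels by feature counts."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, examples):
        for question, label in examples:
            self.counts[label].update(features(question))

    def best(self, question, candidates):
        feats = features(question)
        return max(candidates, key=lambda c: sum(self.counts[c][f] for f in feats))

def classify(question, coarse_clf, fine_clf):
    """Stage 1 picks a coarse class; stage 2 picks among its fine classes only."""
    coarse = coarse_clf.best(question, list(HIERARCHY))
    fine = fine_clf.best(question, HIERARCHY[coarse])
    return coarse, fine

# Hypothetical training questions: (question, coarse label, fine label).
TRAIN = [
    ("What city hosted the 1996 Olympics?", "LOCATION", "city"),
    ("What country borders Spain?", "LOCATION", "country"),
    ("When was the telephone invented?", "NUMERIC", "date"),
    ("How many moons does Mars have?", "NUMERIC", "count"),
]

coarse_clf, fine_clf = CountScorer(), CountScorer()
coarse_clf.train([(q, c) for q, c, _ in TRAIN])
fine_clf.train([(q, f) for q, _, f in TRAIN])

print(classify("What country is Paris in?", coarse_clf, fine_clf))
# ('LOCATION', 'country')
```

Restricting the second stage to the fine classes under the chosen coarse class is what shrinks the candidate set at each decision, which is the structural idea behind the hierarchical design.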

Learning Question Classifiers

Xin Li, Dan Roth