23 June 2011 | Apoorv Agarwal, Boyi Xie, Ilia Vovsha, Owen Rambow, Rebecca Passonneau
This paper examines sentiment analysis on Twitter data, introducing POS-specific prior polarity features and exploring the use of a tree kernel to simplify feature engineering. The authors build models for classifying tweets into positive, negative, and neutral sentiments, using unigram, feature-based, and tree kernel models. They find that the tree kernel model outperforms both the unigram baseline and the feature-based model, achieving significant improvements. The paper also presents extensive feature analysis, showing that features combining prior polarity with part-of-speech tags are the most important. Additionally, the authors introduce new resources, including an emoticon dictionary and an acronym dictionary, and provide a detailed description of their data preprocessing techniques.
The results demonstrate that standard natural language processing tools can be effective even in the context of microblogging, and that tree kernels can perform as well as feature-based models without detailed feature engineering.
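To make the preprocessing step concrete, here is a minimal sketch of tweet normalization in the spirit the summary describes: masking URLs and user mentions, and mapping emoticons and acronyms through small dictionaries. The placeholder tokens (`||URL||`, `||TARGET||`) and the dictionary entries are illustrative assumptions, not the authors' actual resources, which are far larger.

```python
import re

# Illustrative mini-dictionaries; the paper's emoticon and acronym
# dictionaries are much larger (these entries are assumptions).
EMOTICONS = {":)": "positive_emoticon", ":(": "negative_emoticon"}
ACRONYMS = {"gr8": "great", "lol": "laughing out loud"}

def preprocess(tweet: str) -> str:
    """Normalize a tweet before feature extraction."""
    tweet = re.sub(r"https?://\S+", "||URL||", tweet)   # mask URLs
    tweet = re.sub(r"@\w+", "||TARGET||", tweet)        # mask user mentions
    out = []
    for tok in tweet.split():
        if tok in ("||URL||", "||TARGET||"):
            out.append(tok)                              # keep placeholders as-is
        elif tok in EMOTICONS:
            out.append(EMOTICONS[tok])                   # emoticon -> polarity token
        else:
            out.append(ACRONYMS.get(tok.lower(), tok.lower()))
    return " ".join(out)
```

After normalization, unigram counts over the resulting tokens would serve as the baseline feature set.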