NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets

28 Aug 2013 | Saif M. Mohammad, Svetlana Kiritchenko, and Xiaodan Zhu
This paper presents two state-of-the-art SVM classifiers for sentiment analysis of tweets and SMS messages. The first detects the sentiment of a whole message (message-level task); the second detects the sentiment of an individual term within a message (term-level task). Both were developed for the SemEval-2013 competition and trained on the sentiment-labeled tweets provided by the competition organizers. On tweets, the message-level classifier achieved an F-score of 69.02 and the term-level classifier an F-score of 88.93, both top rankings. On SMS messages, the message-level classifier achieved an F-score of 68.46 (ranking first) and the term-level classifier an F-score of 88.00 (ranking second). The system draws on surface-form, semantic, and sentiment features: word ngrams, character ngrams, all-caps words, part-of-speech tags, hashtags, sentiment lexicons, and negation features. Two large word-sentiment association lexicons were generated automatically: one from tweets with sentiment-word hashtags (the NRC Hashtag Sentiment Lexicon) and one from tweets with emoticons (the Sentiment140 Lexicon). Lexicon-based features provided a gain of over 5 F-score points in the message-level task, and the two automatic lexicons contributed significantly to performance on the tweet set. The term-level task showed significantly higher performance than the message-level task, largely due to the effectiveness of ngram features.
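To make the idea of a word-sentiment association lexicon concrete, here is a minimal sketch of how one could be derived from tweets that carry a noisy label from a seed hashtag or emoticon. It assumes the association score is the difference between a word's pointwise mutual information (PMI) with the positive set and with the negative set, a detail the summary above does not spell out. The function name `build_lexicon`, the add-one smoothing, and the toy data are hypothetical and for illustration only.

```python
import math
from collections import Counter


def build_lexicon(labeled_tweets, smoothing=1.0):
    """Score each word by PMI(word, positive) - PMI(word, negative).

    labeled_tweets: iterable of (tokens, label) pairs, where label is
    "positive" or "negative" and is assumed to come from a seed hashtag
    or emoticon in the tweet. Both classes must be non-empty.
    """
    word_counts = {"positive": Counter(), "negative": Counter()}
    totals = Counter()
    for tokens, label in labeled_tweets:
        word_counts[label].update(tokens)
        totals[label] += len(tokens)

    vocab = set(word_counts["positive"]) | set(word_counts["negative"])
    lexicon = {}
    for word in vocab:
        # Smoothing (an assumption of this sketch) avoids log(0) for
        # words seen in only one class.
        pos = word_counts["positive"][word] + smoothing
        neg = word_counts["negative"][word] + smoothing
        # PMI(w, pos) - PMI(w, neg) simplifies to
        # log2(freq(w, pos) * N_neg / (freq(w, neg) * N_pos)).
        lexicon[word] = math.log2(
            (pos * totals["negative"]) / (neg * totals["positive"])
        )
    return lexicon


toy_tweets = [
    (["love", "this", "phone"], "positive"),
    (["great", "day"], "positive"),
    (["hate", "this", "phone"], "negative"),
    (["awful", "day"], "negative"),
]
lexicon = build_lexicon(toy_tweets)
print(sorted(lexicon.items(), key=lambda item: item[1]))
```

In this toy run, words seen only in positive tweets get a positive score, words seen only in negative tweets get a negative score, and words shared equally by both sets score zero.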
The results demonstrate the effectiveness of the system in detecting sentiment in tweets and SMS messages.
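Several of the feature groups named in the summary (word ngrams, character ngrams, all-caps words, hashtags, negation, and lexicon scores) can be sketched in a few lines. This is an illustrative approximation, not the authors' implementation: the `_NEG` suffix scheme, the small `NEGATORS` set, the feature names, and the function `extract_features` are all assumptions of this sketch, and part-of-speech features are omitted because they need an external tagger.

```python
import re
from collections import defaultdict

# Illustrative subset of negation words (an assumption of this sketch).
NEGATORS = {"not", "no", "never", "cannot"}
# A token made only of punctuation ends the negated context.
CLAUSE_END = re.compile(r"^[.,:;!?]+$")


def extract_features(tokens, lexicon):
    """Map a tokenized message to a sparse {feature_name: value} dict."""
    feats = defaultdict(float)

    # Mark words that follow a negator, up to the next punctuation token.
    marked = []
    negated = False
    for tok in tokens:
        low = tok.lower()
        if CLAUSE_END.match(tok):
            negated = False
            marked.append(low)
            continue
        marked.append(low + "_NEG" if negated else low)
        if low in NEGATORS or low.endswith("n't"):
            negated = True

    # Word unigrams and bigrams over the negation-marked tokens.
    for i, tok in enumerate(marked):
        feats["w1=" + tok] += 1
        if i + 1 < len(marked):
            feats["w2=" + tok + " " + marked[i + 1]] += 1

    # Character trigrams over the lowercased message.
    text = " ".join(t.lower() for t in tokens)
    for i in range(len(text) - 2):
        feats["c3=" + text[i:i + 3]] += 1

    # Surface-form counts.
    feats["all_caps"] = sum(1 for t in tokens if len(t) > 1 and t.isupper())
    feats["hashtags"] = sum(1 for t in tokens if t.startswith("#"))

    # Lexicon features: aggregate per-token association scores.
    scores = [lexicon.get(t.lower(), 0.0) for t in tokens]
    feats["lex_sum"] = sum(scores)
    feats["lex_max"] = max(scores, default=0.0)
    feats["lex_pos_count"] = sum(1 for s in scores if s > 0)
    return dict(feats)


features = extract_features(
    ["I", "do", "not", "LOVE", "this", "#phone", "!"],
    {"love": 1.5},
)
print(features["w1=love_NEG"], features["all_caps"], features["lex_sum"])
```

A dictionary like this could then be vectorized (for example with scikit-learn's `DictVectorizer`) and passed to a linear SVM, which matches the classifier family the summary names but not necessarily the authors' exact setup.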