28 September 2010 | Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll, Manfred Stede
The article presents a lexicon-based approach to sentiment analysis, specifically the Semantic Orientation CALculator (SO-CAL), which uses dictionaries of words annotated with their semantic orientation (polarity and strength) and incorporates intensification and negation. SO-CAL is applied to polarity classification: assigning each text a positive or negative label that captures its opinion toward its main subject matter. The authors demonstrate that SO-CAL performs consistently across domains and on unseen data. They also describe how the dictionaries were created and how Mechanical Turk was used to check them for reliability and consistency.
SO-CAL calculates sentiment by considering the semantic orientation of individual words and contextual valence shifters, such as negation and intensification. The system handles negation and intensification in a way that generalizes to all words with semantic orientation values. The article discusses the creation of dictionaries for different parts of speech, including adjectives, nouns, verbs, and adverbs, and the incorporation of valence shifters.
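The calculation described above can be illustrated with a minimal sketch. It is not SO-CAL's actual implementation: the lexicon entries, the intensifier percentages, the negator list, and the shift amount are all illustrative placeholders, and the function names `score_text` and `classify` are hypothetical. The sketch assumes that intensifiers act as percentage modifiers on the following word and that negation shifts a word's value toward the opposite polarity by a fixed amount.

```python
# Toy lexicon: word -> semantic orientation (SO), here on a -5..+5 scale.
# All entries are made-up placeholders, not the article's dictionaries.
LEXICON = {"good": 3, "excellent": 5, "bad": -3, "terrible": -5, "boring": -2}

# Assumed intensifiers: each scales the SO of the next word by a percentage.
INTENSIFIERS = {"very": 0.25, "extremely": 0.5, "slightly": -0.5}

# Assumed negators and shift size (illustrative values).
NEGATORS = {"not", "never", "no"}
NEGATION_SHIFT = 4


def score_text(text):
    """Sum the adjusted SO values of all lexicon words in the text."""
    tokens = text.lower().split()
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        so = float(LEXICON[tok])
        j = i - 1
        # An intensifier directly before the word scales its SO value.
        if j >= 0 and tokens[j] in INTENSIFIERS:
            so *= 1.0 + INTENSIFIERS[tokens[j]]
            j -= 1
        # A negator before the (possibly intensified) word shifts the
        # value toward the opposite polarity instead of flipping its sign.
        if j >= 0 and tokens[j] in NEGATORS:
            so = so - NEGATION_SHIFT if so > 0 else so + NEGATION_SHIFT
        total += so
    return total


def classify(text):
    """Label a text positive if its total SO is above zero, else negative."""
    return "positive" if score_text(text) > 0 else "negative"
```

Under these toy values, `score_text("not very good")` intensifies `good` from 3 to 3.75 and then shifts it by 4, giving -0.25, so the phrase is classified as negative. Because the adjustments operate on the numeric value rather than on specific words, the same logic applies to any lexicon entry, which is the generalization the paragraph above describes.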
SO-CAL's features are evaluated on several data sets, including Epinions review texts, movie reviews, and camera product reviews. The results show that SO-CAL's dictionaries outperform other dictionaries and that the system performs well across different domains and on unseen texts. The article also highlights the importance of hand-ranked, fine-grained, multiple-part-of-speech dictionaries for effective sentiment analysis.