Enhancing machine learning-based sentiment analysis through feature extraction techniques

February 14, 2024 | Noura A. Semary, Wesam Ahmed, Khalid Amin, Pawel Plawiak, Mohamed Hammad
This paper examines how the choice of feature extraction technique affects the performance of sentiment analysis. The authors evaluate six methods: Bag-of-Words (BOW), Word2Vector, N-gram, Term Frequency-Inverse Document Frequency (TF-IDF), Hashing Vectorizer (HV), and Global Vectors for Word Representation (GloVe), on two datasets: Twitter US Airlines and Amazon musical instrument reviews. The aim is to give researchers and practitioners a comparative basis for selecting the most suitable method for their own sentiment analysis projects. In the reported experiments, TF-IDF achieves the highest accuracy of the six methods, reaching 99% on the Amazon dataset and 96% on the Twitter dataset. The study also highlights how much feature extraction contributes to model performance and argues that the choice of method deserves careful consideration. The findings are relevant to both academic research and practical applications, such as monitoring social media sites and improving customer service.
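To make the compared representations more concrete, the following is a minimal, standard-library-only sketch of four of them: Bag-of-Words counts, word N-grams, TF-IDF weights, and the hashing trick behind a hashing vectorizer. It is not the authors' pipeline. The toy corpus, the function names, the smoothed IDF formula, and the 16-slot hash size are all assumptions chosen for illustration, and no vector normalization is applied.

```python
import math
import zlib
from collections import Counter

# Toy corpus (hypothetical sentences, not taken from either dataset).
docs = [
    "great flight and great crew",
    "flight delayed again",
    "great guitar strings",
]

def tokenize(text):
    """Lowercase whitespace tokenizer; real pipelines usually clean text further."""
    return text.lower().split()

def ngrams(tokens, n):
    """Contiguous word n-grams, each joined with a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(tokens):
    """Map each term to its raw count in one document."""
    return Counter(tokens)

def tfidf(corpus_tokens):
    """Per-document term weights: raw count * idf.

    Assumed smoothed form: idf = ln((1 + N) / (1 + df)) + 1,
    where N is the number of documents and df the document frequency.
    """
    n_docs = len(corpus_tokens)
    df = Counter(term for toks in corpus_tokens for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: c * idf[t] for t, c in Counter(toks).items()}
            for toks in corpus_tokens]

def hashed_counts(tokens, n_features=16):
    """Hashing trick: bucket tokens into a fixed-size vector, no vocabulary stored."""
    vec = [0] * n_features
    for tok in tokens:
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(tok.encode("utf-8")) % n_features] += 1
    return vec

tokens = [tokenize(d) for d in docs]
bow = bag_of_words(tokens[0])      # counts for the first document
bigrams = ngrams(tokens[0], 2)     # word bigrams of the first document
weights = tfidf(tokens)            # one weight dict per document
hashed = hashed_counts(tokens[0])  # fixed-length count vector
```

In this toy corpus, "crew" appears in only one document while "flight" appears in two, so TF-IDF gives "crew" the larger weight in the first document even though both occur once there. Down-weighting terms that are common across documents is the property that distinguishes TF-IDF from plain Bag-of-Words counts.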