Sentiment Analysis using Support Vector Machine and Random Forest

16 February 2024 | Talha Ahmed Khan¹*, Rehan Sadiq², Zeeshan Shahid³, Muhammad Mansoor Alam⁴, Mazliham Bin Mohd Su'ud⁵
This paper presents an in-depth survey of sentiment analysis, focusing on the application of machine learning techniques, particularly Support Vector Machine (SVM) and Random Forest. The study covers preprocessing techniques, feature extraction, model training, evaluation, and challenges in sentiment analysis. The findings contribute to a deeper understanding of sentiment analysis and provide insights into the effectiveness of machine learning approaches in this domain. The study evaluates SVM and Random Forest on a classification task: Random Forest achieved an accuracy of 0.78564, while SVM scored slightly higher at 0.80394. Both algorithms reached respectable accuracy, and the results suggest that SVM may be the more suitable choice when accuracy is the primary concern. However, the configuration of each model and the characteristics of the problem at hand should be considered when choosing between them. The paper also discusses preprocessing techniques such as tokenization, stop word removal, stemming, lemmatization, and the handling of special characters, URLs, and HTML tags. It explores feature extraction methods, including the Bag of Words (BoW) model, Term Frequency-Inverse Document Frequency (TF-IDF), and word embeddings. The study compares existing approaches to sentiment analysis, including sentiment dictionary-based and machine learning-based methods, and discusses the implementation of SVM and Random Forest for text classification, highlighting their strengths and weaknesses. The results show that SVM outperformed Random Forest in terms of accuracy, recall, precision, and F1-score. The paper concludes that SVM is a more suitable choice for sentiment analysis due to its higher accuracy and ability to handle complex feature spaces.
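The preprocessing and feature extraction steps named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the stop word list, the regular expressions, the sample documents, and the particular TF-IDF variant (term count divided by document length, times the natural log of N/df) are all assumptions chosen for illustration, and stemming, lemmatization, and word embeddings are left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption, not the paper's list).
STOP_WORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to"}


def preprocess(text):
    """Strip URLs and HTML tags, lowercase, tokenize, and drop stop words."""
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"<[^>]+>", " ", text)       # remove HTML tags
    tokens = re.findall(r"[a-z']+", text.lower())  # letters and apostrophes only
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Bag of Words: raw term counts for a single document."""
    return Counter(tokens)


def tf_idf(docs_tokens):
    """TF-IDF with tf = count / document length and idf = ln(N / df)."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs_tokens:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample documents, used only to exercise the functions.
docs = [
    "This movie was <b>great</b>! See https://example.com",
    "The plot was dull and the acting was dull",
]
tokenized = [preprocess(d) for d in docs]
vectors = tf_idf(tokenized)
```

In this variant a term that appears in every document gets a weight of zero, which is the usual motivation for preferring TF-IDF over raw BoW counts; the resulting sparse vectors are the kind of input a classifier such as SVM or Random Forest would be trained on.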
The study also emphasizes the importance of preprocessing and feature extraction in improving the performance of sentiment analysis models. The results demonstrate the effectiveness of SVM in capturing complex relationships between words and sentiments, making it a valuable tool for sentiment analysis. The paper also discusses the challenges in sentiment analysis, such as sarcasm, irony, context-dependent sentiment, and handling noisy or imbalanced datasets. The study highlights the importance of addressing these challenges to achieve accurate sentiment classification. Overall, the paper provides a comprehensive overview of sentiment analysis, emphasizing the role of machine learning techniques in this field.
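The summary reports accuracy, precision, recall, and F1-score, and notes imbalanced datasets as a challenge. The sketch below shows how these four metrics are computed for a binary task and why accuracy alone can mislead on imbalanced data. The function name `binary_metrics` and the labels are hypothetical; they are not taken from the paper's experiments.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 8 negative (0) and 2 positive (1) reviews.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0] * 10  # a degenerate classifier that always predicts negative
scores = binary_metrics(y_true, y_pred)
```

Here the always-negative classifier reaches an accuracy of 0.8 while its recall and F1 on the positive class are 0.0, which is why comparing classifiers on all four metrics is more informative than comparing accuracy alone.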