27 March 2024 | Amr Mohamed El Koshiry, Entesar Hamed I. Eliwa, Tarek Abd El-Hafeez, Marwa Khairy
This study investigates the effectiveness of deep learning and classical machine learning techniques in identifying instances of cyberbullying. It compares five classical machine learning algorithms (Multinomial Naive Bayes, Logistic Regression, Support Vector Classifier, Decision Tree, and Random Forest) with three deep learning models (LSTM, Bi-LSTM, and GRU). Pre-processing includes text cleaning, tokenization, stemming, and stop word removal. The models are evaluated using metrics such as accuracy, precision, recall, and F1 score. The results show that the proposed technique, which combines pre-trained GloVe word embeddings and the focal loss function within a deep learning model, achieves high accuracy, precision, and F1 scores. Specifically, the GRU model achieved the highest accuracy of 97.0%, while the Multinomial Naive Bayes (NB) algorithm achieved the highest precision of 96.6%. The study also highlights the limitations of the current approach, such as the need for further evaluation on diverse datasets to assess generalizability.
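The abstract names the focal loss function but does not give its form or settings. As a rough illustration only, the sketch below computes the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), in plain Python. The function name `binary_focal_loss` and the values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration (they are commonly used defaults) and are not taken from the study.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a batch.

    y_true: iterable of 0/1 labels (1 = positive class, e.g. bullying).
    p_pred: iterable of predicted probabilities for the positive class.
    gamma, alpha: illustrative defaults, NOT the study's settings.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip to avoid log(0) on saturated predictions.
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability the model assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        # alpha_t weights the positive and negative classes differently.
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of well-classified examples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


if __name__ == "__main__":
    labels = [1, 1, 0, 0]
    probs = [0.9, 0.2, 0.1, 0.7]
    print(binary_focal_loss(labels, probs))
```

With `gamma=0` the modulating factor disappears and the expression reduces to class-weighted cross-entropy. Larger `gamma` values concentrate the loss on misclassified examples, which is the usual motivation for using focal loss on imbalanced data.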