Deep Learning for Hate Speech Detection in Tweets

1 Jun 2017 | Pinkesh Badjatiya, Shashank Gupta, Manish Gupta, Vasudeva Varma
This paper explores the application of deep learning methods to hate speech detection on Twitter. The authors define the task as classifying each tweet as racist, sexist, or neither, a problem made difficult by the complexity of natural language constructs. They experiment with multiple deep learning architectures, including FastText, Convolutional Neural Networks (CNNs), and Long Short-Term Memory Networks (LSTMs), to learn semantic word embeddings. The experiments are conducted on a benchmark dataset of 16K annotated tweets, and the results show that deep learning methods outperform state-of-the-art char/word n-gram methods by approximately 18 F1 points. The main contributions are the investigation of deep learning for hate speech detection, the exploration of various tweet semantic embeddings, and the resulting improvement over the n-gram baselines. The best method, "LSTM + Random Embedding + GBDT," achieved the highest F1 score, demonstrating the effectiveness of task-specific embeddings combined with gradient boosted decision trees.
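Since the comparison above is stated in F1 points, a small sketch of how an F1 score can be computed for the three-way labeling (racist, sexist, neither) may help. The summary does not say which averaging scheme the authors used, so macro averaging is an assumption here, and the labels and predictions below are made-up toy data, not results from the paper.

```python
# Minimal sketch: per-class and macro-averaged F1 for a three-class task.
# Assumptions: macro averaging; toy labels below are invented for illustration.

LABELS = ("racist", "sexist", "neither")


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating it as the positive class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy example (hypothetical data).
y_true = ["racist", "sexist", "neither", "neither", "sexist", "racist"]
y_pred = ["racist", "neither", "neither", "neither", "sexist", "sexist"]

if __name__ == "__main__":
    for lab in LABELS:
        print(lab, round(per_class_f1(y_true, y_pred, lab), 3))
    print("macro F1:", round(macro_f1(y_true, y_pred), 3))
```

On this scale, a gap of "18 F1 points" corresponds to a difference of about 0.18 between two such scores.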