This paper presents a study on deep learning methods for hate speech detection in tweets. The task is to classify tweets as racist, sexist, or neither. The authors compare deep learning approaches with traditional methods like char n-grams, TF-IDF, and Bag of Words. They experiment with multiple deep learning architectures, including FastText, Convolutional Neural Networks (CNNs), and Long Short-Term Memory Networks (LSTMs), to learn semantic word embeddings for better classification. The results show that deep learning methods outperform traditional methods by about 18 F1 points on a benchmark dataset of 16,000 annotated tweets.
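To make the "18 F1 points" comparison concrete, the following is a minimal sketch of how an F1 score can be computed over the three labels. The support-weighted averaging and the names `per_class_f1` and `weighted_f1` are assumptions made for illustration; the summary does not state which averaging scheme the authors used.

```python
from collections import Counter

def per_class_f1(gold, pred, labels):
    """Return a dict mapping each label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

def weighted_f1(gold, pred, labels):
    """Average per-class F1, weighting each class by its gold support (assumed scheme)."""
    support = Counter(gold)
    scores = per_class_f1(gold, pred, labels)
    return sum(scores[l] * support[l] for l in labels) / len(gold)

LABELS = ["racist", "sexist", "neither"]
gold = ["racist", "sexist", "neither", "neither"]
pred = ["racist", "neither", "neither", "neither"]
score = weighted_f1(gold, pred, LABELS)
```

A score on the 0 to 1 scale is usually multiplied by 100 when reported in "points", so a gap of about 18 points corresponds to a difference of roughly 0.18 in this function's output.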
The authors propose a framework that uses various neural network architectures for hate speech detection and find that the proposed methods significantly outperform the baseline approaches. The best performance is achieved by the "LSTM + Random Embedding + GBDT" approach, in which an LSTM with randomly initialized embeddings is trained using back-propagation and the learned embeddings are then used to train a Gradient Boosted Decision Tree (GBDT) classifier.
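The hand-off from learned embeddings to the GBDT classifier can be sketched as follows. This is a rough illustration, not the authors' implementation: it assumes a tweet is represented by the average of its word embeddings, which this summary does not specify, and the names `tweet_vector` and `toy_embeddings` are hypothetical.

```python
from typing import Dict, List, Sequence

def tweet_vector(
    tokens: Sequence[str],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Average the embeddings of known tokens (assumed pooling scheme).

    Tokens missing from the embedding table are skipped; a tweet with no
    known tokens maps to the zero vector.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

# Hypothetical 2-dimensional embeddings standing in for those learned by the LSTM.
toy_embeddings = {
    "you": [0.2, -0.1],
    "are": [0.0, 0.4],
    "great": [0.7, 0.3],
}

tweets = ["you are great", "unknown words"]
features = [tweet_vector(t.split(), toy_embeddings, dim=2) for t in tweets]
```

Each row of `features` is a fixed-length vector, so the resulting matrix, together with the tweet labels, could be passed to any gradient-boosted tree implementation, for example scikit-learn's `GradientBoostingClassifier`.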
The study also shows that embeddings learned from deep neural networks are more effective in capturing the task-specific bias compared to traditional embeddings like GloVe. The authors conclude that deep learning methods significantly outperform existing approaches for hate speech detection and suggest future research into the importance of user network features for the task.