July 2018 | PAULA FORTUNA, INESC TEC; SÉRGIO NUNES, INESC TEC and Faculty of Engineering, University of Porto
This survey provides a comprehensive overview of the current state of automatic hate speech detection in text, a field that has gained significant attention in recent years. The authors organize and describe previous approaches, including core algorithms, methods, and main features used. They discuss the complexity of defining hate speech, which varies across platforms and contexts, and propose a unifying definition. The survey highlights the societal impact of hate speech, particularly in online communities and digital media platforms, and emphasizes the importance of shared resources such as guidelines, annotated datasets, and algorithms for advancing the field.
The survey is structured into eight sections. The first section introduces the problem, its motivations, and the need for automatic detection tools. The second section reviews related work, noting the limited number of studies and how this survey complements previous work. The third section covers the theoretical aspects, comparing different definitions of hate speech and discussing its evolution and targets. The fourth section presents a systematic literature review, detailing the methodology, results, and findings on the evolution of the field, publication venues, and key concepts. The fifth section analyzes the types of hate speech with examples, while the sixth section discusses related concepts such as cyberbullying, discrimination, and toxicity. The seventh section focuses on the practical aspects, including datasets, algorithms, and feature extraction methods. The eighth section concludes with a summary of the main contributions and future perspectives.
The survey underscores the importance of shared resources and the need for further research to improve the accuracy and effectiveness of automatic hate speech detection systems.