22 Mar 2024 | Punyajoy Saha, Aalok Agrawal, Abhik Jana, Chris Biemann, Animesh Mukherjee
This paper investigates the performance of four large language models (LLMs) - GPT-2, DialoGPT, FlanT5, and ChatGPT - in zero-shot settings for counterspeech generation, evaluating them on four datasets: CONAN, CONAN-MT, Reddit, and Gab. The results show that ChatGPT outperforms the other models in generation quality, with gains on metrics such as GLEU, METEOR, and BLEURT; however, the toxicity of the generated text increases with model size. The paper also proposes three prompting strategies - manual, frequency-based, and cluster-centered - for generating different types of counterspeech, and the analysis shows that these strategies improve performance across all models. Among the smaller models, GPT-2 and FlanT5 produce higher-quality counterspeech but are more toxic than DialoGPT, while ChatGPT performs best across all metrics. An evaluation of the engagement and quality of the generated counterspeech further finds that ChatGPT's readability decreases while its counterspeech quality improves. The authors highlight the importance of prompting strategies for generating type-specific counterspeech and suggest that further research is needed to improve both prompting strategies and model performance on this task. They also discuss the ethical implications of using LLMs for counterspeech generation, emphasizing the need for human oversight and the risks of fully automated systems.
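To make the zero-shot, type-specific setup concrete, below is a minimal sketch of manual prompting with FlanT5 via the Hugging Face transformers pipeline. The prompt templates and the counterspeech types shown (empathy, facts, humor) are illustrative assumptions, not the paper's exact prompts or type taxonomy.

```python
# Minimal sketch of zero-shot, type-specific counterspeech generation.
# Assumption: the templates and counterspeech types below are illustrative,
# not the exact prompts used in the paper.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")

# Hypothetical manual prompt templates, one per counterspeech type.
TEMPLATES = {
    "empathy": "Write an empathetic counterspeech response to this hate speech: {post}",
    "facts": "Write a fact-based counterspeech response to this hate speech: {post}",
    "humor": "Write a humorous counterspeech response to this hate speech: {post}",
}

def generate_counterspeech(post: str, cs_type: str) -> str:
    """Fill the template for the requested type and generate zero-shot."""
    prompt = TEMPLATES[cs_type].format(post=post)
    out = generator(prompt, max_new_tokens=64, do_sample=True, top_p=0.9)
    return out[0]["generated_text"]

if __name__ == "__main__":
    post = "Example hateful post goes here."
    for cs_type in TEMPLATES:
        print(cs_type, "->", generate_counterspeech(post, cs_type))
```

The frequency-based and cluster-centered strategies described in the paper would replace the hand-written templates above with prompts derived from the reference counterspeech data itself.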
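Since the reported generation-quality results rest on reference-based metrics, a short sketch of how GLEU and METEOR can be computed with NLTK may help. The example sentences are invented; BLEURT, being a learned metric, would additionally require a trained checkpoint and is omitted here.

```python
# Minimal sketch of reference-based evaluation with GLEU and METEOR via NLTK.
import nltk
from nltk.translate.gleu_score import sentence_gleu
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet", quiet=True)  # METEOR uses WordNet for synonym matching

reference = "Hate has no place here; please treat everyone with respect."
hypothesis = "There is no place for hate here, treat people with respect."

ref_tokens = reference.split()
hyp_tokens = hypothesis.split()

# Both functions take a list of tokenized references and one tokenized hypothesis.
print("GLEU:  ", sentence_gleu([ref_tokens], hyp_tokens))
print("METEOR:", meteor_score([ref_tokens], hyp_tokens))
```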
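The toxicity and readability findings can be reproduced in spirit with off-the-shelf scorers. The paper's exact tools may differ; the open-source Detoxify classifier and textstat's Flesch reading ease are stand-in assumptions here.

```python
# Minimal sketch of the two auxiliary evaluation axes mentioned above.
# Assumption: Detoxify and textstat are stand-ins for whatever scorers
# the paper actually used.
from detoxify import Detoxify
import textstat

counterspeech = "There is no place for hate here; please treat people with respect."

toxicity = Detoxify("original").predict(counterspeech)["toxicity"]
readability = textstat.flesch_reading_ease(counterspeech)  # higher = easier to read

print(f"toxicity:    {toxicity:.3f}")
print(f"readability: {readability:.1f}")
```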