On Zero-Shot Counterspeech Generation by LLMs

22 Mar 2024 | Punyajoy Saha, Aalok Agrawal, Abhik Jana, Chris Biemann, Animesh Mukherjee
This paper investigates the zero-shot counterspeech generation performance of four Large Language Models (LLMs): GPT-2, DialoGPT, ChatGPT, and FlanT5. The study is the first of its kind to explore the intrinsic properties of these models in this context. The authors evaluate the models on four datasets (CONAN, CONAN-MT, Reddit, and Gab) and analyze how model size and prompting strategy affect the quality and toxicity of the generated counterspeech.

Key findings:

1. **Overall performance**: ChatGPT outperforms the other models on the generation metrics GLEU, METEOR, and BLEURT by 12%, 32%, and 42.25%, respectively, but it reduces readability by 35%.
2. **Effect of model size**: Increasing model size (small, medium, large) leads to a significant increase in toxicity: 44% on CONAN-MT, 25% on Reddit, and 30% on Gab.
3. **Effect of prompt type**: Manual prompts work better for the denouncing, facts, and humor types; cluster-centered prompts are effective for affiliation-type counterspeech with GPT-2 and DialoGPT; and frequency-based prompts work better with FlanT5 and ChatGPT.

The paper also proposes three prompting strategies (manual, frequency-based, and cluster-centered) to improve type-specific counterspeech generation. The results highlight the importance of carefully designed prompts and the need for further research to strengthen model capabilities in this domain.
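To make the setup concrete, here is a minimal sketch of zero-shot counterspeech generation with FlanT5 via the Hugging Face `transformers` pipeline. The three prompt templates are illustrative stand-ins for the paper's manual, frequency-based, and cluster-centered strategies; the exact wording, model checkpoint, and example input are assumptions, not the authors' own artifacts.

```python
# Zero-shot counterspeech generation sketch (illustrative, not the
# paper's exact prompts or checkpoints).
from transformers import pipeline

# Any instruction-tuned seq2seq model works here; flan-t5-base is a
# hypothetical stand-in for the paper's FlanT5 variants.
generator = pipeline("text2text-generation", model="google/flan-t5-base")

hate_speech = "Immigrants are ruining this country."  # toy example input

# Hypothetical templates for the three strategies. In the paper, the
# frequency-based and cluster-centered variants are derived from the
# training data rather than written by hand.
PROMPTS = {
    "manual":    "Write a polite counterspeech response denouncing this statement: {hs}",
    "frequency": "Respond with facts that refute this statement: {hs}",
    "cluster":   "As a member of the targeted community, respond to this statement: {hs}",
}

for name, template in PROMPTS.items():
    out = generator(template.format(hs=hate_speech), max_new_tokens=64)
    print(f"[{name}] {out[0]['generated_text']}")
```

In a zero-shot setting like this, no counterspeech examples are shown to the model; the prompt template alone steers the type of response, which is why the paper's comparison of prompting strategies matters.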
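The evaluation side can be sketched similarly. The snippet below computes sentence-level GLEU and METEOR against a reference counterspeech, plus toxicity and readability of the generation itself, using NLTK, Detoxify, and textstat. These libraries are common choices for such metrics but are assumptions here, not necessarily the authors' tooling; BLEURT is omitted because it requires a separate learned checkpoint.

```python
# Sketch of the kind of automatic evaluation the paper reports
# (assumed tooling; BLEURT omitted).
import nltk
from nltk.translate.gleu_score import sentence_gleu
from nltk.translate.meteor_score import meteor_score
from detoxify import Detoxify  # pip install detoxify
import textstat                # pip install textstat

nltk.download("wordnet", quiet=True)  # METEOR needs WordNet data

reference = ("Generalising about immigrants is unfair; evidence shows "
             "they contribute to the economy.")
candidate = ("That claim is not supported by evidence; studies show "
             "immigrants benefit the economy.")

# Both NLTK metrics expect pre-tokenized input.
ref_tokens, cand_tokens = reference.split(), candidate.split()

print("GLEU:       ", sentence_gleu([ref_tokens], cand_tokens))
print("METEOR:     ", meteor_score([ref_tokens], cand_tokens))
print("Toxicity:   ", Detoxify("original").predict(candidate)["toxicity"])
print("Readability:", textstat.flesch_reading_ease(candidate))
```

Reference-based scores (GLEU, METEOR) and reference-free scores (toxicity, readability) capture different failure modes, which is how the paper can find that ChatGPT improves overlap metrics while worsening readability.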