8 Feb 2024 | Kristina Radivojevic, Nicholas Clark, Paul Brenner
The paper "LLMs Among Us: Generative AI Participating in Digital Discourse" by Kristina Radivojevic, Nicholas Clark, and Paul Brenner explores the impact of Large Language Models (LLMs) on social media platforms. The authors developed an experimental framework called "LLMs Among Us" on the Mastodon platform to study how human participants could distinguish between human and bot participants. They created 10 personas using three different LLMs—GPT-4, Llama 2 Chat, and Claude—and conducted three rounds of experiments with 36 human participants. Despite knowing that both bots and humans were present, participants correctly identified the nature of other users only 42% of the time. The choice of persona had a significant impact on human perception, with Persona 8 being more likely to be identified as a bot than Personas 3 and 6.
The study highlights the challenges of detecting LLM-generated content and the potential for LLMs to manipulate digital discourse. The authors also provide a demographic analysis of the participants and discuss the limitations and future directions of the research.