ChatGPT outperforms crowd workers for text-annotation tasks

July 18, 2023 | Fabrizio Gilardi, Meysam Alizadeh, and Maël Kubli
A study shows that ChatGPT outperforms crowd workers on text-annotation tasks, including relevance, stance, topic, and frame detection. Using a sample of 6,183 tweets and news articles, the researchers found that ChatGPT's zero-shot accuracy exceeds that of crowd workers by about 25 percentage points on average, and that its intercoder agreement is higher than that of both crowd workers and trained annotators. ChatGPT is also far cheaper: at about $0.003 per annotation, it costs roughly thirty times less than Amazon Mechanical Turk (MTurk). Together, these results point to the potential of large language models to greatly increase the efficiency of text classification.

The study compared ChatGPT with crowd workers and trained annotators on four datasets. ChatGPT's accuracy was higher than MTurk's for most tasks, and its intercoder agreement exceeded that of both MTurk and the trained annotators. ChatGPT's accuracy was positively correlated with the trained annotators' intercoder agreement, suggesting better performance on easier tasks, while its margin over MTurk was negatively correlated with that agreement, suggesting that its advantage may be largest on harder tasks.

These findings highlight the potential of large language models to transform text-annotation procedures for tasks common to many research projects, and suggest that ChatGPT may already be a superior approach to crowd annotation on platforms such as MTurk.
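To make the zero-shot setup concrete, here is a minimal sketch of what a single annotation call with the OpenAI Python client might look like. The model name, label set, prompt wording, and temperature below are illustrative assumptions for this sketch, not the prompts or settings used in the study.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative label set for a relevance task; not the study's exact labels.
LABELS = ["relevant", "irrelevant"]

def annotate(text: str, model: str = "gpt-3.5-turbo") -> str:
    """Request a single zero-shot label for one document."""
    prompt = (
        "Classify whether the following tweet is about content moderation.\n"
        f"Possible labels: {', '.join(LABELS)}.\n"
        "Answer with the label only.\n\n"
        f"Tweet: {text}"
    )
    response = client.chat.completions.create(
        model=model,
        temperature=0.2,  # low temperature for more stable labels (assumption)
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip().lower()

if __name__ == "__main__":
    print(annotate("Platforms should publish their moderation rules."))
```

In practice, a script like this would loop over all documents and store the returned labels for comparison against human annotations; error handling and rate limiting are omitted here for brevity.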
The study also raises questions about the performance of LLMs across multiple languages, the implementation of few-shot learning, the construction of semiautomated data labeling systems, and the use of chain of thought prompting to increase the performance of zero-shot reasoning.
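On the few-shot and chain-of-thought directions mentioned above, one possible sketch is to prepend labeled examples to the chat history and optionally ask the model to reason before answering. The examples and wording below are hypothetical, not drawn from the study.

```python
# Hypothetical few-shot examples; in practice these would come from
# the trained annotators' labeled data.
FEW_SHOT_EXAMPLES = [
    ("New transparency rules for content moderation were announced today.", "relevant"),
    ("Just finished a great hike, the weather was perfect.", "irrelevant"),
]

def build_messages(text: str, chain_of_thought: bool = False) -> list[dict]:
    """Return a chat message list with labeled examples prepended."""
    instruction = (
        "Classify whether a tweet is about content moderation. "
        "Possible labels: relevant, irrelevant."
    )
    if chain_of_thought:
        instruction += " Think step by step, then give the label on the last line."
    else:
        instruction += " Answer with the label only."

    messages = [{"role": "system", "content": instruction}]
    for example_text, label in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": f"Tweet: {example_text}"})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": f"Tweet: {text}"})
    return messages

if __name__ == "__main__":
    print(build_messages("Should platforms remove misleading posts?", chain_of_thought=True))
```

The resulting message list can be passed to the same chat completions call as in the zero-shot sketch; whether these variants actually improve annotation quality is one of the open questions the study raises.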