Is a Large Language Model a Good Annotator for Event Extraction?


2024 | Ruirui Chen, Chengwei Qin, Weifeng Jiang, Dongkyu Choi
Abstract: Event extraction is a key task in natural language processing that involves mining event-related information from unstructured text. Despite progress, challenges such as data scarcity and data imbalance hinder performance. This paper proposes using large language models (LLMs) as expert annotators for event extraction. By including training data samples in the prompts, the LLM-generated samples are aligned with the benchmark datasets, yielding an augmented dataset that improves the performance of fine-tuned models. Extensive experiments validate the effectiveness of the method and show its potential for advancing event extraction systems.

Introduction: Event extraction is a core NLP task that involves identifying events and extracting their arguments. Despite progress, performance remains suboptimal because labeled data is limited, especially for argument extraction; the ACE 2005 dataset, for example, has very few labeled samples for some event types. Recent studies show that LLMs struggle with event extraction themselves, so this paper instead uses LLMs to generate additional training data and address the scarcity. The contributions are: evaluating LLMs on benchmark datasets, proposing an LLM-based annotation approach, and releasing the annotated samples to mitigate the long-tail issue.

Related Work: LLMs have succeeded on many NLP tasks but still face challenges in event extraction. Existing event extraction methods fall into classification, sequence labeling, span prediction, and conditional generation approaches. Data scarcity remains a challenge, with datasets like ACE 2005 containing few labeled samples for some event types; data augmentation methods that generate new data have been explored to address this.

The Proposed Method: LLMs are used for event extraction in two ways: directly prompting them to extract event information, or using them as annotators to enhance fine-tuned models.

Prompting LLMs for Event Extraction: In zero-shot and one-shot settings, the LLMs struggled to identify events correctly (a prompt sketch appears after this summary).

Empowering Event Extraction with LLM-based Annotators: Because direct extraction was weak, the LLMs are instead prompted with training samples so that they annotate additional data aligned with the benchmarks; adding this data improves the performance of the fine-tuned models (an augmentation sketch follows the prompt sketch below).

Experimental Setup: The approach is evaluated on the ACE 2005 and MAVEN datasets, using several LLMs (GPT-3.5-Turbo, GPT-4, and PaLM) and baseline models such as BERT+CRF and DMBERT.

Experimental Results: The LLMs showed mixed performance on event extraction; their predictions were sometimes inaccurate, and the fine-tuned models outperformed them. However, the LLM-generated annotations, when added to the training set, improved the fine-tuned models, confirming their value as annotators.
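Below is a minimal sketch of what the zero-shot and one-shot prompting described in "Prompting LLMs for Event Extraction" might look like. The wording of the instruction, the small event-type list, and the demonstration example are illustrative assumptions, not the authors' actual templates; the constructed string would be sent to GPT-3.5-Turbo, GPT-4, or PaLM via the corresponding API and the reply parsed back into trigger/argument tuples.

```python
# Hypothetical prompt construction for zero-shot and one-shot event extraction.
# The event types shown are a tiny illustrative slice of the ACE 2005 ontology.
EVENT_TYPES = ["Conflict:Attack", "Movement:Transport", "Life:Die"]

def zero_shot_prompt(sentence: str) -> str:
    """Ask the LLM to extract triggers and arguments with no demonstrations."""
    return (
        "You are an event extraction annotator.\n"
        f"Possible event types: {', '.join(EVENT_TYPES)}.\n"
        "For the sentence below, list each event as "
        "(event type, trigger word, arguments with their roles). "
        "Answer 'None' if no event is present.\n"
        f"Sentence: {sentence}\nAnnotation:"
    )

def one_shot_prompt(sentence: str, demo_sentence: str, demo_annotation: str) -> str:
    """Same instruction, preceded by one labeled example from the training split."""
    return (
        zero_shot_prompt(demo_sentence)
        + f" {demo_annotation}\n\n"
        + f"Sentence: {sentence}\nAnnotation:"
    )

if __name__ == "__main__":
    demo = "Troops attacked the village on Sunday."
    demo_label = ("(Conflict:Attack, trigger='attacked', Attacker='Troops', "
                  "Target='the village', Time='Sunday')")
    # Print the prompt that would be sent to the LLM.
    print(one_shot_prompt("The senator died in Baghdad last week.", demo, demo_label))
```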
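The following is a rough sketch of the LLM-as-annotator augmentation loop summarized in "Empowering Event Extraction with LLM-based Annotators": gold training samples of a long-tail event type are placed in the prompt as demonstrations, and the LLM is asked to produce new samples annotated in the same format. Names such as build_annotation_prompt, llm_call, and the min_samples threshold are placeholders; the paper's actual prompt format and filtering rules are not reproduced here.

```python
# Hypothetical LLM-based data augmentation for long-tail event types.
import random
from typing import Callable, Dict, List

def build_annotation_prompt(event_type: str, demos: List[Dict], n_new: int = 5) -> str:
    """Show a few gold training samples of a rare event type and ask the LLM
    to write new sentences annotated in the same format."""
    lines = [f"Event type: {event_type}", "Annotated training examples:"]
    for d in demos:
        lines.append(f"Sentence: {d['sentence']}")
        lines.append(f"Annotation: {d['annotation']}")
    lines.append(
        f"Write {n_new} new sentences describing this event type and annotate "
        "them in exactly the same format."
    )
    return "\n".join(lines)

def augment_rare_types(train_by_type: Dict[str, List[Dict]],
                       llm_call: Callable[[str], str],
                       min_samples: int = 50) -> Dict[str, List[str]]:
    """Collect raw LLM output for every event type with too few gold samples;
    the output still needs parsing and filtering before being merged into the
    training set alongside the original benchmark data."""
    generated: Dict[str, List[str]] = {}
    for event_type, samples in train_by_type.items():
        if len(samples) >= min_samples:
            continue  # only long-tail types need augmentation
        demos = random.sample(samples, k=min(3, len(samples)))
        prompt = build_annotation_prompt(event_type, demos)
        generated[event_type] = [llm_call(prompt)]
    return generated
```

In this reading, the fine-tuned baselines (e.g., BERT+CRF, DMBERT) are then retrained on the union of the original and parsed LLM-generated samples, which is where the reported improvement comes from.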