Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study

24 Feb 2024 | Zhaoyue Sun, Gabriele Pergola, Byron C. Wallace and Yulan He
This paper explores the use of ChatGPT for pharmacovigilance event extraction, i.e., identifying and extracting adverse events or potential therapeutic events from medical text. The study evaluates ChatGPT in zero-shot and few-shot settings and compares it against smaller fine-tuned models. While ChatGPT achieves reasonable performance with appropriate demonstration selection strategies, it still falls short of fully fine-tuned models.

The study compares several prompting strategies, including zero-shot prompting with detailed explanations of the event schema and few-shot prompting with different in-context selection strategies. Providing schema explanations improves performance for some arguments, but all approaches struggle with 'population' extraction. Among the baselines, fine-tuned models such as Flan-T5 and UIE outperform ChatGPT in most cases.

The research also investigates using ChatGPT for data augmentation, finding that including synthesized data can degrade performance due to noise in the generated labels. Filtering strategies are introduced to improve data quality and reduce performance variance, though consistent improvement remains challenging.

Finally, the paper discusses the limitations of applying ChatGPT to pharmacovigilance event extraction, including the difficulty of capturing intricate annotation rules and the risk of model bias when training on synthetic data. The structural complexity and fine granularity of event arguments make generating reliable synthetic data especially challenging.
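The "demonstration selection strategies" mentioned above can be illustrated with a minimal sketch. The paper does not publish its exact selection code, so this is a hedged, illustrative example of one common approach (nearest-neighbour retrieval of labelled examples by text similarity); the bag-of-words cosine similarity here is a stdlib stand-in for a proper sentence encoder, and the prompt wording and field names are assumptions, not the paper's schema.

```python
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts (stand-in for an encoder)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def select_demonstrations(query: str, pool: list[dict], k: int = 2) -> list[dict]:
    """Pick the k labelled examples most similar to the query text."""
    return sorted(pool, key=lambda ex: cosine_sim(query, ex["text"]), reverse=True)[:k]

def build_prompt(query: str, demos: list[dict]) -> str:
    """Assemble a few-shot extraction prompt from the selected demonstrations."""
    parts = ["Extract adverse events (drug, effect, population) from the text."]
    for d in demos:
        parts.append(f"Text: {d['text']}\nEvents: {d['events']}")
    parts.append(f"Text: {query}\nEvents:")
    return "\n\n".join(parts)
```

In practice the selected demonstrations would be prepended to the query and sent to the ChatGPT API; zero-shot prompting corresponds to calling `build_prompt` with an empty demonstration list and, per the paper, optionally adding a detailed schema explanation to the instruction.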
Future research could explore more diverse data augmentation strategies and the incorporation of annotations in the selection process to improve ChatGPT's performance in pharmacovigilance event extraction. The study concludes that while ChatGPT shows promise in few-shot learning, fine-tuned models remain more effective in the presence of abundant data.
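To make the filtering idea concrete: since the paper finds that synthesized labels can be noisy, one simple family of filters keeps only generated examples whose argument spans actually appear in the generated text (a grounding check). This is a hedged sketch of that idea, not the paper's exact filtering strategy; the field names ("text", "arguments") are illustrative assumptions.

```python
def is_grounded(example: dict) -> bool:
    """True if every labelled argument span occurs verbatim in the text."""
    text = example["text"].lower()
    return all(span.lower() in text for span in example["arguments"].values())

def filter_synthetic(examples: list[dict]) -> list[dict]:
    """Drop synthesized examples with hallucinated (non-grounded) spans."""
    return [ex for ex in examples if is_grounded(ex)]
```

A filter like this trades recall for label quality: it discards examples with hallucinated spans before they are mixed into the fine-tuning set, which is one way to reduce the performance variance the paper reports.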