FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs
8 Feb 2024 | Eun Cheol Choi, Emilio Ferrara
FACT-GPT is a system that uses large language models (LLMs) to automate the claim matching stage of fact-checking. It is trained on a synthetic dataset to identify social media content that aligns with, contradicts, or is irrelevant to previously debunked claims. Evaluation shows that specialized LLMs can match the accuracy of larger models in identifying related claims, closely mirroring human judgment. The research provides an automated solution for efficient claim matching, demonstrates the potential of LLMs to support fact-checkers, and offers resources for further work in the field.

The paper explores the potential of LLMs to support the claim matching stage of the fact-checking process and shows that, when fine-tuned appropriately, LLMs can match claims effectively. The framework could benefit fact-checkers by minimizing redundant verification, support online platforms in content moderation, and assist researchers in large-scale analysis of misinformation. The study focuses on public health misinformation, specifically COVID-19-related false claims that had already been fact-checked.
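To make the claim matching task concrete, the following is a minimal sketch of how an LLM could label a post against a previously debunked claim. This is not the paper's exact prompt or label scheme: the model name, prompt wording, and three-way label set (ENTAILMENT / CONTRADICTION / NEUTRAL, mirroring "aligns with / contradicts / irrelevant") are illustrative assumptions, and the code presumes the OpenAI Python SDK (>= 1.0) with an API key configured in the environment.

# Minimal claim matching sketch (illustrative; not the paper's exact prompt).
from openai import OpenAI

LABELS = ["ENTAILMENT", "CONTRADICTION", "NEUTRAL"]  # aligns with / contradicts / irrelevant

def match_claim(tweet: str, debunked_claim: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask an LLM whether a post entails, contradicts, or is neutral toward a debunked claim."""
    client = OpenAI()
    prompt = (
        "You will see a social media post and a previously debunked claim.\n"
        f"Post: {tweet}\n"
        f"Claim: {debunked_claim}\n"
        "Answer with exactly one word: ENTAILMENT if the post asserts the claim, "
        "CONTRADICTION if the post disputes it, or NEUTRAL if it is unrelated."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic labeling
    )
    answer = response.choices[0].message.content.strip().upper()
    return answer if answer in LABELS else "NEUTRAL"  # fall back on an unparseable reply

# Hypothetical usage:
# label = match_claim("5G towers are spreading the virus!", "5G networks cause COVID-19")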
A synthetic training dataset was generated with LLMs to obtain a balanced dataset for the claim matching task. The ground truth dataset was built by pairing tweets with debunked false claims and assigning each pair to one of three categories based on human annotations. The experiments compared several pre-trained LLMs against these human annotations. Models fine-tuned on the synthetic dataset outperformed their pre-trained counterparts, although they struggled to categorize posts that contradict debunked claims, indicating the need for further refinement.

The study underscores the potential of LLMs to augment the fact-checking process, particularly during the claim matching phase. It demonstrates that appropriately fine-tuned smaller LLMs can perform comparably to larger models, offering a more accessible and cost-effective AI solution without compromising quality. The research adds to a growing body of work on using LLMs to support human fact-checkers, providing a foundation for continued studies and the responsible development of AI tools to combat the spread of misinformation.
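As a rough illustration of the synthetic-data step described above, the sketch below generates tweets that assert, dispute, or are unrelated to a given debunked claim and writes them as chat-format fine-tuning records. It is not the authors' pipeline: the prompts, model name, temperature, and output file name are assumptions; only the JSONL chat format follows OpenAI's documented fine-tuning input format.

# Illustrative sketch of building a balanced synthetic training set for claim matching.
import json
from openai import OpenAI

client = OpenAI()
LABELS = ["ENTAILMENT", "CONTRADICTION", "NEUTRAL"]

def generate_synthetic_tweet(debunked_claim: str, label: str, model: str = "gpt-4") -> str:
    """Ask an LLM to write a tweet that stands in a given relation to a debunked claim."""
    instruction = {
        "ENTAILMENT": "asserts or implies the claim",
        "CONTRADICTION": "disputes or debunks the claim",
        "NEUTRAL": "is on a related topic but neither asserts nor disputes the claim",
    }[label]
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": f"Write a realistic tweet that {instruction}: '{debunked_claim}'",
        }],
        temperature=0.9,  # encourage varied wording across generated examples
    )
    return response.choices[0].message.content.strip()

def build_finetuning_file(debunked_claims: list[str], path: str = "claim_matching_train.jsonl"):
    """Write one chat-format training record per (claim, label) so each class is balanced."""
    with open(path, "w") as f:
        for claim in debunked_claims:
            for label in LABELS:
                tweet = generate_synthetic_tweet(claim, label)
                record = {
                    "messages": [
                        {"role": "user",
                         "content": f"Post: {tweet}\nClaim: {claim}\n"
                                    "Does the post entail, contradict, or stay neutral toward the claim?"},
                        {"role": "assistant", "content": label},
                    ]
                }
                f.write(json.dumps(record) + "\n")

Generating an equal number of examples per label is one simple way to obtain the balanced dataset the summary mentions; the resulting JSONL file could then be submitted to a fine-tuning job or adapted for open-weight models.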