Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning

April 16, 2024 | Teo Susnjak, Peter Hwang, Napoleon H. Reyes, Andre L. C. Barczak, Timothy R. McIntosh, Surangika Ranathunga
This research introduces a novel approach to automating Systematic Literature Reviews (SLRs) using fine-tuned Large Language Models (LLMs). The study leverages recent fine-tuning methodologies and open-source LLMs to automate the final stages of the SLR process, focusing on knowledge synthesis. The fine-tuned models maintain high factual accuracy in their responses, validated through the replication of a PRISMA-conforming SLR. The research addresses the challenge of LLM hallucination and proposes mechanisms for tracing LLM responses back to their sources, ensuring the reliability and integrity of the findings. The results confirm the potential of fine-tuned LLMs to streamline the labor-intensive stages of literature reviews, and the authors advocate updating the PRISMA reporting guidelines to accommodate AI-driven processes. The study broadens the applicability of AI-enhanced tools across academic and research fields, setting a new standard for comprehensive and accurate literature reviews.

Keywords: LLM Fine-tuning for SLRs · SLR Automation · Retrieval-Augmented Generation for Research · Domain-Specific Model Training · Knowledge Synthesis AI · AI-Driven Research Synthesis · Literature Review Automation · Generative AI · AI-Enhanced Systematic Reviews · PRISMA and AI Integration

Systematic Literature Reviews are crucial to academic research, integrating and synthesizing existing scholarly knowledge. Traditional SLRs, however, are manual and resource-intensive, leading to inefficiencies. AI systems such as LLMs offer a transformative opportunity to automate information retrieval while maintaining factual fidelity. Yet the broad, generalist pretraining of LLMs produces domain-specific inaccuracies and hallucinations, which are critical problems for SLRs. This study therefore proposes fine-tuning LLMs on the specific academic papers selected for a review, building domain expertise and improving the accuracy of knowledge synthesis.

The study contributes to the field of information retrieval by developing a methodical approach for converting selected academic papers into datasets suitable for LLM fine-tuning (a sketch of one possible conversion appears below). It demonstrates the effectiveness of the proposed framework through an empirical study that replicates a PRISMA-conforming SLR, which serves as a gold standard for validation.

The literature review examines the SLR process, focusing on the synthesis phase and the integration of LLMs and AI. It addresses the shortcomings of current SLR automation, the role of LLMs, and the need for factual accuracy, and it explores the potential of fine-tuning domain-specific LLMs for SLR tasks. Recent advances in AI, NLP, and machine learning have produced automation tools for literature reviews, but adoption remains limited owing to steep learning curves and inadequate support. LLMs such as GPT-3 have brought transformative possibilities to SLRs, yet hallucination and the lack of continuous-learning capability remain significant issues. Retrieval-Augmented Generation (RAG) combines LLMs with external knowledge sources to improve factual accuracy and mitigate hallucinations, and the study proposes domain-specific fine-tuning on the reviewed papers as a complementary route to the same goal; both steps are sketched below.
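To make the dataset-conversion step concrete, here is a minimal sketch of turning a folder of included papers into an instruction-style fine-tuning corpus in JSONL form. This is an illustration, not the authors' pipeline: the `papers/` directory layout, the chunk size, the prompt template, and the `source` provenance tag are all assumptions.

```python
import json
from pathlib import Path

CHUNK_CHARS = 1500  # assumed chunk size; the paper does not prescribe one

def chunk_text(text: str, size: int = CHUNK_CHARS) -> list[str]:
    """Split extracted paper text into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_dataset(paper_dir: str, out_path: str) -> None:
    """Convert plain-text papers into JSONL instruction records.

    Each record pairs a synthesis-style prompt with a source-tagged
    passage, so model responses can later be traced back to a paper.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        for paper in sorted(Path(paper_dir).glob("*.txt")):
            for n, chunk in enumerate(chunk_text(paper.read_text(encoding="utf-8"))):
                record = {
                    "instruction": f"Summarize the key findings in this excerpt of {paper.stem}.",
                    "input": chunk,
                    "output": "",  # to be filled by an annotator or a teacher model
                    "source": f"{paper.stem}#chunk{n}",  # provenance tag for traceability
                }
                out.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    build_dataset("papers/", "slr_finetune.jsonl")
```

Tagging every record with a `source` field is one simple way to support the response-to-source tracing that the paper emphasizes.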
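The paper reports applying recent fine-tuning methodologies to open-source LLMs; one widely used instance of that family is parameter-efficient LoRA fine-tuning. The sketch below shows that pattern with the Hugging Face `transformers`, `peft`, and `datasets` libraries. The base model, LoRA rank, and training hyperparameters are illustrative assumptions, not the authors' reported configuration.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE = "meta-llama/Llama-2-7b-hf"  # assumed open-source base model

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token  # causal-LM tokenizers often lack a pad token
model = AutoModelForCausalLM.from_pretrained(BASE)

# LoRA trains small low-rank adapter matrices instead of all model weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
))

data = load_dataset("json", data_files="slr_finetune.jsonl")["train"]

def tokenize(rec):
    # Concatenate each instruction record into a single training string.
    text = f"{rec['instruction']}\n{rec['input']}\n{rec['output']}"
    return tokenizer(text, truncation=True, max_length=1024)

data = data.map(tokenize, remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="slr-lora", num_train_epochs=3,
                           per_device_train_batch_size=1, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

An adapter produced this way is small enough to be archived alongside the review itself, which fits the paper's call for reporting standards that account for AI-driven steps.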
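The review also highlights Retrieval-Augmented Generation as a grounding mechanism. A minimal sketch of the retrieval half, using TF-IDF from scikit-learn rather than whatever retriever the authors used, could look like this; it reuses the hypothetical source-tagged chunks from the first sketch so that retrieved context carries provenance into the prompt.

```python
import json
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Load the source-tagged chunks produced by the dataset-conversion sketch.
chunks, sources = [], []
with open("slr_finetune.jsonl", encoding="utf-8") as f:
    for line in f:
        rec = json.loads(line)
        chunks.append(rec["input"])
        sources.append(rec["source"])

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(chunks)

def retrieve(question: str, k: int = 3) -> list[tuple[str, str]]:
    """Return the top-k (source, passage) pairs for grounding an LLM prompt."""
    scores = cosine_similarity(vectorizer.transform([question]), matrix)[0]
    return [(sources[i], chunks[i]) for i in scores.argsort()[::-1][:k]]

# Retrieved passages are prepended to the question, and their source tags
# let each generated statement be traced back to a specific paper.
for source, passage in retrieve("Which methods do the included studies use?"):
    print(source, "->", passage[:80])
```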