This systematic review explores the application of Generative Pre-trained Transformer (GPT) models and other Large Language Models (LLMs) in research, with a focus on data augmentation. From an initial pool of 412 scholarly works, the study selects 77 contributions that address three key research questions: (1) GPT in generating research data, (2) GPT in data analysis, and (3) GPT in research design. The review highlights the central role of GPT in data augmentation, the subject of 48 of the selected studies, and extends to its more proactive roles in critical data analysis and research design. A comprehensive classification framework organizes the literature into six main categories and 14 sub-categories, providing insight into the diverse applications of GPT to research data.
The study examines how GPT can generate synthetic data, enrich existing datasets, and improve machine learning models through data augmentation. It also explores GPT's natural language processing (NLP) capabilities, such as feature extraction, text simplification, and classification. The review covers domain-specific applications in healthcare, finance, and environmental science, together with the ethical and privacy concerns that GPT use raises. Additionally, it evaluates GPT's performance in statistical data analysis, research design, and problem-solving, demonstrating its potential to improve research efficiency and accuracy.
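To make the data-augmentation theme concrete, the sketch below shows one common pattern from this literature: using a GPT model to paraphrase labelled training examples so that a small dataset can be expanded for downstream model training. It is a minimal illustration, assuming the OpenAI Python SDK and an `OPENAI_API_KEY` environment variable; the model name, prompt wording, and `augment` helper are hypothetical choices, not a procedure prescribed by the reviewed studies.

```python
# Minimal sketch: GPT-based text data augmentation via label-preserving
# paraphrasing. Assumes the OpenAI Python SDK (`pip install openai`) and
# an API key in the OPENAI_API_KEY environment variable; the model name
# and prompt are illustrative assumptions, not choices from the review.
from openai import OpenAI

client = OpenAI()

def augment(text: str, label: str, n: int = 3) -> list[str]:
    """Generate n paraphrases of a training example that keep its label."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model would serve here
        n=n,
        temperature=0.9,      # higher temperature -> more lexical diversity
        messages=[
            {
                "role": "system",
                "content": (
                    "Paraphrase the user's text while preserving its meaning "
                    f"and its '{label}' sentiment. Reply with the paraphrase only."
                ),
            },
            {"role": "user", "content": text},
        ],
    )
    return [choice.message.content.strip() for choice in response.choices]

if __name__ == "__main__":
    # Each synthetic variant inherits the original example's label,
    # expanding a small labelled dataset for downstream training.
    for variant in augment("The battery life on this laptop is superb.", "positive"):
        print(variant)
```

The high sampling temperature trades some fidelity for lexical diversity; in practice, generated variants are typically screened, automatically or manually, before being added to a training set, a step that connects directly to the hallucination and bias concerns raised below.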
The review also addresses the limitations of GPT, including ethical issues, bias, hallucination, and sycophantic behavior. It emphasizes that integrating GPT into research requires adherence to ethical guidelines and methodological rigor. The study concludes that GPT offers significant potential for enhancing research data generation, analysis, and design, but that its use must be accompanied by ethical consideration and careful scrutiny to mitigate risks. Future research should focus on refining GPT models to reduce bias and hallucination, and on developing algorithms to address sycophantic behavior in generated content. The study provides a systematic framework for understanding and applying GPT in research, offering valuable insights for scholars seeking to integrate AI technologies into their work.