2024 | Chenyu Hou, Gaoxia Zhu, Juan Zheng, Lishan Zhang, Xiaoshan Huang, Tianlong Zhong, Shan Li, Hanxiang Du, Chin Lee Ker
This study explores the effectiveness of prompt engineering and fine-tuning approaches in GPT models for context-dependent and context-independent deductive coding in social annotation. The research aims to enhance the accuracy and reliability of coding processes, particularly in social annotation, where students collaboratively comment on digital materials. Context-dependent dimensions, such as theorizing, integration, and reflection, require a contextualized understanding of the comments in relation to the reading materials and previous comments. Context-independent dimensions, such as appraisal, questioning, social, curiosity, and surprise, rely more on the content of the comments themselves.
The study found that prompt engineering can achieve fair to substantial agreement with expert-labeled data across various coding dimensions, especially for context-independent dimensions. Fine-tuning GPT models using 102 expert-labeled examples further improved the accuracy, particularly for context-independent dimensions, and elevated the inter-rater reliability of context-dependent categories to moderate levels. The results suggest that GPT models can effectively assist in coding social annotation data, reducing human labor and time, especially with large unstructured datasets.
The study also highlights the importance of crafting effective prompts and the need for further research to optimize prompt-based and fine-tuning approaches. The findings have significant implications for educational practices, such as providing immediate feedback and scaffolding in social annotation platforms, and for future research on automated content analysis in qualitative research.
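To make the prompt-engineering approach concrete, the sketch below assembles a zero-shot deductive-coding prompt over the study's context-independent dimensions (appraisal, questioning, social, curiosity, surprise). The codebook descriptions and the `build_coding_prompt` helper are illustrative assumptions, not the authors' exact prompts or rubric wording.

```python
# Illustrative sketch: building a deductive-coding prompt for
# context-independent dimensions of social-annotation comments.
# The dimension names come from the study; the one-line descriptions
# and the helper function are hypothetical stand-ins.

CODEBOOK = {
    "appraisal": "The comment evaluates or judges the reading material.",
    "questioning": "The comment raises a question about the material.",
    "social": "The comment addresses or responds to other students.",
    "curiosity": "The comment expresses a desire to learn more.",
    "surprise": "The comment expresses astonishment at the content.",
}

def build_coding_prompt(comment: str) -> str:
    """Assemble a zero-shot deductive-coding prompt for one comment."""
    rubric = "\n".join(f"- {dim}: {desc}" for dim, desc in CODEBOOK.items())
    return (
        "You are a qualitative coder. Label the student comment below with "
        "every dimension that applies, using only these codes:\n"
        f"{rubric}\n\n"
        f"Comment: {comment}\n"
        "Answer with a comma-separated list of codes, or 'none'."
    )

prompt = build_coding_prompt("Wow, I did not expect this result at all!")
print(prompt)
```

The prompt string would then be sent to a GPT model via an API call; context-dependent dimensions such as theorizing or integration would additionally require the reading passage and prior comments to be included in the prompt, which is where the study found prompt engineering alone less reliable than fine-tuning.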