Implications of ChatGPT for Data Science Education

March 20–23, 2024 | Yiyin Shen, Xinyi Ai, Adalbert Gerald Soosai Raj, Rogers Jeffrey Leo John, Meenakshi Syamkumar
The paper "Implications of ChatGPT for Data Science Education" by Yiyin Shen, Xinyi Ai, Adalbert Gerald Soosai Raj, Rogers Jeffrey Leo John, and Meenakshi Syamkumar explores the performance of ChatGPT on Data Science (DS) assignments from three courses at two institutions. The study aims to assess whether ChatGPT can generate correct code for DS problems and to identify prompt engineering techniques that improve its performance.

**Key Findings:**

1. **Performance of Raw Prompt Generation:**
   - ChatGPT performs well on assignments with detailed dataset descriptions and progressive question prompts.
   - The correctness rate decreases as the course difficulty increases, with course 1 having an average correctness rate of 75.66%, course 2 of 52.65%, and course 3 of 22.9%.
2. **Impact of Prompt Engineering:**
   - Prompt engineering techniques such as breaking down steps into multiple prompts, providing additional dataset context, adding algorithmic details, adding specific instructions, and removing extraneous information significantly improve ChatGPT’s performance.
   - The average correctness rate for course 2 and course 3 with engineered prompts is 98.3% and 86.88%, respectively, with no failed generation cases.

**Discussion:**

- The study highlights the importance of prompt engineering in enhancing ChatGPT’s ability to solve DS problems.
- It suggests that educators should consider re-writing assignments to decouple dataset descriptions and provide more context, especially for advanced courses.
- The paper also emphasizes the need for students to develop skills in code review and debugging, as ChatGPT’s responses often contain logical errors that require human intervention.

**Conclusion:** The paper concludes that while ChatGPT can be a valuable tool for generating code, it is particularly effective for self-contained assignments with explicit dataset descriptions. Prompt engineering techniques are crucial for improving ChatGPT’s performance on more complex DS problems, and educators should incorporate these techniques into their curriculum design to enhance students’ problem-solving skills.
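
Three of the prompt engineering techniques named in the summary (splitting a task into multiple prompts, adding dataset context, and adding specific instructions) can be illustrated with a minimal Python sketch. This is not the authors' tooling: the `Assignment` class, the `build_prompts` function, and the `sales.csv` dataset are hypothetical names invented for this example, and the sketch only assembles prompt text without calling any model.

```python
from dataclasses import dataclass, field


@dataclass
class Assignment:
    """A hypothetical DS assignment question, split into its parts."""
    dataset_context: str  # explicit description of the dataset (columns, types)
    steps: list[str]      # the task broken into ordered sub-steps
    instructions: list[str] = field(default_factory=list)  # specific constraints


def build_prompts(assignment: Assignment) -> list[str]:
    """Return one prompt per step; each repeats the dataset context and instructions."""
    prompts = []
    total = len(assignment.steps)
    for number, step in enumerate(assignment.steps, start=1):
        parts = [f"Dataset: {assignment.dataset_context}"]
        parts.extend(f"Instruction: {text}" for text in assignment.instructions)
        parts.append(f"Step {number} of {total}: {step}")
        prompts.append("\n".join(parts))
    return prompts


# Hypothetical example data, made up for illustration only.
example = Assignment(
    dataset_context="sales.csv with columns date (YYYY-MM-DD), region (str), revenue (float)",
    steps=[
        "Load the file into a pandas DataFrame.",
        "Compute total revenue per region.",
    ],
    instructions=["Use pandas only.", "Return code without explanation."],
)

prompts = build_prompts(example)
for prompt in prompts:
    print(prompt, end="\n\n")
```

Each generated prompt is self-contained, which mirrors the summary's observation that ChatGPT does best on self-contained questions with explicit dataset descriptions. Sending the prompts one at a time, and reviewing each response before moving on, would also give students the code review practice the paper recommends.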