1 Aug 2024 | Majeed Kazemitabaar, Jack Williams, Ian Drosos, Tovi Grossman, Austin Z. Henley, Carina Negreanu, Advait Sarkar
This paper addresses the challenges of steering and verifying AI-generated results in data analysis tasks, particularly with conversational AI tools such as ChatGPT Data Analysis. The authors conducted a formative study with 15 participants to identify the limitations of these tools with respect to steering and verification. They then developed two new systems, PHASEWISE and STEPWISE, which aim to improve user control over and interaction with AI-assisted data analysis. PHASEWISE decomposes the task into three editable phases (assumptions, planning, and code), while STEPWISE breaks the task into step-by-step subgoals, each with editable assumptions and corresponding code. A controlled within-subjects experiment with 18 participants compared these systems against a conversational baseline. Users reported significantly greater control, and found intervention, correction, and verification easier, with PHASEWISE and STEPWISE than with the baseline. The study offers design guidelines and trade-offs for AI-assisted data analysis tools, emphasizing the importance of structured task decomposition and interactive assumptions for improving user experience and efficiency.