[slides] Alchemist: LLM-Aided End-User Development of Robot Applications

**Alchemist: LLM-Aided End-User Development of Robot Applications**

**Authors:** Ulas Berk Karli, Juo-Tung Chen, Victor Nikhil Antony, and Chien-Ming Huang

**Abstract:** Large Language Models (LLMs) have the potential to revolutionize end-user robot programming by shifting from traditional logic-based programming to an iterative, collaborative process where users specify desired outcomes while LLMs generate detailed specifications. The *Alchemist* system leverages LLMs to enable end-users to create, test, and run robot programs using natural language inputs, aiming to reduce the knowledge required for developing robot applications. The paper presents a detailed examination of the system design and an exploratory study involving true end-users to assess capabilities, usability, and limitations. Through the design, development, and evaluation of *Alchemist*, the authors derive lessons learned from using LLMs in robot programming, highlighting their potential as the next frontier for democratizing end-user development of robot applications.

**Key Contributions:**

1. An open-source, end-to-end system that utilizes LLMs to enable a collaborative and intuitive robot programming experience for end-users.
2. An exploratory study to test and understand system capabilities and usability.
3. A set of lessons learned to inform the design and development of future LLM-powered robot programming systems.

**System Overview:** *Alchemist* is designed to be robot-platform and LLM agnostic, supporting various settings and technical advancements. It integrates RViz for robot visualization, a chat-box for interacting with the LLM, and a terminal to run generated code. The system aims to facilitate programming with natural language, enable end-to-end robot development workflows, support varied programming proficiencies, visualize robot world and actions, and ensure system modularity.

**Exploratory Study:** An exploratory study was conducted to gauge the usability of *Alchemist* and understand its limitations. The study involved 10 participants, including 5 novices and 5 experts in robotics. The results showed that both groups had similar task completion times, with novices tending to debug their programs by prompting the LLM further rather than using the editor. Novice users also preferred step-by-step instructions over general functions. Overall, participants found the collaborative programming paradigm promising, especially in specialized domains like life sciences research laboratories.

**Lessons Learned:**

1. **LLM-Generated Code Reliability:** Enhancing LLM-generated code reliability through code verification and effective prompting is critical.
2. **Effective LLM Prompting:** Effective LLM prompting requires end-user training and dynamic context-dependent prompt enhancement.
3. **End-User Aversion to Direct Coding:** Introducing abstractions to minimize code complexities while retaining programmatic expressiveness can enhance user confidence in programming.

**Limitations and Future Work:** The study had limitations, including a small sample size and the need for more rigorous validation methods. Future work should explore deploying *Alchemist* …
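
The first lesson learned names code verification as one way to make LLM-generated code more reliable. The summary does not say how *Alchemist* verifies code, so the following is only a minimal sketch of one possible approach: statically checking generated Python against an allowlist before it runs. The function `verify_generated_code` and the robot API names in `ALLOWED_CALLS` (`move_to`, `open_gripper`, `close_gripper`) are hypothetical and are not taken from the paper or the system.

```python
import ast

# Hypothetical allowlist of callable names; these are illustrative
# assumptions, not the actual Alchemist robot API.
ALLOWED_CALLS = {"move_to", "open_gripper", "close_gripper", "print", "range"}


def verify_generated_code(source: str) -> list[str]:
    """Return a list of problems found in generated code (empty if none)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Unparseable code is rejected outright.
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            # Disallow imports so generated code cannot reach arbitrary modules.
            problems.append(f"line {node.lineno}: imports are not allowed")
        elif isinstance(node, ast.Call):
            func = node.func
            # Only plain-name calls (e.g. move_to(...)) can match the allowlist;
            # attribute calls such as os.system(...) are always flagged.
            name = func.id if isinstance(func, ast.Name) else None
            if name not in ALLOWED_CALLS:
                label = name or ast.unparse(func)
                problems.append(
                    f"line {node.lineno}: call to '{label}' is not allowed"
                )
    return problems
```

In a workflow like the one described above, the returned problem list could be appended to the next prompt so that the LLM repairs its own output, which would fit the observation that novices prefer to debug by prompting further rather than editing code. A static check of this kind catches only syntax errors and disallowed calls; it does not show that a program behaves correctly on the robot.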

Alchemist: LLM-Aided End-User Development of Robot Applications

March 11–14, 2024, Boulder, CO, USA | Ulas Berk Karli, Juo-Tung Chen, Victor Nikhil Antony, Chien-Ming Huang