MOSAIC: A Modular System for Assistive and Interactive Cooking

29 Feb 2024 | Huaxiaoyue Wang*, Kushal Kedia*, Juntao Ren*, Rahma Abdullah, Atiksh Bhadwaj, Angela Chao, Kelly Y Chen, Nathaniel Chin, Prithwish Dan, Xinyi Fan, Gonzalo Gonzalez-Pumariega, Aditya Kompella, Maximus Adrian Pace, Yash Sharma, Xiangwan Sun, Neha Sunkara, Sanjiban Choudhury
MOSAIC is a modular system that enables home robots to perform complex collaborative tasks, such as cooking with everyday users. Robots running MOSAIC interact with humans via natural language, coordinate with other robots, and manage an open vocabulary of everyday objects. The architecture is modular: it leverages multiple large-scale pre-trained models for general tasks like language and image recognition, and uses streamlined modules for task-specific control. It has three main components: an interactive task planner, visuomotor skills, and human motion forecasting. The task planner embeds large language models within a behavior tree, which reduces the complexity of reasoning required from the LLM and lowers the overall error rate. Visuomotor skills use a pre-trained vision-language model for object identification and a policy learned via RL in simulation for action selection. Human motion forecasting trains a forecasting model on large-scale human motion data, enabling robots to plan safe and legible actions in close proximity to humans. MOSAIC is evaluated on 60 end-to-end trials in which two robots collaborate with a human user to cook a combination of 6 recipes. It completes 68.3% (41/60) of these trials, with a subtask completion rate of 91.6%. Individual modules are also evaluated, with 180 episodes of visuomotor picking, 60 episodes of human motion forecasting, and 46 online user evaluations of the task planner. The paper also discusses the limitations of the current system and open challenges in this domain.
The project's website is at https://portal-cornell.github.io/MOSAIC/.
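
To illustrate how embedding a language model inside a behavior tree can narrow the reasoning asked of it, here is a minimal sketch. It is not MOSAIC's implementation: the node classes, the recipe names ("salad", "soup"), the prompt wording, and the keyword-matching stub_llm that stands in for a real LLM call are all assumptions made for this example. The tree handles control flow, and the LLM leaf only has to pick one option from a fixed list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

SUCCESS, FAILURE = "SUCCESS", "FAILURE"
Blackboard = Dict[str, object]  # shared state passed through the tree


class Node:
    def tick(self, bb: Blackboard) -> str:
        raise NotImplementedError


@dataclass
class Sequence(Node):
    """Succeeds only if every child succeeds, in order."""
    children: List[Node]

    def tick(self, bb: Blackboard) -> str:
        for child in self.children:
            if child.tick(bb) == FAILURE:
                return FAILURE
        return SUCCESS


@dataclass
class Fallback(Node):
    """Tries children in order and succeeds on the first success."""
    children: List[Node]

    def tick(self, bb: Blackboard) -> str:
        for child in self.children:
            if child.tick(bb) == SUCCESS:
                return SUCCESS
        return FAILURE


@dataclass
class LLMChoice(Node):
    """Leaf that asks the model one narrow question: choose one option."""
    llm: Callable[[str], str]
    options: List[str]
    out_key: str

    def tick(self, bb: Blackboard) -> str:
        prompt = (f"User said: {bb['utterance']}\n"
                  f"Answer with one of: {', '.join(self.options)}")
        answer = self.llm(prompt).strip().lower()
        if answer not in self.options:  # reject anything outside the list
            return FAILURE
        bb[self.out_key] = answer
        return SUCCESS


@dataclass
class Action(Node):
    """Leaf that runs a plain Python function returning True on success."""
    fn: Callable[[Blackboard], bool]

    def tick(self, bb: Blackboard) -> str:
        return SUCCESS if self.fn(bb) else FAILURE


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: keyword match on the user's line."""
    user_line = prompt.splitlines()[0].lower()
    for recipe in ("salad", "soup"):
        if recipe in user_line:
            return recipe
    return "unsure"


def start_recipe(bb: Blackboard) -> bool:
    bb["log"].append(f"start {bb['recipe']}")
    return True


def ask_clarification(bb: Blackboard) -> bool:
    bb["log"].append("ask user to clarify")
    return True


TREE = Fallback([
    Sequence([LLMChoice(stub_llm, ["salad", "soup"], "recipe"),
              Action(start_recipe)]),
    Action(ask_clarification),  # runs only if the branch above fails
])


def run(utterance: str) -> Blackboard:
    bb: Blackboard = {"utterance": utterance, "log": []}
    TREE.tick(bb)
    return bb
```

Calling run("Let's make a salad") records a "start salad" step, while an utterance that names no known recipe falls through to the clarification branch. In this sketch the tree, not the model, decides what happens after an invalid answer, which is one way a behavior tree can limit how much the LLM has to get right.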