Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity

4 Apr 2024 | Jake Varley*1, Sumeet Singh*1, Deepali Jain*1, Krzysztof Choromanski*1, Andy Zeng1, Somnath Basu Roy Chowdhury2, Avinava Dubey3, Vikas Sindhwani1
This paper presents a modular embodied AI system designed to perform complex tasks using two arms based on natural language instructions. The system integrates state-of-the-art Large Language Models (LLMs) for task planning, Vision-Language Models (VLMs) for semantic perception, and Point Cloud Transformers for grasping. It emphasizes safety and modularity, incorporating real-time trajectory optimizers and compliant tracking controllers to enable human-robot interaction. The system demonstrates zero-shot performance on tasks such as bi-arm sorting, bottle opening, and trash disposal, achieving high success rates while handling various challenges like part-level localization, coordinated manipulation, and safety constraints. The modular design allows for easy debugging and the integration of learned policies to enhance robustness. The paper also discusses related work in bi-arm robotics, foundation models for robotics, and AI safety, highlighting the system's contributions to these areas.
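
To make the modular structure described above more concrete, here is a minimal Python sketch of how a planner, a perception module, a grasp module, and a safety check might be composed behind narrow interfaces. It is not the paper's implementation: every name in it (`Step`, `run_pipeline`, the `stub_*` functions, `inside_workspace`, the `SCENE` table) is hypothetical, and the stubs stand in for the LLM, VLM, Point Cloud Transformer, and trajectory-level safety constraints only to show where each would plug in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Step:
    arm: str     # "left" or "right"
    action: str  # e.g. "pick"
    target: str  # object or part name to localize


# Each module is a plain callable, so it can be swapped or tested in isolation.
Planner = Callable[[str], List[Step]]
Perception = Callable[[str], Point]
Grasper = Callable[[Point], Point]
SafetyCheck = Callable[[Point], bool]


def run_pipeline(instruction: str, planner: Planner, perception: Perception,
                 grasper: Grasper, is_safe: SafetyCheck) -> List[str]:
    """Plan, localize, propose a grasp, and gate each step on a safety check."""
    log: List[str] = []
    for step in planner(instruction):
        location = perception(step.target)
        grasp = grasper(location)
        if not is_safe(grasp):
            log.append(f"{step.arm}: skipped {step.action} {step.target} (unsafe)")
            continue
        log.append(f"{step.arm}: {step.action} {step.target} at {grasp}")
    return log


# Hypothetical stand-ins for the learned components (assumptions, not real models).
SCENE: Dict[str, Point] = {"bottle": (0.4, 0.1, 0.0), "cap": (0.4, 0.1, 0.25)}


def stub_planner(instruction: str) -> List[Step]:
    # A real planner would query an LLM; this one returns a fixed two-arm plan.
    return [Step("left", "pick", "bottle"), Step("right", "pick", "cap")]


def stub_perception(name: str) -> Point:
    # A real perception module would query a VLM; this one reads a lookup table.
    return SCENE[name]


def stub_grasper(location: Point) -> Point:
    # A real grasp module would use point clouds; this one offsets the height.
    return (location[0], location[1], location[2] + 0.05)


def inside_workspace(grasp: Point) -> bool:
    # Toy safety constraint: reject grasps above an assumed height limit.
    return 0.0 <= grasp[2] <= 0.25


if __name__ == "__main__":
    for line in run_pipeline("open the bottle", stub_planner, stub_perception,
                             stub_grasper, inside_workspace):
        print(line)
```

Because each stage only sees the previous stage's output, a failure can be traced to a single module, and any stub could in principle be replaced by a learned policy without touching the others.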