Aug. 2024 | Haokun Liu, Yaonan Zhu, Kenji Kato, Atsushi Tsukahara, Izumi Kondo, Tadayoshi Aoyama, and Yasuhisa Hasegawa
This paper proposes a novel approach to enhancing the performance of LLM-based autonomous manipulation through Human-Robot Collaboration (HRC). The approach uses a prompted GPT-4 language model to decompose high-level language commands into sequences of motions that the robot can execute. The system also employs a YOLO-based perception algorithm to provide visual cues to the LLM, aiding in planning feasible motions within the specific environment. Additionally, an HRC method combining teleoperation and Dynamic Movement Primitives (DMP) is proposed, allowing the LLM-based robot to learn from human guidance. Real-world manipulation experiments were conducted on the Toyota Human Support Robot. The outcomes indicate that tasks requiring complex trajectory planning and reasoning about the environment can be accomplished efficiently by incorporating human demonstrations.
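To give a rough sense of the DMP component named above, the following is a minimal one-dimensional sketch of a standard Ijspeert-style discrete DMP learned from a single demonstrated trajectory. The parameter values and helper names here are illustrative assumptions, not the paper's implementation, which couples DMPs to teleoperated demonstrations on the real robot.

```python
# Minimal 1-D discrete DMP sketch (standard Ijspeert-style formulation).
# Parameter choices and function names are illustrative, not the paper's.

import numpy as np


def learn_dmp(y_demo, dt, n_basis=20, alpha_z=25.0, beta_z=6.25, alpha_x=3.0):
    """Fit forcing-term weights from one demonstrated trajectory."""
    tau = len(y_demo) * dt
    y0, g = y_demo[0], y_demo[-1]
    yd = np.gradient(y_demo, dt)
    ydd = np.gradient(yd, dt)

    # Canonical phase x decays from 1 toward 0 over the demonstration.
    t = np.arange(len(y_demo)) * dt
    x = np.exp(-alpha_x * t / tau)

    # Forcing term that would reproduce the demonstration exactly.
    f_target = tau**2 * ydd - alpha_z * (beta_z * (g - y_demo) - tau * yd)

    # Gaussian basis functions spread along the phase variable.
    centers = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
    widths = n_basis**1.5 / centers / alpha_x
    psi = np.exp(-widths * (x[:, None] - centers[None, :]) ** 2)

    # Locally weighted regression for each basis weight.
    xi = x * (g - y0)
    w = np.sum(psi * (xi * f_target)[:, None], axis=0) / \
        (np.sum(psi * (xi**2)[:, None], axis=0) + 1e-10)
    return dict(w=w, centers=centers, widths=widths, y0=y0, g=g,
                tau=tau, alpha_z=alpha_z, beta_z=beta_z, alpha_x=alpha_x)


def rollout(dmp, dt, g=None):
    """Integrate the DMP, optionally toward a new goal g."""
    g = dmp["g"] if g is None else g
    y, z, x = dmp["y0"], 0.0, 1.0
    out = []
    for _ in range(int(dmp["tau"] / dt)):
        psi = np.exp(-dmp["widths"] * (x - dmp["centers"]) ** 2)
        f = psi @ dmp["w"] / (psi.sum() + 1e-10) * x * (g - dmp["y0"])
        zd = (dmp["alpha_z"] * (dmp["beta_z"] * (g - y) - z) + f) / dmp["tau"]
        yd = z / dmp["tau"]
        xd = -dmp["alpha_x"] * x / dmp["tau"]
        z, y, x = z + zd * dt, y + yd * dt, x + xd * dt
        out.append(y)
    return np.array(out)


# Usage: learn from a demonstrated curve, then replay toward a new goal,
# which is how a demonstrated motion generalizes to a changed target.
demo = np.sin(np.linspace(0, np.pi / 2, 200))
dmp = learn_dmp(demo, dt=0.01)
traj = rollout(dmp, dt=0.01, g=1.5)
```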
The proposed system integrates an LLM with environmental information within the Robot Operating System (ROS) to construct an LLM-based autonomous system; this integration enables the translation of human commands into specific robotic motions. To extend the system's ability to execute complex tasks, an HRC method is adopted that guides robot motion through human demonstrations. The system relies on two libraries for motion execution: a basic library of pre-programmed motion functions, and a DMP library that stores updated motion-function sequences for sub-tasks.
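To make the two-library idea concrete, below is a minimal Python sketch of how such a structure might be organized. The motion function names (`move_to`, `grasp`, `release`) and the dictionary layout are assumptions for illustration only; the actual motions run through ROS and the real DMP entries encode demonstrated trajectories, which are omitted here.

```python
# Sketch of a two-library motion store with hypothetical function names.

from typing import Callable, Dict, List


def move_to(target: str) -> None:
    """Pre-programmed motion: move the end effector toward a named target."""
    print(f"moving to {target}")


def grasp(obj: str) -> None:
    """Pre-programmed motion: close the gripper on an object."""
    print(f"grasping {obj}")


def release(obj: str) -> None:
    """Pre-programmed motion: open the gripper."""
    print(f"releasing {obj}")


# Basic library: fixed motion functions the LLM may select and sequence.
BASIC_LIBRARY: Dict[str, Callable[[str], None]] = {
    "move_to": move_to,
    "grasp": grasp,
    "release": release,
}

# DMP library: motion-function sequences recorded or updated from human
# demonstrations, covering sub-tasks the basic library alone cannot solve.
DMP_LIBRARY: Dict[str, List[str]] = {
    "open_drawer": ["move_to('drawer_handle')", "grasp('drawer_handle')",
                    "move_to('pull_back')", "release('drawer_handle')"],
}
```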
The LLM selects motion functions from the basic library according to the natural language command and combines them with environmental information to generate Pythonic code. A hierarchical planning framework allows the LLM to decompose complex tasks into sub-tasks and execute the corresponding motion functions sequentially. The system also incorporates a teleoperation-based HRC framework for motion demonstration, enabling the LLM-based robot to learn from human demonstrations.
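As a hedged illustration of the kind of Pythonic plan the prompted LLM might emit for a decomposed task, the sketch below runs two sub-tasks sequentially over mocked perception output. The command, sub-task names, motion stubs, and the `detections` dictionary are assumptions for this example, not taken from the paper.

```python
# Sketch of an LLM-generated plan for a hypothetical command such as
# "put the bottle in the drawer". The motion stubs stand in for
# basic-library functions; YOLO detections are mocked as a dict.

def move_to(target: str) -> None:
    print(f"moving to {target}")


def grasp(obj: str) -> None:
    print(f"grasping {obj}")


def release(obj: str) -> None:
    print(f"releasing {obj}")


def pick_bottle(scene: dict) -> None:
    # Sub-task 1: approach the detected bottle and grasp it.
    move_to(scene["bottle"])
    grasp("bottle")


def place_in_drawer(scene: dict) -> None:
    # Sub-task 2: carry the bottle to the drawer and release it.
    move_to(scene["drawer"])
    release("bottle")


def execute_task(scene: dict) -> None:
    # Hierarchical plan: the high-level command is decomposed into
    # sub-tasks, each a short sequence of basic-library motion calls.
    pick_bottle(scene)
    place_in_drawer(scene)


if __name__ == "__main__":
    # Hypothetical perception output: object name -> detected location label.
    detections = {"bottle": "table", "drawer": "cabinet"}
    execute_task(detections)
```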
The system was tested on the Toyota Human Support Robot, demonstrating an average success rate of 79.5%, with 99.4% executability and 97.5% feasibility across various tasks. The results show that the system can effectively translate language commands into robot motions and integrate operator instructions to accomplish tasks that would otherwise be unachievable, marking a significant step toward improving the performance of LLM-based robots in real-world environments. Future research will focus on integrating LIDAR-derived point clouds and tactile sensing to further enhance the proposed LLM-based robot's performance in real-world settings.