RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation

22 Feb 2024 | Junting Chen*, Yao Mu*, Qiaojun Yu, Tianming Wei, Silang Wu, Zhecheng Yuan, Zhixuan Liang, Chao Yang, Kaipeng Zhang, Wenqi Shao, Yu Qiao, Huazhe Xu, Mingyu Ding, Ping Luo
RoboScript is a platform for generating deployable code for free-form manipulation tasks in both simulation and the real world. It addresses the gap between high-level task planning and low-level control by providing a unified interface to both simulated and real robots, checking generated code for syntax compliance and validating it in Gazebo simulation. It also includes a comprehensive benchmark for evaluating how well large language models (LLMs) reason about physical interactions and constraints.
The benchmark compares GPT-3.5, GPT-4, and Gemini on complex physical manipulation tasks, highlighting differences in their reasoning capabilities. RoboScript's pipeline covers task interpretation, object detection, pose estimation, grasp planning, and motion planning, enabling autonomous manipulation. The platform has been evaluated on multiple robot embodiments, including the Franka and UR5 arms, with various grippers, demonstrating its adaptability across robotic platforms. The paper also presents an in-depth ablation study of the system modules, measuring the impact of individual components on overall performance. Additionally, the benchmark evaluates how object geometry and perception accuracy affect real-world deployment, providing insight into the strengths and limitations of LLMs in robotic manipulation tasks.
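
The syntax-compliance step mentioned above can be pictured with a small sketch. It assumes the generated code is Python and that the platform exposes one tool function per pipeline stage. The function names below (`detect_objects`, `estimate_pose`, `plan_grasp`, `plan_motion`, `execute`) are hypothetical placeholders for illustration, not RoboScript's actual API. The check parses the generated script with Python's standard `ast` module and flags any plain function call that is not in the allowed set.

```python
import ast

# Hypothetical tool names, loosely one per pipeline stage named in the summary.
# These are illustrative assumptions, not the platform's real interface.
ALLOWED_CALLS = {
    "detect_objects",
    "estimate_pose",
    "plan_grasp",
    "plan_motion",
    "execute",
}


def check_generated_code(source: str) -> list[str]:
    """Return a list of problems found in LLM-generated code (empty if none)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # The script does not even parse, so report that and stop.
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        # Only plain-name calls such as plan_grasp(...) are checked here;
        # method calls like obj.method(...) are ignored in this sketch.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id not in ALLOWED_CALLS:
                problems.append(f"unknown tool call: {node.func.id}")
    return problems


if __name__ == "__main__":
    script = "pose = estimate_pose('apple')\ngrasp = plan_grasp(pose)\n"
    print(check_generated_code(script))
```

A script that passes this kind of static check would still need to be run in simulation before deployment, since parsing alone says nothing about whether the planned grasp or motion is physically feasible.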