29 Feb 2024 | Jonathan Yang*, Catherine Glossop†, Arjun Bhorkar‡, Dhruv Shah†, Quan Vuong‡, Chelsea Finn*, Dorsa Sadigh*, Sergey Levine†
This paper explores the effectiveness of cross-embodiment learning for robotic manipulation and navigation. The authors train a single goal-conditioned policy on 18 diverse datasets spanning manipulation, navigation, and driving, and use it to control robotic systems ranging from arms to drones and mobile manipulators. The policy is evaluated on a mobile manipulator without any data from that embodiment, demonstrating zero-shot generalization. The results show that co-training with navigation data improves manipulation performance, and vice versa: the policy achieves a 20% higher success rate on manipulation tasks when trained on both navigation and manipulation data. Navigation data also helps the manipulation policy generalize by producing embeddings that more accurately capture distance to the goal, and the policy transfers to new embodiments such as a quadrotor and a mobile manipulator. To enable this, the authors propose a unified action coordinate system and a policy architecture shared across embodiments. Overall, the experiments demonstrate that policies trained on heterogeneous data benefit from experience collected on other embodiments, improving both navigation and manipulation and highlighting the potential of large-scale robotic policies trained on diverse data to achieve better generalization and transfer across robotic systems.
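As a rough illustration of what a unified action coordinate system might look like in practice, the sketch below maps both a navigation waypoint and an end-effector delta into a single fixed-length relative-motion vector, so one goal-conditioned policy head can be trained on mixed batches. The names (`from_nav_waypoint`, `from_ee_delta`), the 7-D layout, and the zero-padding scheme are hypothetical assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch of a shared action space for cross-embodiment training.
# Both navigation (2D waypoint) and manipulation (6-DoF end-effector delta plus
# gripper) actions are expressed as relative motion in the robot's egocentric
# frame and padded to a common length. Layout and padding are assumptions.

import numpy as np

ACTION_DIM = 7  # [dx, dy, dz, droll, dpitch, dyaw, gripper]


def from_nav_waypoint(dx: float, dy: float, dyaw: float) -> np.ndarray:
    """Encode a 2D navigation waypoint relative to the robot base."""
    action = np.zeros(ACTION_DIM, dtype=np.float32)
    action[0], action[1] = dx, dy  # planar translation
    action[5] = dyaw               # heading change
    # dz, roll, pitch, and gripper stay zero for a ground navigation action.
    return action


def from_ee_delta(delta_xyz, delta_rpy, gripper: float) -> np.ndarray:
    """Encode a relative end-effector motion plus a gripper command."""
    action = np.zeros(ACTION_DIM, dtype=np.float32)
    action[0:3] = delta_xyz  # Cartesian translation of the end effector
    action[3:6] = delta_rpy  # rotation as roll/pitch/yaw deltas
    action[6] = gripper      # e.g. 1.0 = close, 0.0 = open
    return action


if __name__ == "__main__":
    nav = from_nav_waypoint(dx=0.5, dy=0.1, dyaw=0.2)
    manip = from_ee_delta([0.02, 0.0, -0.01], [0.0, 0.0, 0.1], gripper=1.0)
    # Both actions live in the same 7-D space, so navigation and manipulation
    # data can be interleaved when training a single policy.
    print(nav.shape, manip.shape)
```

The point of such a shared frame is that the policy never needs to know which embodiment produced a training example; it only ever predicts relative motion in one coordinate system.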