UMI on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers

14 Jul 2024 | Huy Ha*, Yihuai Gao*, Zipeng Fu, Jie Tan, and Shuran Song
UMI-on-Legs is a new framework that combines real-world and simulation data for quadruped manipulation systems. Task-centric data is collected in the real world with a hand-held gripper (UMI), a cost-effective way to demonstrate manipulation skills without a robot, while robot-centric data is scaled in simulation by training a whole-body controller for task tracking, with no task-specific simulation setup required. The interface between the two policies is a set of end-effector trajectories in the task frame: the manipulation policy infers them, and the whole-body controller tracks them. This design allows existing "table-top" manipulation policies to be ported to mobile manipulation while gaining the mobility and power of the quadruped's legs.

UMI-on-Legs was evaluated on prehensile, non-prehensile, and dynamic manipulation tasks, including cup rearrangement and dynamic tossing, the latter requiring dynamic whole-body coordination. It achieved over 70% success rate on all tasks, with the controller handling unexpected dynamics and object interactions. Additionally, a pre-trained manipulation policy from prior work was successfully deployed on the quadruped system, demonstrating zero-shot cross-embodiment deployment.

The system comprises a diffusion-based manipulation policy, a whole-body controller trained in simulation, and a real-time odometry solution based on Apple's ARKit that enables robust, accessible in-the-wild task-space tracking. Together, these components address key challenges in cross-embodiment manipulation and provide a scalable path toward learning expressive manipulation skills on dynamic robot embodiments.
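The division of labor described above can be sketched as a simple control loop. This is a minimal illustration, not the paper's implementation: the function names, trajectory encoding (position, quaternion, gripper width), and joint layout are all assumptions made for the sketch.

```python
import numpy as np

def manipulation_policy(observation: dict) -> np.ndarray:
    """Stand-in for a diffusion-based manipulation policy.

    Returns a short horizon of end-effector waypoints in the task
    frame, each encoded as [x, y, z, qx, qy, qz, qw, gripper_width]
    (encoding assumed for illustration).
    """
    # A real policy would denoise an action trajectory conditioned on
    # wrist-camera images and proprioception; here we emit a fixed
    # dummy trajectory so the loop is runnable.
    horizon = 8
    traj = np.zeros((horizon, 8))
    traj[:, 6] = 1.0   # identity orientation (unit quaternion, w = 1)
    traj[:, 7] = 0.08  # gripper open width in meters (assumed)
    return traj

def whole_body_controller(ee_targets: np.ndarray,
                          proprio: np.ndarray) -> np.ndarray:
    """Stand-in for the whole-body tracking controller.

    Maps task-frame end-effector targets plus proprioceptive state to
    joint targets for legs and arm; a real controller would be an
    RL-trained network queried at a high control rate.
    """
    num_joints = 18  # hypothetical layout: 12 leg + 6 arm joints
    return np.zeros(num_joints)

# One tick of the loop: the manipulation policy runs at a low rate and
# produces a trajectory; the whole-body controller re-tracks the latest
# trajectory at a higher rate.
obs = {"wrist_rgb": None, "proprio": np.zeros(18)}
ee_trajectory = manipulation_policy(obs)
joint_targets = whole_body_controller(ee_trajectory, obs["proprio"])
```

The key point of the interface is that neither side needs to know the other's internals: the policy never sees leg joints, and the controller never sees task semantics, only a trajectory to track.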
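Because the manipulation policy outputs targets in a fixed task frame while the quadruped base moves, the odometry estimate is what ties the two together: it lets each task-frame target be re-expressed in the robot base frame before tracking. The following is a hedged sketch of that standard SE(3) change of frame using NumPy; the specific poses are placeholder values, not data from the paper.

```python
import numpy as np

def pose_to_matrix(xyz, quat_wxyz):
    """Build a 4x4 homogeneous transform from a position and a
    unit quaternion (w, x, y, z)."""
    w, x, y, z = quat_wxyz
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = xyz
    return T

# Odometry (e.g. a phone running ARKit mounted on the robot) reports
# the robot base pose in the task/world frame. Placeholder values:
T_world_base = pose_to_matrix([0.5, 0.0, 0.3], [1.0, 0.0, 0.0, 0.0])

# A task-frame end-effector target from the manipulation policy:
T_world_target = pose_to_matrix([1.0, 0.2, 0.5], [1.0, 0.0, 0.0, 0.0])

# Re-express the target in the robot base frame for the controller:
# T_base_target = (T_world_base)^-1 @ T_world_target
T_base_target = np.linalg.inv(T_world_base) @ T_world_target
```

Any drift in the odometry estimate shows up directly as tracking error in the task frame, which is why a robust real-time pose source matters for in-the-wild deployment.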