Learning Dexterous In-Hand Manipulation


18 Jan 2019 | Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafał Józefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng, Wojciech Zaremba
This paper presents a method for learning dexterous in-hand manipulation with reinforcement learning (RL) on a physical Shadow Dexterous Hand. Training is conducted entirely in simulation, where physical properties and object appearances are randomized to encourage generalization. The learned policies are then transferred to the real robot, where they demonstrate unprecedented levels of dexterity and emergent, natural behaviors such as finger gaiting, multi-finger coordination, and the controlled use of gravity. The system also includes a vision-based pose estimator, trained separately, that predicts object poses from camera images. Despite never being trained on real-world data, the policies perform well on the physical robot; the authors attribute this to extensive randomization, memory-augmented policies, and large-scale distributed training. The paper further analyzes how much randomization and memory contribute to policy performance and evaluates the sample complexity and scale of the training process. The results show that the learned policies transfer successfully to the real robot, achieving a high level of dexterity on in-hand manipulation tasks.
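
The key sim-to-real ingredient is domain randomization: physics parameters are resampled at the start of every episode, so the fixed real-world dynamics look like just one more sample from the training distribution. The sketch below illustrates the general idea, assuming a gym-style environment; `base_env`, the `set_physics` setter, and the parameter names and ranges are hypothetical placeholders, not the paper's actual randomization set.

```python
import numpy as np

# Illustrative randomization ranges (hypothetical, not the paper's values).
RANDOMIZATION_RANGES = {
    "object_mass_scale":   (0.5, 1.5),   # multiplicative scale on object mass
    "friction_scale":      (0.7, 1.3),   # multiplicative scale on surface friction
    "actuator_gain_scale": (0.75, 1.5),  # multiplicative scale on motor gains
    "observation_noise":   (0.0, 0.01),  # std of additive sensor noise
}

class RandomizedEnv:
    """Wraps a base simulated environment and resamples physics at every reset."""

    def __init__(self, base_env, rng=None):
        self.base_env = base_env
        self.rng = rng or np.random.default_rng()
        self.obs_noise = 0.0

    def reset(self):
        # Sample a fresh set of physics parameters for this episode.
        params = {
            name: self.rng.uniform(low, high)
            for name, (low, high) in RANDOMIZATION_RANGES.items()
        }
        self.base_env.set_physics(params)  # hypothetical setter on the simulator
        self.obs_noise = params["observation_noise"]
        return self._noisy(self.base_env.reset())

    def step(self, action):
        obs, reward, done, info = self.base_env.step(action)
        return self._noisy(obs), reward, done, info

    def _noisy(self, obs):
        # Add Gaussian sensor noise so the policy cannot rely on clean readings.
        return obs + self.rng.normal(0.0, self.obs_noise, size=obs.shape)
```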
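
Memory matters because the policy never observes the sampled physics directly: a recurrent policy can infer the current episode's dynamics from its history of observations and adapt within the episode. Below is a minimal sketch of a memory-augmented policy in PyTorch; the layer sizes and observation/action dimensions are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """LSTM policy that carries hidden state (memory) across timesteps."""

    def __init__(self, obs_dim=24, act_dim=20, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, act_dim)

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, time, obs_dim); state carries memory across calls.
        x = torch.relu(self.encoder(obs_seq))
        x, state = self.lstm(x, state)
        return self.head(x), state

# Usage: keep the hidden state across timesteps within an episode so the
# policy can implicitly identify that episode's randomized dynamics.
policy = RecurrentPolicy()
obs = torch.zeros(1, 1, 24)  # one environment, one timestep
state = None
action, state = policy(obs, state)
```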