26 Jun 2018 | Aravind Rajeswaran1*, Vikash Kumar1,2*, Abhishek Gupta3, Giulia Vezzani1, John Schulman2, Emanuel Todorov1, Sergey Levine3
This paper addresses the challenge of controlling complex, multi-fingered hands in human-centric environments, which are versatile but difficult to manage due to their high dimensionality and numerous potential contacts. The authors propose a model-free deep reinforcement learning (DRL) approach to control such hands, demonstrating its effectiveness in simulated experiments with a 24-DoF hand. They show that with a small number of human demonstrations, the sample complexity can be significantly reduced, enabling learning with sample sizes equivalent to a few hours of robot experience. The use of demonstrations results in policies that exhibit natural movements and robustness to environmental variations. The paper also introduces a set of dexterous manipulation tasks, including object relocation, in-hand manipulation, tool use, and door opening, which are designed to capture the technical challenges of real-world manipulation. The authors compare their DRL methods with existing approaches, showing that their proposed method, DAPG (Demonstration Augmented Policy Gradient), outperforms other state-of-the-art methods in terms of sample efficiency and policy robustness. The results suggest that DRL, when combined with demonstrations, is a viable approach for real-world learning of complex dexterous manipulation skills.
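As a rough illustration of the idea behind DAPG, the policy gradient can be augmented with an extra term computed on demonstration state-action pairs, whose weight decays over training iterations so the demonstrations guide early exploration without constraining the final policy. The sketch below is an assumption-laden simplification (function name, weighting heuristic `lam0 * lam1**k * max(advantage)`, and array layout are illustrative, not the authors' exact implementation):

```python
import numpy as np

def dapg_gradient(grad_logp_pi, adv_pi, grad_logp_demo,
                  lam0=0.1, lam1=0.95, k=0):
    """Sketch of a demonstration-augmented policy gradient.

    grad_logp_pi:   (N, d) score vectors grad log pi(a|s) for on-policy samples
    adv_pi:         (N,)   advantage estimates for those samples
    grad_logp_demo: (M, d) score vectors evaluated on demonstration pairs
    k:              current training iteration (demo weight decays with k)
    """
    # Standard policy-gradient term over on-policy data
    g_rl = (grad_logp_pi * adv_pi[:, None]).mean(axis=0)
    # Demonstration term with a heuristic, decaying weight
    w = lam0 * (lam1 ** k) * np.max(adv_pi)
    g_demo = w * grad_logp_demo.mean(axis=0)
    return g_rl + g_demo
```

In practice the policy is typically also pre-trained with behavior cloning on the demonstrations before this augmented gradient is applied, which is what allows learning with only a few hours' worth of simulated robot experience.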