Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization

27 May 2016 | Chelsea Finn, Sergey Levine, Pieter Abbeel
This paper introduces guided cost learning, a method that uses policy optimization to learn complex behaviors from expert demonstrations. It addresses two key challenges in inverse optimal control (IOC): learning arbitrary nonlinear cost functions without manual feature engineering, and learning cost functions under unknown dynamics for high-dimensional systems. The approach combines sample-based maximum entropy IOC with forward reinforcement learning using time-varying linear models. Trajectories are sampled adaptively to estimate the IOC partition function, and policy optimization steers the sampling distribution toward the regions that are most useful for that estimate, as formalized below.
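Concretely, maximum entropy IOC models demonstrations as drawn from p(τ) ∝ exp(−c_θ(τ)) and minimizes their negative log-likelihood, with the intractable partition function estimated by importance sampling. The formulation below is a standard statement of this objective rather than a verbatim transcription from the paper:

```latex
% MaxEnt IOC: demonstrations are modeled as p(\tau) = \exp(-c_\theta(\tau)) / Z.
% Negative log-likelihood of N demos, with Z estimated from M trajectories
% \tau_j drawn from an adaptive sampling distribution q:
\mathcal{L}_{\text{IOC}}(\theta)
  = \frac{1}{N} \sum_{\tau_i \in \mathcal{D}_{\text{demo}}} c_\theta(\tau_i)
  + \log \frac{1}{M} \sum_{\tau_j \sim q}
      \frac{\exp\!\left(-c_\theta(\tau_j)\right)}{q(\tau_j)}
```

Policy optimization refits q toward low-cost trajectories, which is what lowers the variance of this importance-sampled estimate of Z.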
The cost function is represented by a neural network, so complex, expressive costs can be learned without hand-engineered features, and two regularization techniques for IOC, one general and one specific to episodic domains, keep the learned cost well behaved. Because the algorithm learns both a cost function and a policy that executes the desired behavior, it remains effective on tasks that are too complex for a good global cost function to be recovered from a small number of demonstrations. It handles unknown dynamics and high-dimensional systems, and can be run on real physical systems with a modest number of samples. On a set of simulated benchmark tasks and on two real-world tasks learned directly from human demonstrations, the method outperforms prior approaches in both task complexity and sample efficiency, including manipulation behaviors that require torque control and vision. A sketch of one cost-learning update follows.
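The code below is a minimal sketch of a single cost-update step in this style, assuming a PyTorch cost network over per-step trajectory features. All names, shapes, and the optional smoothness regularizer are illustrative assumptions, not the paper's actual code:

```python
import torch
import torch.nn as nn

class CostNet(nn.Module):
    """Hypothetical per-step cost network: feature vector -> scalar cost."""
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats):           # feats: (batch, T, feat_dim)
        return self.mlp(feats)[..., 0]  # (batch, T) per-step costs

def ioc_loss(cost_net, demo_feats, samp_feats, samp_log_q):
    """Sample-based MaxEnt IOC loss: mean demonstration cost plus the log
    of an importance-sampled estimate of the partition function Z."""
    demo_costs = cost_net(demo_feats).sum(dim=1)   # (N,) trajectory costs
    samp_costs = cost_net(samp_feats).sum(dim=1)   # (M,)
    m = torch.tensor(float(samp_costs.shape[0]))
    # log Z ~= logsumexp_j(-c(tau_j) - log q(tau_j)) - log M  (stable form)
    log_z = torch.logsumexp(-samp_costs - samp_log_q, dim=0) - torch.log(m)
    return demo_costs.mean() + log_z

def smoothness_regularizer(step_costs):
    """Second-difference penalty on the cost along each trajectory; a
    stand-in for the paper's general regularizer (exact form assumed)."""
    d2 = step_costs[:, 2:] - 2 * step_costs[:, 1:-1] + step_costs[:, :-2]
    return (d2 ** 2).mean()

# Usage: alternate this update with policy optimization that refits the
# sampling distribution q toward low-cost regions (the "guiding" step).
feat_dim, T = 10, 50
net = CostNet(feat_dim)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
demo = torch.randn(20, T, feat_dim)    # placeholder demonstration features
samp = torch.randn(40, T, feat_dim)    # placeholder policy samples
log_q = torch.randn(40)                # placeholder sample log-probs under q
loss = ioc_loss(net, demo, samp, log_q) \
       + 1e-3 * smoothness_regularizer(net(demo))
opt.zero_grad(); loss.backward(); opt.step()
```

The logsumexp form matters in practice: exponentiating raw trajectory costs directly would overflow for long horizons, so the estimate of log Z is computed entirely in log space.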