Generative Adversarial Imitation Learning

10 Jun 2016 | Jonathan Ho, Stefano Ermon
This paper proposes a new framework for learning policies directly from expert demonstrations, without further interaction with the expert and without access to a reinforcement signal. The framework draws an analogy between imitation learning and generative adversarial networks (GANs), leading to a model-free imitation learning algorithm that outperforms existing methods at imitating complex behaviors in large, high-dimensional environments. The algorithm is derived from a particular choice of cost regularizer whose induced objective drives the learned policy to match the expert's occupancy measure; specifically, it minimizes the Jensen-Shannon divergence between the occupancy measures of the learner and the expert.

Evaluated on a range of physics-based control tasks, the method shows significant performance improvements over baselines such as behavioral cloning, feature expectation matching, and game-theoretic apprenticeship learning. Because it is model-free, it scales to large environments and complex tasks. The paper also develops the theoretical foundations of the approach, including the connection between imitation learning and GANs and the use of convex optimization techniques to derive the algorithm. The results demonstrate that the proposed method imitates expert behavior with high fidelity, even from limited demonstration data.
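Concretely, the GAN analogy can be made precise: the algorithm (GAIL) seeks a saddle point of an objective in which a discriminator D is trained to distinguish state-action pairs generated by the learner's policy π from those of the expert policy π_E, while π is trained to fool it. With H(π) denoting the γ-discounted causal entropy of π and λ ≥ 0 its weight, the paper's objective is

    \min_{\pi} \max_{D \in (0,1)^{\mathcal{S} \times \mathcal{A}}}
        \mathbb{E}_{\pi}\big[\log D(s, a)\big]
        + \mathbb{E}_{\pi_E}\big[\log\big(1 - D(s, a)\big)\big]
        - \lambda H(\pi)

The paper alternates a gradient step on D with a TRPO step on π, using log D(s, a) as the policy's cost. The sketch below illustrates that alternating structure in PyTorch. It is a minimal illustration under stated assumptions, not the paper's implementation: the placeholder dimensions, network sizes, and the substitution of a plain REINFORCE update (with per-step rewards rather than discounted returns) for TRPO are all simplifications.

    # Minimal GAIL-style training sketch (illustrative, not the paper's exact
    # implementation). Assumptions: flat observations, discrete actions, and
    # batches of (state, action) pairs already collected from the current
    # policy and from the expert. The paper's TRPO policy step is replaced
    # here by a vanilla REINFORCE step for brevity.
    import torch
    import torch.nn as nn

    obs_dim, n_actions = 4, 2  # placeholder sizes, e.g. CartPole-v1

    policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                           nn.Linear(64, n_actions))      # action logits
    disc = nn.Sequential(nn.Linear(obs_dim + n_actions, 64), nn.Tanh(),
                         nn.Linear(64, 1))                # logits of D(s, a)

    pi_opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
    d_opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
    bce = nn.BCEWithLogitsLoss()

    def one_hot(a):
        return torch.eye(n_actions)[a]

    def gail_step(agent_s, agent_a, expert_s, expert_a, lam=1e-3):
        """One alternating update: fit the discriminator, then the policy."""
        agent_sa = torch.cat([agent_s, one_hot(agent_a)], dim=1)
        expert_sa = torch.cat([expert_s, one_hot(expert_a)], dim=1)

        # Discriminator step: label agent samples 1 and expert samples 0,
        # so minimizing BCE maximizes E_pi[log D] + E_piE[log(1 - D)].
        d_loss = bce(disc(agent_sa), torch.ones(len(agent_sa), 1)) + \
                 bce(disc(expert_sa), torch.zeros(len(expert_sa), 1))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

        # Policy step: REINFORCE on the surrogate reward -log D(s, a)
        # (minimizing the cost log D), plus a causal-entropy bonus.
        with torch.no_grad():
            reward = -torch.log(torch.sigmoid(disc(agent_sa)) + 1e-8).squeeze(1)
        dist = torch.distributions.Categorical(logits=policy(agent_s))
        pi_loss = -(dist.log_prob(agent_a) * reward).mean() \
                  - lam * dist.entropy().mean()
        pi_opt.zero_grad(); pi_loss.backward(); pi_opt.step()

    # Example call with random placeholder batches:
    # s = torch.randn(32, obs_dim); a = torch.randint(n_actions, (32,))
    # es = torch.randn(32, obs_dim); ea = torch.randint(n_actions, (32,))
    # gail_step(s, a, es, ea)

In practice, the agent's (state, action) batches would come from rolling out the current policy in the environment, and the policy update would use TRPO (or another trust-region method) on the full discounted return of the surrogate reward, as in the paper.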