November 2009 | Lucian Buşoniu, Robert Babuška, Bart De Schutter, and Damien Ernst
This book provides an in-depth treatment of reinforcement learning (RL) and dynamic programming (DP) using function approximators. It begins with an introduction to classical DP and RL, followed by an extensive review of state-of-the-art approaches that use function approximation. Theoretical guarantees are provided, and numerical examples illustrate the properties of individual methods. The remaining chapters present detailed algorithms from three major classes: value iteration, policy iteration, and policy search. The properties and performance of these algorithms are highlighted in simulation and experimental studies on various control applications.
The book is suitable for researchers, teachers, graduate students, and practitioners in optimal and adaptive control, machine learning, and artificial intelligence. It is structured to provide a balanced combination of practical algorithms, theoretical analysis, and comprehensive examples. Readers unfamiliar with the field are advised to start with Chapter 1, followed by Chapters 2 and 3. Those familiar with RL and DP can skip to Chapter 3. The book is divided into three main parts: an introduction to DP and RL, methods with function approximation, and detailed algorithms. Each chapter includes experimental studies and examples, with a focus on control applications.
The book discusses the challenges of representing solutions for large and continuous state-action spaces, emphasizing the need for function approximators. It covers approximation architectures, including parametric and nonparametric methods, and their convergence properties. The text also explores various algorithms for approximate value iteration, policy iteration, and policy search, with a focus on their performance in control applications.
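To make the idea of approximate value iteration with a parametric architecture concrete, here is a minimal, self-contained sketch. The two-state MDP, its transition probabilities, and rewards are made-up numbers chosen only for illustration; the one-hot features make the linear approximator exact so the backup and projection steps are easy to follow. This is an illustrative sketch of the general technique, not code from the book.

```python
import numpy as np

n_states, n_actions = 2, 2
gamma = 0.9  # discount factor

# Hypothetical MDP: transition probabilities P[s, a, s'] and rewards R[s, a].
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.9, 0.1]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

# Linear parametric approximation: Q(s, a) ~ phi(s, a) . theta.
# One-hot features keep the example exact and easy to check.
def phi(s, a):
    f = np.zeros(n_states * n_actions)
    f[s * n_actions + a] = 1.0
    return f

theta = np.zeros(n_states * n_actions)
for _ in range(200):  # approximate Q-iteration sweeps
    new_theta = np.zeros_like(theta)
    for s in range(n_states):
        for a in range(n_actions):
            # Bellman backup: r(s, a) + gamma * E[max_a' Q(s', a')]
            target = R[s, a] + gamma * sum(
                P[s, a, s2] * max(phi(s2, a2) @ theta
                                  for a2 in range(n_actions))
                for s2 in range(n_states))
            # Projection onto the approximator; with orthonormal one-hot
            # features this reduces to assigning the target per (s, a).
            new_theta += phi(s, a) * target
    theta = new_theta

# Greedy policy extracted from the converged approximate Q-function.
greedy_policy = [int(np.argmax([phi(s, a) @ theta
                                for a in range(n_actions)]))
                 for s in range(n_states)]
print(greedy_policy)
```

With richer features (e.g. radial basis functions over a continuous state space) the projection step becomes a least-squares fit rather than an assignment, which is where the convergence questions the book analyzes arise.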
The authors emphasize the importance of function approximators in solving real-world control problems, particularly in scenarios where a model is not available. The book provides a comprehensive overview of RL and DP, with a focus on their application in control systems. It includes detailed discussions on approximation methods, theoretical guarantees, and experimental studies, making it a valuable resource for researchers and practitioners in the field. The book is supported by a website with additional information, including computer code used in experimental studies.