14 Mar 2024 | Nicholas Zolman, Urban Fasel, J. Nathan Kutz, and Steven L. Brunton
SINDy-RL: Interpretable and Efficient Model-Based Reinforcement Learning
This paper introduces SINDy-RL, a framework that combines sparse dictionary learning (SINDy) with deep reinforcement learning (DRL) to create efficient, interpretable, and trustworthy models of environment dynamics, reward functions, and control policies. SINDy-RL achieves performance comparable to state-of-the-art DRL algorithms with significantly fewer environment interactions and yields an interpretable control policy that is orders of magnitude smaller than a deep neural network policy.
SINDy is a sparse dictionary learning method that learns a representation of a function as a sparse linear combination of pre-chosen candidate dictionary functions. It has been widely used to discover dynamics in fluid mechanics, including reduced-order models and turbulence closures. SINDy has also been extended to systems with actuation and control and used for designing model predictive control laws.
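For concreteness, the sketch below shows what fitting such a sparse dictionary model with a control input might look like using the open-source PySINDy package; the toy data, polynomial library, and threshold are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np
import pysindy as ps

# Toy trajectory data for a forced, damped oscillator: states x(t) and control u(t)
# sampled at a fixed timestep dt (illustrative data only).
dt = 0.01
t = np.arange(0, 10, dt)
u = np.sin(t).reshape(-1, 1)
x = np.zeros((len(t), 2))
x[0] = [1.0, 0.0]
for k in range(len(t) - 1):
    dx = np.array([x[k, 1], -x[k, 0] - 0.1 * x[k, 1] + u[k, 0]])
    x[k + 1] = x[k] + dt * dx

# SINDy with control: regress the time derivative of x onto a polynomial
# dictionary of (x, u), with sequentially thresholded least squares (STLSQ)
# zeroing out small coefficients to enforce sparsity.
model = ps.SINDy(
    optimizer=ps.STLSQ(threshold=0.05),
    feature_library=ps.PolynomialLibrary(degree=2),
)
model.fit(x, t=dt, u=u)
model.print()  # prints the recovered sparse symbolic equations
```

The printed model is a short list of symbolic terms, which is what makes the learned dynamics easy to inspect.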
SINDy-RL uses sparse dictionary learning to create surrogate environments for training DRL policies. It first collects offline data from the full-order environment, fits an ensemble of SINDy models to approximate the environment's dynamics, and then trains a policy in the surrogate environment using model-free DRL. In systems where the reward is difficult to measure directly, SINDy-RL learns an ensemble of sparse dictionary models to form a surrogate reward function. After training a DRL policy, SINDy-RL uses an ensemble of dictionary models to learn a lightweight, symbolic policy that can be readily transferred to an embedded system.
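As a rough illustration of this Dyna-style loop, the sketch below wraps an ensemble of fitted dictionary models in a gymnasium-compatible environment so that surrogate rollouts can be fed to an off-the-shelf model-free algorithm. The `SurrogateEnv` class, the `ensemble` interface (assumed to map a state-action pair to the next state), and `reward_fn` are hypothetical placeholders, not the paper's implementation.

```python
import numpy as np
import gymnasium as gym


class SurrogateEnv(gym.Env):
    """Environment whose dynamics come from an ensemble of fitted dictionary
    models instead of the full-order simulator (illustrative sketch)."""

    def __init__(self, ensemble, reward_fn, x0_sampler, horizon=200):
        self.ensemble = ensemble      # models with predict(x, u) -> next state (assumed interface)
        self.reward_fn = reward_fn    # measured or surrogate (dictionary) reward
        self.x0_sampler = x0_sampler  # draws initial states, e.g. from the offline data
        self.horizon = horizon
        dim_x, dim_u = 2, 1           # assumed dimensions for this toy sketch
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(dim_x,))
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(dim_u,))

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.x, self.steps = self.x0_sampler(), 0
        return self.x.copy(), {}

    def step(self, u):
        # Each ensemble member predicts the next state; the mean advances the
        # rollout, and the spread across members measures how much the
        # surrogate models disagree at this state-action pair.
        preds = np.stack([m.predict(self.x, u) for m in self.ensemble])
        self.x = preds.mean(axis=0)
        self.steps += 1
        reward = float(self.reward_fn(self.x, u))
        terminated, truncated = False, self.steps >= self.horizon
        return self.x.copy(), reward, terminated, truncated, {}
```

A policy can then be trained on such a surrogate environment with any model-free algorithm (e.g., PPO), with occasional full-order rollouts collected to refresh the data the ensemble is fit on.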
The paper evaluates SINDy-RL on benchmark continuous-control environments: mechanical systems from the dm_control and OpenAI Gymnasium suites, and fluid systems from the HydroGym suite. The results show that SINDy-RL can improve the sample efficiency of policy training by orders of magnitude by leveraging surrogate experience in an ensemble SINDy (E-SINDy) model environment. It can also learn a surrogate reward when the reward is not directly measurable from observations, and it can reduce the complexity of a neural network policy by learning a sparse, symbolic surrogate policy with comparable performance, smoother control, and improved consistency. Additionally, SINDy-RL can quantify the uncertainty of its learned models, providing insight into their quality.
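Distilling a trained neural-network policy into a sparse symbolic one amounts to sparse regression of the network's actions onto dictionary features of the state. The sketch below uses a Lasso-penalized polynomial fit as a stand-in for the ensemble sparse regression used in the paper; the function name and regularization choices are illustrative assumptions.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso

def distill_policy(nn_policy, state_samples, degree=2, alpha=1e-3):
    """Fit a sparse polynomial surrogate of a neural-network policy by
    regressing its actions onto a dictionary of state features (sketch)."""
    library = PolynomialFeatures(degree=degree)
    theta = library.fit_transform(state_samples)              # dictionary features Theta(x)
    actions = np.vstack([nn_policy(s) for s in state_samples])
    fit = Lasso(alpha=alpha, fit_intercept=False).fit(theta, actions)
    # The nonzero entries of fit.coef_ form a compact symbolic control law.
    return lambda s: fit.predict(library.transform(s.reshape(1, -1)))[0]
```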
The paper demonstrates that SINDy-RL is highly sample efficient, achieving performance comparable to state-of-the-art DRL algorithms with significantly fewer environment interactions. It is particularly effective in environments where the dynamics are complex and the reward function is difficult to measure directly. SINDy-RL is also efficient in terms of computational resources, as it uses lightweight, sparse dictionary models that are fast to train on limited data and provide an interpretable symbolic representation by construction. The paper also shows that DRL training can be accelerated by leveraging even a single SINDy model, an approach demonstrated on simple DRL benchmarks.