2024 | Yiming Meng, Ruikun Zhou, Amartya Mukherjee, Maxwell Fitzsimmons, Christopher Song, Jun Liu
This paper proposes two algorithms for model-based policy iterations to solve nonlinear optimal control problems with convergence guarantees. The first algorithm, ELM-PI, uses linear least squares and is efficient for low-dimensional problems. The second, PINN-PI, employs physics-informed neural networks and scales better for high-dimensional problems. Both algorithms outperform traditional methods like Galerkin methods. Theoretical analysis shows that both converge to viscosity solutions of the Hamilton-Jacobi-Bellman (HJB) equation. Formal verification techniques are used to ensure the stability of the resulting controllers. Numerical experiments demonstrate that ELM-PI is effective for low-dimensional problems, while PINN-PI excels in high-dimensional scenarios. The algorithms are shown to converge to the true optimal solutions under less restrictive assumptions. The study emphasizes the importance of formal verification for safety-critical applications.
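To make the least-squares idea behind ELM-PI concrete, here is a minimal sketch of policy iteration with a random-feature (extreme learning machine) value function. It is not the authors' implementation. The scalar system dx/dt = a*x + b*u with cost q*x^2 + r*u^2, the initial policy u = -3x, the tanh features, and every function name are assumptions chosen for illustration. With the hidden-layer weights fixed at random values, the policy-evaluation equation V'(x)(a*x + b*u(x)) + q*x^2 + r*u(x)^2 = 0 is linear in the output weights, so each iteration reduces to one linear least-squares solve at collocation points.

```python
import numpy as np

# Hypothetical scalar test problem (an assumption, not taken from the paper):
#   dx/dt = a*x + b*u,   cost = integral of q*x^2 + r*u^2 dt
a, b, q, r = 1.0, 1.0, 1.0, 1.0

rng = np.random.default_rng(0)
m = 50                                  # number of hidden units
W = rng.uniform(-2.0, 2.0, size=m)      # fixed random input weights
B = rng.uniform(-2.0, 2.0, size=m)      # fixed random biases


def features(x):
    """Hidden-layer outputs tanh(W*x + B), shape (len(x), m)."""
    return np.tanh(np.outer(x, W) + B)


def feature_grads(x):
    """Derivative of each hidden-layer output w.r.t. x, shape (len(x), m)."""
    return W * (1.0 - np.tanh(np.outer(x, W) + B) ** 2)


def evaluate_policy(policy, xs):
    """Policy evaluation as linear least squares in the output weights beta.

    Enforces V'(x) * (a*x + b*u(x)) + q*x^2 + r*u(x)^2 = 0 at the
    collocation points xs, plus the condition V(0) = 0.
    """
    u = policy(xs)
    A = feature_grads(xs) * (a * xs + b * u)[:, None]
    rhs = -(q * xs**2 + r * u**2)
    # Extra row for V(0) = 0.
    A = np.vstack([A, features(np.array([0.0]))])
    rhs = np.append(rhs, 0.0)
    beta, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return beta


def improve_policy(beta):
    """Policy improvement: u(x) = -(b / (2r)) * V'(x)."""
    return lambda x: -(b / (2.0 * r)) * (feature_grads(x) @ beta)


def elm_policy_iteration(n_iters=6, n_points=201):
    """Alternate evaluation and improvement from a stabilizing start."""
    xs = np.linspace(-1.0, 1.0, n_points)   # collocation points
    policy = lambda x: -3.0 * x             # assumed stabilizing initial policy
    for _ in range(n_iters):
        beta = evaluate_policy(policy, xs)
        policy = improve_policy(beta)
    return beta, policy


if __name__ == "__main__":
    beta, policy = elm_policy_iteration()
    x = np.array([0.5])
    # For this linear-quadratic case the optimal control is -(1 + sqrt(2)) * x.
    print("learned u(0.5):", policy(x)[0])
    print("Riccati u(0.5):", -(1.0 + np.sqrt(2.0)) * x[0])
```

The linear-quadratic case is used only because its exact answer is known from the scalar Riccati equation, which gives a reference for checking the learned control. For a nonlinear system, the same structure would apply with the dynamics and running cost swapped in, but the achievable accuracy would depend on the feature count and the collocation points.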