This chapter discusses noncontractive total cost problems in dynamic programming and optimal control, focusing on models that lack the contractive structure seen in discounted problems. It covers positive and negative cost models, deterministic optimal control, stochastic shortest path problems, and risk-sensitive models. The chapter provides a unified treatment of these models, emphasizing their unique properties and challenges. It includes new material that expands on previous results, offering a more comprehensive analysis of these noncontractive problems.
The chapter begins with an overview of positive and negative cost models, discussing their characteristics and the implications of unbounded costs. It then presents Bellman's equation and its role in determining optimal cost functions. The chapter also addresses optimality conditions, showing how stationary policies can be characterized and how value iteration (VI) and policy iteration (PI) algorithms can be applied.
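As an illustration of how VI applies Bellman's equation in an undiscounted setting, the following minimal sketch runs VI on a small deterministic shortest path problem with nonnegative costs. The graph, the node names, and the helpers `EDGES`, `bellman_operator`, `value_iteration`, and `greedy_policy` are hypothetical and not taken from the chapter. Starting VI from the zero function is also an assumption made for this example.

```python
# Hypothetical graph: EDGES[node] = list of (next_node, cost) pairs.
# Node "t" is a cost-free terminal state, so it carries cost 0.
EDGES = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0), ("t", 6.0)],
    "c": [("t", 1.0)],
}


def bellman_operator(J):
    """Apply (TJ)(x) = min over successors of [cost + J(next)], with J(t) = 0."""
    return {
        x: min(cost + J.get(nxt, 0.0) for nxt, cost in successors)
        for x, successors in EDGES.items()
    }


def value_iteration(J0, max_iters=100):
    """Iterate J <- TJ until a fixed point is reached or max_iters is hit."""
    J = dict(J0)
    for _ in range(max_iters):
        J_next = bellman_operator(J)
        if J_next == J:  # exact fixed point (costs here are small integers)
            break
        J = J_next
    return J


def greedy_policy(J):
    """Pick, at each node, the successor attaining the minimum in Bellman's equation."""
    return {
        x: min(successors, key=lambda e: e[1] + J.get(e[0], 0.0))[0]
        for x, successors in EDGES.items()
    }


# Start from the zero function (an assumption for this sketch).
J_star = value_iteration({x: 0.0 for x in EDGES})
policy = greedy_policy(J_star)
```

In this toy instance there is no discount factor, yet VI reaches a fixed point of the Bellman operator after a few iterations because every path eventually reaches the terminal node at positive cost. The stationary policy is then read off by choosing the minimizing successor at each node.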
The analysis of these models is based on the fundamental monotonicity property of dynamic programming, which is crucial for deriving results without relying on contraction assumptions. The chapter discusses the uniqueness of solutions to Bellman's equation and the conditions under which VI and PI converge to the optimal cost function. It also explores the computational methods used to solve these problems, including the convergence of VI under different assumptions and the role of compactness in ensuring convergence.
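The monotonicity property and the possible non-uniqueness of solutions can be seen in a one-state toy problem, sketched below under stated assumptions. At the single nonterminal state one may either stop at cost `b` or stay in place at cost 0, so Bellman's equation reads J = min(b, J). The problem and the names `T` and `run_vi` are illustrative and are not taken from the chapter.

```python
def T(J, b=1.0):
    """Bellman operator for the toy problem: stop at cost b, or stay at cost 0."""
    return min(b, 0.0 + J)


def run_vi(J0, iters=50):
    """Apply T repeatedly, starting from the scalar initial guess J0."""
    J = J0
    for _ in range(iters):
        J = T(J)
    return J


# Monotonicity: J <= J' implies T(J) <= T(J'), with no contraction involved.
monotone = all(T(lo) <= T(hi) for lo, hi in [(-3.0, 0.0), (0.0, 0.5), (0.5, 5.0)])

# VI limits depend on the starting point in this toy problem.
limit_from_zero = run_vi(0.0)
limit_from_above = run_vi(5.0)
limit_from_below = run_vi(-3.0)
```

With `b = 1`, every J at or below 1 satisfies J = min(1, J), so the equation has infinitely many solutions. Staying forever costs nothing, so the optimal cost in this toy problem is 0. VI started at 0 remains there, while VI started at 5 settles at 1, which is a fixed point but not the optimal cost. This is the kind of behavior that additional conditions on the initial function or the problem structure are meant to rule out.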
The chapter further examines the behavior of VI and PI in various scenarios, highlighting the challenges posed by noncontractive models and the importance of additional conditions for reliable convergence. It also discusses the implications of these results for different types of problems, including deterministic shortest path problems, inventory control, and continuous-time models. The chapter concludes with a discussion of the broader implications of these findings for the analysis and solution of noncontractive total cost problems in dynamic programming and optimal control.