Constrained Markov Decision Processes

This report presents a unified approach to the study of constrained Markov decision processes (CMDPs) with a countable state space and unbounded costs. It considers a single controller with several objectives: the goal is to design a policy that minimizes one cost while keeping the other costs within inequality constraints. The objectives studied are the expected average cost and the expected total cost, which includes the discounted cost as a special case. Two frameworks are considered: one in which costs are bounded below, and one for contracting problems.
The report characterizes the set of achievable expected occupation measures and the set of achievable performance vectors, which reduces the original dynamic control problem to an infinite linear program (LP). A Lagrangian approach is then used for sensitivity analysis, and this yields asymptotic results for the constrained control problem, such as convergence of both the values and the policies in the time horizon and in the discount factor. Several state truncation algorithms are also presented; they approximate the solution of the original control problem using finite LPs. The report covers the different cost criteria, the solution approaches, and the convex analytic approach based on occupation measures, and it gives detailed derivations and analyses for the linear programming and Lagrangian approaches.
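The reduction to an LP rests on the fact that each expected cost is linear in the occupation measure. The sketch below is not taken from the report: it illustrates the idea on a small finite discounted model, using the unnormalized convention f(s, a) = sum over t of gamma^t P(X_t = s, A_t = a). The two-state model, the policy, the cost table, and all function names are hypothetical choices made for illustration.

```python
def occupation_measure(P, policy, beta, gamma, iters=2000):
    """Discounted state-action occupation measure of a stationary policy.

    P[s][a][s2] is a transition probability, policy[s][a] an action
    probability, beta the initial distribution, gamma the discount factor.
    Iterates mu <- beta + gamma * mu * P_policy until (numerically) fixed.
    """
    n = len(beta)
    mu = list(beta)
    for _ in range(iters):
        nxt = list(beta)
        for s in range(n):
            for a, pa in enumerate(policy[s]):
                for s2 in range(n):
                    nxt[s2] += gamma * mu[s] * pa * P[s][a][s2]
        mu = nxt
    # Spread the state mass over actions according to the policy.
    return [[mu[s] * pa for pa in policy[s]] for s in range(n)]


def flow_residual(f, P, beta, gamma):
    """Largest violation of the linear constraints that define the LP:
    sum_a f(s2, a) = beta(s2) + gamma * sum_{s, a} f(s, a) P(s2 | s, a)."""
    n = len(beta)
    worst = 0.0
    for s2 in range(n):
        inflow = beta[s2]
        for s in range(n):
            for a in range(len(f[s])):
                inflow += gamma * f[s][a] * P[s][a][s2]
        worst = max(worst, abs(sum(f[s2]) - inflow))
    return worst


def expected_cost(f, c):
    """Expected total discounted cost, linear in the occupation measure."""
    return sum(f[s][a] * c[s][a] for s in range(len(f)) for a in range(len(f[s])))


# Hypothetical two-state, two-action model.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.7, 0.3], [0.1, 0.9]]]
policy = [[0.5, 0.5], [1.0, 0.0]]
beta = [1.0, 0.0]
gamma = 0.9
cost = [[1.0, 3.0], [0.0, 2.0]]

f = occupation_measure(P, policy, beta, gamma)
print("flow residual:", flow_residual(f, P, beta, gamma))
print("expected discounted cost:", expected_cost(f, cost))
```

In a constrained problem, the objective cost and each constraint cost are evaluated with the same `expected_cost` form on different cost tables, so minimizing one subject to bounds on the others becomes a linear program in `f` under the flow constraints checked by `flow_residual`. The countable-state setting of the report needs the additional conditions it describes; this finite example does not cover them.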

1995 | Eitan Altman