Optimal dynamic treatment regimes

This paper introduces a method for estimating optimal dynamic treatment regimes: decision rules that determine how the treatment level should vary over time as an individual's status changes. The goal is to use experimental or observational data to estimate the decision rules that yield the maximal mean response. The method makes smooth parametric assumptions only on quantities directly relevant to estimating the optimal rules. The paper uses the potential outcomes model to specify the objective and to state the assumptions, and it discusses how dynamic programming can be used to find the decision rules that maximize the mean response. It also introduces regret functions, which measure the difference between the optimal benefit-to-go and the current benefit-to-go. The regret functions are parameterized and estimated, and the optimal decision rules are derived from the estimates. The paper provides a least-squares method for estimating the regret functions and discusses the advantages of modeling the regrets. It concludes with a simulation study that illustrates the proposed method.
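To make the link between dynamic programming and regrets concrete, the following is a minimal sketch in Python. It is not the estimation procedure from the paper: it assumes the mean response and the status transition probabilities are known, whereas the paper estimates parameterized regrets from data. The two-stage setup, the binary status and treatment, the function `mean_response`, and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import product

STATES = (0, 1)   # hypothetical binary status
ACTIONS = (0, 1)  # hypothetical binary treatment level

# Assumed P(S2 = 1 | S1 = s1, A1 = a1); invented numbers.
P_S2_IS_1 = {(0, 0): 0.3, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.4}


def mean_response(s1, a1, s2, a2):
    """Assumed mean response given the full history; invented for illustration."""
    return (1.0 + 0.5 * s1 + 0.4 * a1 * (1 - 2 * s1)
            + 0.3 * s2 + 0.6 * a2 * (2 * s2 - 1))


def backward_induction():
    """Find the optimal rules by dynamic programming and record the regrets.

    The regret of a treatment is the optimal benefit-to-go minus the
    benefit-to-go of that treatment (followed by optimal later treatment).
    """
    rule2, value2, regret2 = {}, {}, {}
    # Last decision point: maximize the mean response given the history.
    for s1, a1, s2 in product(STATES, ACTIONS, STATES):
        q = {a2: mean_response(s1, a1, s2, a2) for a2 in ACTIONS}
        best = max(q, key=q.get)
        rule2[(s1, a1, s2)] = best
        value2[(s1, a1, s2)] = q[best]
        for a2 in ACTIONS:
            regret2[(s1, a1, s2, a2)] = q[best] - q[a2]

    rule1, regret1 = {}, {}
    # First decision point: average the optimal stage-2 value over S2.
    for s1 in STATES:
        q = {}
        for a1 in ACTIONS:
            p1 = P_S2_IS_1[(s1, a1)]
            q[a1] = (1 - p1) * value2[(s1, a1, 0)] + p1 * value2[(s1, a1, 1)]
        best = max(q, key=q.get)
        rule1[s1] = best
        for a1 in ACTIONS:
            regret1[(s1, a1)] = q[best] - q[a1]
    return rule1, rule2, regret1, regret2


rule1, rule2, regret1, regret2 = backward_induction()
print("stage-1 rule:", rule1)
print("stage-1 regrets:", regret1)
```

In this toy problem every regret is nonnegative and equals zero exactly at the optimal treatment, so the optimal rule at each decision point can be read off as the treatment that sets the regret to zero. This is the property that makes the regrets a natural target for parameterization.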

2003 | S. A. Murphy