This paper presents a character control framework that uses motion diffusion probabilistic models to generate high-quality, diverse character animations in real time. At its core is a transformer-based Conditional Autoregressive Motion Diffusion Model (CAMDM), which takes historical motion data and user control signals as input and produces a range of plausible future motions. To address the challenges of diversity, controllability, and computational efficiency, the authors introduce several key algorithmic designs: separate condition tokenization, classifier-free guidance on past motion, and heuristic future trajectory extension. The method generates character animations in multiple styles in real time with a single unified model, and it outperforms existing character controllers in quality, diversity, and control alignment. The approach is evaluated on a diverse set of locomotion skills from a publicly available mocap dataset, where it produces high-quality, diverse animations that adhere to user inputs. Ablation studies validate the contribution of each algorithmic design.
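To make two of the named designs concrete, the sketch below illustrates one plausible way to organize the denoiser: past motion and user control are embedded as separate condition tokens (rather than fused into one vector), and classifier-free guidance is applied specifically to the past-motion condition at sampling time. This is a minimal PyTorch sketch under assumed shapes and names (CAMDMDenoiser, cfg_denoise, drop_past, etc. are hypothetical), not the authors' implementation.

```python
import torch
import torch.nn as nn

class CAMDMDenoiser(nn.Module):
    """Illustrative transformer denoiser: the diffusion step, the control
    signal, and each past-motion frame become separate condition tokens
    prepended to the noisy future-motion tokens (shapes are assumptions)."""

    def __init__(self, pose_dim=64, ctrl_dim=8, d_model=256, n_layers=4):
        super().__init__()
        self.pose_in = nn.Linear(pose_dim, d_model)   # per-frame pose embedding
        self.ctrl_in = nn.Linear(ctrl_dim, d_model)   # user-control token
        self.t_in = nn.Linear(1, d_model)             # diffusion-step token
        enc = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(enc, n_layers)
        self.pose_out = nn.Linear(d_model, pose_dim)

    def forward(self, x_noisy, t, past, ctrl, drop_past=False):
        # x_noisy: (B, F, pose_dim) noisy future frames to denoise
        # past:    (B, P, pose_dim) motion history; ctrl: (B, ctrl_dim)
        past_tok = self.pose_in(past)
        if drop_past:                                 # null condition for CFG
            past_tok = torch.zeros_like(past_tok)
        cond = torch.cat([self.t_in(t[:, None, None].float()),
                          self.ctrl_in(ctrl)[:, None],
                          past_tok], dim=1)
        h = self.backbone(torch.cat([cond, self.pose_in(x_noisy)], dim=1))
        return self.pose_out(h[:, cond.shape[1]:])    # keep future-frame outputs


def cfg_denoise(model, x_noisy, t, past, ctrl, w=2.0):
    """Classifier-free guidance on the *past-motion* condition: blend
    predictions made with and without the history tokens."""
    cond = model(x_noisy, t, past, ctrl, drop_past=False)
    uncond = model(x_noisy, t, past, ctrl, drop_past=True)
    return uncond + w * (cond - uncond)
```

Keeping history and control as separate tokens is what lets the guidance weight w act on the past-motion condition alone: a larger w favors continuity with the history, while a smaller w lets the control signal dominate, which is one plausible reading of how the abstract's design balances responsiveness against temporal coherence.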