This survey provides a comprehensive overview of controllable generation with text-to-image diffusion models, covering both theoretical foundations and practical applications. Diffusion models, which generate images by learning to reverse a gradual noising process, have significantly outperformed earlier frameworks such as GANs in image synthesis. However, text alone is an insufficient condition for many applications, prompting research into integrating novel conditions beyond text to meet diverse user needs. The survey begins with an introduction to denoising diffusion probabilistic models (DDPMs) and widely used text-to-image diffusion models. It then examines the mechanisms by which novel conditions are introduced into the denoising process, emphasizing two principal ones: conditional score prediction and condition-guided score estimation. Building on this foundation, it organizes controllable generation into three sub-tasks: generation with specific conditions, generation with multiple conditions, and universal controllable generation. It also reviews the applications of these methods across a range of generative tasks, demonstrating their emergence as a fundamental capability in the AIGC era.
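For concreteness, the two mechanisms can be written in standard score notation (a sketch using common DDPM conventions rather than the survey's own notation: $\epsilon_\theta$ denotes the denoising network, $c$ a novel condition, $s$ a guidance scale, and $\bar{\alpha}_t$ the cumulative noise schedule). Conditional score prediction injects $c$ directly into the network, $\epsilon_\theta(x_t, t, c)$, so the predicted noise itself depends on the condition. Condition-guided score estimation instead corrects an existing model's prediction at sampling time via the classifier-guidance decomposition

$$\nabla_{x_t}\log p(x_t \mid c) = \nabla_{x_t}\log p(x_t) + \nabla_{x_t}\log p(c \mid x_t),$$

which in noise-prediction form yields

$$\hat{\epsilon}_t = \epsilon_\theta(x_t, t) - s\,\sqrt{1-\bar{\alpha}_t}\;\nabla_{x_t}\log p(c \mid x_t).$$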
The survey concludes with a detailed analysis of model-based, tuning-based, and training-free approaches and their effectiveness in achieving controllable generation.
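As a concrete illustration of the training-free, condition-guided family, the following minimal sketch adds a guidance gradient to a toy DDPM sampling loop. Everything here (the toy noise model, the condition_log_prob reward, the schedule values) is a hypothetical stand-in for illustration, not code from any surveyed method.

```python
import torch

# Toy condition-guided ("training-free") sampling: at each denoising step,
# the noise prediction is corrected with the gradient of a condition score,
# following the classifier-guidance formula sketched above.

T = 50                                   # number of diffusion steps (toy)
betas = torch.linspace(1e-4, 0.02, T)    # linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def toy_eps_model(x_t, t):
    # Stand-in for a trained noise-prediction network eps_theta(x_t, t).
    # Here: the (known) optimal predictor for data concentrated at 0.
    return x_t / torch.sqrt(1.0 - alpha_bars[t])

def condition_log_prob(x0_hat, target):
    # Stand-in for log p(c | x): rewards samples close to a target value.
    return -((x0_hat - target) ** 2).sum()

def guided_step(x_t, t, target, guidance_scale=5.0):
    x_t = x_t.detach().requires_grad_(True)
    eps = toy_eps_model(x_t, t)
    # Predict x_0 from x_t and eps (standard DDPM identity).
    x0_hat = (x_t - torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alpha_bars[t])
    # Gradient of the condition score w.r.t. the current noisy sample
    # (taken through the predicted x_0, a common training-free variant).
    grad = torch.autograd.grad(condition_log_prob(x0_hat, target), x_t)[0]
    # Classifier-guidance-style correction of the predicted noise.
    eps_guided = eps - torch.sqrt(1.0 - alpha_bars[t]) * guidance_scale * grad
    # DDPM ancestral update using the guided noise estimate.
    mean = (x_t - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps_guided) / torch.sqrt(alphas[t])
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return (mean + torch.sqrt(betas[t]) * noise).detach()

x = torch.randn(4)                        # start from pure noise
for t in reversed(range(T)):
    x = guided_step(x, t, target=torch.tensor(1.5))
print(x)                                  # samples pulled toward the target
```

Note the design choice this family shares: the base model is never retrained; control comes entirely from the sampling-time gradient term, which is what distinguishes it from the model-based and tuning-based approaches analyzed in the survey.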