This survey provides an overview of vision-language-action (VLA) models in embodied AI, highlighting their role in enabling robots to understand and execute complex tasks by integrating the vision, language, and action modalities. VLAs have emerged as a critical component of embodied AI, offering greater versatility, dexterity, and generalizability than traditional reinforcement learning approaches. The survey traces the evolution of unimodal models, the development of vision-language models, and the integration of both into VLA frameworks. It organizes VLAs around three main components: pretraining, control policy, and task planner, each contributing to the overall effectiveness of robotic systems. It then reviews pretraining methods, including contrastive learning, masked autoencoding, and world-model learning, which enhance a model's ability to understand and predict environmental dynamics. It also examines low-level control policies of three kinds, non-Transformer, Transformer-based, and large language model (LLM)-based, each with distinct strengths and applications. Finally, the survey addresses open challenges in VLA development, such as data scarcity, robot dexterity, and generalization across tasks and environments, and concludes with future research directions aimed at improving the capabilities of embodied AI systems.