GenArtist is a unified image generation and editing system designed to address two limitations of existing models: they struggle with complex tasks, and their outputs are not reliable. The system is coordinated by a multimodal large language model (MLLM) agent that decomposes a complex problem into simpler sub-problems and arranges them in a planning tree for systematic execution. Because each step in the tree can be verified before the next one runs, the agent can detect a failed step and correct it, which improves the accuracy of the generated images. GenArtist integrates a comprehensive library of tools for both generation and editing, and it can automatically generate missing position-related inputs, which makes tool selection and use more effective. Experiments demonstrate that GenArtist outperforms state-of-the-art models such as SDXL and DALL-E 3, with significant improvements on tasks including text-to-image generation and complex image editing. Its ability to handle diverse user requirements within a single framework for generation and editing makes it a valuable contribution to the field.
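
The planning-tree idea described above (decompose, execute step by step, verify, and fall back on failure) can be illustrated with a minimal sketch. This is not GenArtist's implementation: the `PlanNode` class, the tool names, and the `run_tool` and `verify` functions are all hypothetical stand-ins, and an "image" is modeled as a tuple recording the operations applied so far. In the real system the MLLM agent would play the roles of planner and verifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Stand-in for an image: the sequence of operations applied so far.
Image = Tuple[str, ...]


@dataclass
class PlanNode:
    """One sub-problem in a (hypothetical) planning tree."""
    task: str                  # sub-problem, described in text
    tools: List[str]           # candidate tool names, in preference order
    children: List["PlanNode"] = field(default_factory=list)  # follow-up steps


def execute(
    node: PlanNode,
    image: Image,
    run_tool: Callable[[str, str, Image], Image],
    verify: Callable[[str, Image], bool],
) -> Optional[Image]:
    """Try each candidate tool for this node and keep the first verified result.

    After a tool's output passes verification, the children run in order on
    that output. If verification fails, or a child cannot be completed, the
    next candidate tool is tried. Returns None when every candidate fails.
    """
    for tool in node.tools:
        candidate = run_tool(tool, node.task, image)
        if not verify(node.task, candidate):
            continue  # self-correction: fall back to the next tool
        result: Optional[Image] = candidate
        for child in node.children:
            result = execute(child, result, run_tool, verify)
            if result is None:
                break  # a follow-up step failed; try another tool here
        if result is not None:
            return result
    return None


# Toy tool runner: record which tool handled which task.
def run_tool(tool: str, task: str, image: Image) -> Image:
    return image + (f"{tool}:{task}",)


# Toy verifier: pretend the first generator always gets the layout wrong.
def verify(task: str, image: Image) -> bool:
    return not image[-1].startswith("fast_generator")


plan = PlanNode(
    task="draw a red cube left of a blue ball",
    tools=["fast_generator", "layout_generator"],
    children=[PlanNode(task="recolor the ball blue", tools=["color_editor"])],
)

final = execute(plan, (), run_tool, verify)
print(final)
```

In this toy run the first generation tool is rejected by the verifier, so the second candidate is used before the editing step is applied. This mirrors, in a very simplified form, the verify-then-correct behavior the summary attributes to the planning tree.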