SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

7 May 2024 | Yuying Ge, Sijie Zhao, Chen Li, Yixiao Ge, Ying Shan
SEED-Data-Edit is a hybrid dataset for instruction-guided image editing that combines three types of data: edits generated by an automated pipeline, real-world scenario data, and human-annotated multi-turn editing data. It aims to facilitate image manipulation through open-form language. The dataset comprises 3.7 million image editing pairs and 21,000 multi-turn editing sequences with up to five rounds per sequence, the multi-turn sequences being annotated by Photoshop experts.

The dataset is used to fine-tune the pre-trained Multimodal Large Language Model (MLLM) SEED-X, yielding the instruction-tuned model SEED-X-Edit. SEED-X-Edit achieves promising results in language-guided image editing, demonstrating the potential of SEED-Data-Edit to advance the field. By combining high-quality automated edits, real-world data, and human-annotated multi-turn sequences, SEED-Data-Edit addresses the challenge of training models for instruction-guided image editing. Both the dataset and the model are publicly released, with the dataset hosted on Hugging Face.
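To make the multi-turn structure concrete, the sketch below shows what one editing sequence of up to five chained rounds might look like, where each round's target image becomes the next round's source. This is an illustrative assumption, not the dataset's actual schema; all field names and file paths are hypothetical.

```python
# Hypothetical sketch of a multi-turn editing sequence; field names and
# paths are illustrative assumptions, not SEED-Data-Edit's actual schema.

MAX_ROUNDS = 5  # the report describes up to five editing rounds per sequence


def validate_sequence(sequence: dict) -> bool:
    """Check that a sequence has 1..MAX_ROUNDS rounds, that each round pairs
    an instruction with source/target images, and that rounds chain: each
    round's source is the previous round's target."""
    rounds = sequence.get("rounds", [])
    if not 1 <= len(rounds) <= MAX_ROUNDS:
        return False
    for i, r in enumerate(rounds):
        if not {"instruction", "source_image", "target_image"} <= set(r):
            return False
        if i > 0 and r["source_image"] != rounds[i - 1]["target_image"]:
            return False
    return True


example = {
    "sequence_id": "seq_000001",  # hypothetical identifier
    "rounds": [
        {
            "instruction": "Remove the person on the left.",
            "source_image": "seq_000001/round_0_src.jpg",
            "target_image": "seq_000001/round_0_tgt.jpg",
        },
        {
            "instruction": "Make the sky more vivid.",
            # chains from the previous round's output
            "source_image": "seq_000001/round_0_tgt.jpg",
            "target_image": "seq_000001/round_1_tgt.jpg",
        },
    ],
}

print(validate_sequence(example))  # → True
```

A validator like this is a natural preprocessing step when flattening multi-turn sequences into single-turn training pairs, since the chaining property is what distinguishes them from independent edits.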