HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

15 Apr 2024 | Mude Hui†, Siwei Yang†, Bingchen Zhao, Yichun Shi, Heng Wang, Peng Wang, Yuyin Zhou, Cihang Xie
This study introduces HQ-Edit, a high-quality dataset for instruction-based image editing containing around 200,000 edits. Unlike previous approaches that rely on attribute guidance or human feedback, HQ-Edit uses advanced foundation models, GPT-4V and DALL-E 3, to build a scalable data collection pipeline. The dataset is curated in three steps: Expansion, Generation, and Post-processing. First, diverse seed triplets are collected online and expanded using GPT-4; the expanded triplets are used to generate detailed diptychs, each containing an input image and an output image. These diptychs are then refined using GPT-4V to improve alignment and coherence.

Two new evaluation metrics, Alignment and Coherence, are introduced to quantitatively assess the quality of image edit pairs. HQ-Edit's high-resolution images and comprehensive editing prompts significantly enhance existing image editing models: InstructPix2Pix fine-tuned on HQ-Edit achieves state-of-the-art performance. Extensive experiments and human evaluations show that HQ-Edit outperforms other datasets in alignment and coherence.
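To make the three-step flow (Expansion, Generation, Post-processing) concrete, here is a minimal sketch of how such a pipeline could be wired together. It is not the authors' implementation. Every name in it (`Triplet`, `EditSample`, `expand`, `split_diptych`, `build_dataset`, and the `toy_*` stand-ins) is hypothetical, the model calls are replaced by plain Python callables so the example runs offline, and it assumes a diptych is a single side-by-side image whose left half is the input and whose right half is the output.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Toy image type (assumption): rows of grayscale pixel values.
Image = List[List[int]]


@dataclass
class Triplet:
    """Hypothetical seed record: input description, edit instruction, output description."""
    input_desc: str
    instruction: str
    output_desc: str


@dataclass
class EditSample:
    """Hypothetical final record: an image pair plus its edit instruction."""
    input_image: Image
    instruction: str
    output_image: Image


def expand(seeds: List[Triplet],
           expander: Callable[[Triplet, int], List[Triplet]],
           per_seed: int) -> List[Triplet]:
    """Expansion: keep the seeds and add `per_seed` variants of each one.
    `expander` stands in for a text model such as GPT-4."""
    expanded = list(seeds)
    for seed in seeds:
        expanded.extend(expander(seed, per_seed))
    return expanded


def split_diptych(diptych: Image) -> Tuple[Image, Image]:
    """Post-processing helper: cut a side-by-side diptych into equal halves.
    Assumes left = input, right = output; an odd middle column is dropped."""
    half = len(diptych[0]) // 2
    left = [row[:half] for row in diptych]
    right = [row[half:2 * half] for row in diptych]
    return left, right


def build_dataset(seeds: List[Triplet],
                  expander: Callable[[Triplet, int], List[Triplet]],
                  renderer: Callable[[Triplet], Image],
                  refiner: Callable[[Image, Image, str], str],
                  per_seed: int = 2) -> List[EditSample]:
    """Run Expansion -> Generation -> Post-processing with injected model stubs."""
    samples = []
    for triplet in expand(seeds, expander, per_seed):
        diptych = renderer(triplet)                  # Generation (image-model stand-in)
        inp, out = split_diptych(diptych)            # Post-processing: split the pair
        instruction = refiner(inp, out, triplet.instruction)  # refinement stand-in
        samples.append(EditSample(inp, instruction, out))
    return samples


# Offline stand-ins for the real models (purely illustrative).
def toy_expander(seed: Triplet, n: int) -> List[Triplet]:
    return [Triplet(seed.input_desc, f"{seed.instruction} (variant {i})", seed.output_desc)
            for i in range(n)]


def toy_renderer(triplet: Triplet) -> Image:
    return [[0, 0, 1, 1], [0, 0, 1, 1]]  # left half zeros, right half ones


def toy_refiner(inp: Image, out: Image, instruction: str) -> str:
    return instruction.strip()


seeds = [Triplet("a cat on a sofa", " make the cat orange ", "an orange cat on a sofa")]
dataset = build_dataset(seeds, toy_expander, toy_renderer, toy_refiner)
print(len(dataset), dataset[0].instruction)
```

Passing the models in as callables keeps the stage boundaries visible and lets each stage be swapped or tested in isolation. A real pipeline would replace the `toy_*` functions with actual model calls and add the filtering that the summary attributes to post-processing.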