7 Jul 2024 | Marcos V. Conde, Gregor Geigle, Radu Timofte
InstructIR: High-Quality Image Restoration Following Human Instructions
**Authors:** Marcos V. Conde, Gregor Geigle, Radu Timofte
**Affiliation:** Computer Vision Lab, CAIDAS & IFI, University of Würzburg; Sony PlayStation, FTG
**GitHub:** https://github.com/mv-lab/InstructIR
**Abstract:**
Image restoration is a fundamental problem that aims to recover high-quality images from degraded observations. In this work, we present InstructIR, the first approach that uses human-written instructions to guide image restoration models. Given natural language prompts, InstructIR can recover high-quality images from their degraded counterparts, considering multiple degradation types. Our method achieves state-of-the-art results on several restoration tasks, including image denoising, deraining, deblurring, dehazing, and low-light image enhancement. InstructIR improves over previous all-in-one restoration methods by +1 dB. Our dataset and results represent a novel benchmark for text-guided image restoration and enhancement.
**Contributions:**
- First approach to use human-written instructions to guide image restoration.
- Achieves state-of-the-art performance on various image restoration tasks.
- Generalizes to restoring images using arbitrary human-written instructions.
- A single all-in-one model covers more restoration tasks than many previous works.
**Related Work:**
- Image restoration methods focus on specific degradations or use general neural networks for diverse tasks.
- All-in-one image restoration models tackle multiple degradation types and levels using a single deep blind restoration model.
- Text-guided image manipulation methods use text prompts to guide image generation and editing.
**Method:**
- InstructIR consists of an image model (NAFNet) and a text encoder.
- The text encoder maps user prompts to fixed-size vectors, enabling task-specific transformations within the model.
- Task routing techniques are used to condition the features based on the text embedding.
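The conditioning step above can be sketched as follows. This is a minimal NumPy illustration of the idea, not the paper's implementation: the text encoder's fixed-size prompt embedding is projected to per-channel gates that modulate one block's feature maps. The dimensions, the sigmoid gating, and the function names are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): text-embedding size and
# the number of feature channels in one restoration block.
EMB_DIM, CHANNELS = 384, 64

# Learned projection from the sentence embedding to per-channel weights
# (randomly initialised here purely for illustration).
W = rng.standard_normal((EMB_DIM, CHANNELS)) * 0.01

def condition_features(feats, text_emb, W):
    """Scale feature maps channel-wise by gates derived from the text
    embedding -- a minimal stand-in for text-based task routing."""
    gate = 1.0 / (1.0 + np.exp(-(text_emb @ W)))  # sigmoid gate in (0, 1)
    return feats * gate[:, None, None]            # broadcast over H, W

feats = rng.standard_normal((CHANNELS, 32, 32))   # one block's feature maps
text_emb = rng.standard_normal(EMB_DIM)           # fixed-size prompt vector
out = condition_features(feats, text_emb, W)
print(out.shape)  # (64, 32, 32)
```

Because the gates depend only on the prompt embedding, different instructions ("remove the noise", "sharpen this photo") select different channel mixtures, which is how a single model can route between degradation types.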
**Experimental Results:**
- Extensive qualitative and quantitative results demonstrate the effectiveness of InstructIR on various image restoration tasks.
- InstructIR outperforms previous all-in-one methods on multiple tasks and achieves results competitive with specialized single-task models.
- Ablation studies show the importance of text guidance and task routing.
**Conclusion:**
InstructIR is a powerful all-in-one model that uses human-written instructions to guide image restoration. It achieves state-of-the-art results and represents a novel benchmark for text-guided image restoration.