23 May 2024 | Rui Xie, Ying Tai, Chen Zhao, Kai Zhang, Zhenyu Zhang, Jun Zhou, Xiaoqian Ye, Qian Wang, Jian Yang
AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation
**Authors:** Rui Xie, Ying Tai, Chen Zhao, Kai Zhang, Zhenyu Zhang, Jun Zhou, Xiaoqian Ye, Qian Wang, Jian Yang, China Mobile Research Institute, Nanjing University of Science and Technology
**Abstract:**
Blind super-resolution (BSR) methods based on Stable Diffusion have shown significant potential in reconstructing high-resolution (HR) images from low-resolution (LR) inputs. Their practical application, however, is hindered by an inefficient sampling process that typically requires hundreds or thousands of steps. To address this, AddSR combines ideas from adversarial diffusion distillation (ADD) and ControlNet to improve efficiency while maintaining high-quality restoration. Specifically, AddSR introduces a prediction-based self-refinement strategy that injects high-frequency information into the student model's output at minimal additional time cost. It also refines the training process by using HR images instead of LR images to regulate the teacher model, providing stronger constraints for distillation. Finally, AddSR employs a timestep-adaptive ADD to address the perception-distortion imbalance introduced by the original ADD. Extensive experiments demonstrate that AddSR delivers better restoration results than previous state-of-the-art models while running up to 7× faster than SeeSR.
AddSR is a novel method for blind super-resolution based on adversarial diffusion distillation (ADD) that simultaneously improves restoration quality and inference speed. It consists of a teacher model and a student model, and incorporates ControlNet and CLIP to receive image and text information for multi-modal restoration. Its key contributions are:
- AddSR, a novel method that restores images with high perceptual quality within 0.8 seconds, 7× faster than SeeSR while providing better restoration performance.
- Prediction-based self-refinement (PSR), which provides high-quality controlling signals for regulating the output.
- Timestep-adaptive ADD (TA-ADD), which achieves a better perception-distortion trade-off.
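The PSR idea can be sketched as follows. This is a minimal toy illustration under assumed interfaces (a `student` noise-prediction network, a DDPM `alpha_bars` schedule, and plain arrays standing in for latents), not the paper's implementation: the first step is conditioned on the LR input, while every later step is conditioned on the model's own predicted clean image, which carries more high-frequency detail than the LR image.

```python
import numpy as np

def predict_x0(x_t, eps_pred, alpha_bar_t):
    """Standard DDPM identity: recover the clean-image estimate from
    the noisy latent and the predicted noise."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def psr_sampling(x_T, lr_image, student, alpha_bars):
    """Prediction-based self-refinement loop (toy sketch).

    The first step uses the LR image as the control signal; every
    later step reuses the previous step's predicted clean image
    x0_hat as the control signal instead.
    """
    control = lr_image
    x_t = x_T
    for t in reversed(range(len(alpha_bars))):
        eps_pred = student(x_t, control, t)           # noise prediction
        x0_hat = predict_x0(x_t, eps_pred, alpha_bars[t])
        control = x0_hat                              # self-refinement
        if t > 0:                                     # re-noise to step t-1
            noise = np.random.randn(*x0_hat.shape)
            x_t = (np.sqrt(alpha_bars[t - 1]) * x0_hat
                   + np.sqrt(1.0 - alpha_bars[t - 1]) * noise)
    return x0_hat
```

Because the refined prediction replaces the LR control signal after the first step, the extra cost over plain conditional sampling is negligible, which matches the paper's claim of minimal additional time cost.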
**Related Work:**
- **GAN-based BSR:** Methods like BSRGAN, Real-ESRGAN, and KDSRGAN have shown significant improvements in BSR tasks but struggle with complex natural images.
- **Diffusion-based BSR:** Methods leveraging Stable Diffusion (SD) prior have excelled in performance but suffer from high computational costs.
- **Efficient Diffusion Models:** Approaches such as adversarial diffusion distillation reduce inference to a few steps, but often compromise restoration quality.
**Methodology:**
- **Network Components:** AddSR includes a student model, a pre-trained teacher model, and a discriminator. The student model is initialized from the teacher model and incorporates ControlNet and CLIP for multi-modal restoration.
- **Training Procedure:** AddSR uses the prediction-based self-refinement strategy to enhance image restoration and the timestep-adaptive ADD loss to address the perception-distortion imbalance introduced by the original ADD.
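The timestep-adaptive weighting can be sketched as a schedule that shifts emphasis between the adversarial term (perception) and the distillation term (fidelity) depending on the sampled timestep. The linear ramp below, its direction, and the function name `ta_add_loss` are assumptions for illustration only; the paper's exact formulation may differ.

```python
def ta_add_loss(adv_loss, distill_loss, t, t_max, lam=1.0):
    """Timestep-adaptive ADD loss (illustrative sketch).

    At noisy steps (large t) the distillation term dominates to keep
    structural fidelity; at late steps (small t) the adversarial term
    dominates to sharpen perceptual quality.  The linear ramp is an
    assumed schedule, not the paper's exact formula.
    """
    w = t / t_max  # in [0, 1]; large at noisy steps
    return (1.0 - w) * adv_loss + lam * w * distill_loss
```

For example, with `lam=1.0` the loss interpolates linearly between the pure adversarial loss at `t = 0` and the pure distillation loss at `t = t_max`, so neither term dominates across the whole trajectory, which is the imbalance TA-ADD is meant to fix.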