Restormer: Efficient Transformer for High-Resolution Image Restoration

2022-03-11 | Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang
Restormer (Restoration Transformer) is an efficient Transformer model designed for high-resolution image restoration. It addresses the computational inefficiency of standard Transformers by redesigning the two core building blocks: multi-head attention and the feed-forward network. A multi-Dconv head transposed attention (MDTA) block computes self-attention across channels rather than spatial positions, capturing long-range pixel interactions with complexity that is linear in image resolution, while a gated-Dconv feed-forward network (GDFN) performs controlled feature transformation, suppressing less informative features and passing only the useful ones forward.

Restormer is trained with a progressive learning strategy: training starts on small image patches and moves to progressively larger patches in later epochs, enabling the network to learn context from large images and improving performance at test time. Comprehensive experiments on 16 benchmark datasets show that Restormer achieves state-of-the-art results across multiple restoration tasks, including image deraining, single-image motion deblurring, defocus deblurring, and image denoising, with significant PSNR and SSIM gains over existing methods while using fewer parameters and operations than other Transformer-based models. For motion deblurring, the model is trained only on the GoPro dataset yet generalizes strongly to other datasets.
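The key efficiency idea above can be illustrated with a toy sketch: MDTA forms a C×C attention map over channels (so cost scales linearly with the number of pixels H·W, not quadratically), and GDFN gates one branch of the feed-forward network with a GELU-activated parallel branch. This is a minimal NumPy illustration of those two operations only; the depthwise convolutions, learned projections, multi-head split, and learnable temperature of the actual model are omitted, and all function names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transposed_attention(q, k, v):
    """Channel-wise ("transposed") attention, the core of MDTA.

    q, k, v: arrays of shape (C, H*W). The attention map is C x C,
    so the cost grows linearly with spatial size H*W instead of
    quadratically as in standard spatial self-attention.
    """
    # L2-normalize each channel along the spatial axis (as in the paper)
    qn = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    kn = k / (np.linalg.norm(k, axis=1, keepdims=True) + 1e-8)
    attn = softmax(qn @ kn.T, axis=-1)   # (C, C) channel-attention map
    return attn @ v                      # (C, H*W) reweighted features

def gdfn_gate(x1, x2):
    """Gating in GDFN: elementwise product of a GELU-activated branch
    with a parallel linear branch (tanh approximation of GELU)."""
    gelu = 0.5 * x1 * (1 + np.tanh(np.sqrt(2 / np.pi) * (x1 + 0.044715 * x1**3)))
    return gelu * x2

# Toy example: 8 channels over a 16x16 feature map
C, H, W = 8, 16, 16
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((C, H * W)) for _ in range(3))
out = transposed_attention(q, k, v)
print(out.shape)  # (8, 256): same shape as the input features
```

The design choice to attend over channels is what makes the block usable on megapixel inputs: doubling H and W quadruples the cost of this sketch, whereas spatial self-attention would increase it sixteen-fold.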