CAMixerSR: Only Details Need More "Attention"

15 Mar 2024 | Yan Wang, Yi Liu, Shijie Zhao, Junlin Li, Li Zhang
This paper proposes a content-aware mixer (CAMixer) to address the limitations of existing methods in large-image (2K-8K) super-resolution (SR). Current methods either accelerate networks through content-aware routing or design better SR networks via token-mixer refinement, but they suffer from inflexible routing and non-discriminative processing, which limit their quality-complexity trade-offs.

To overcome these issues, CAMixer integrates convolution and self-attention, dynamically assigning simple areas to convolution and complex areas to self-attention. A predictor generates offsets, masks, and spatial/channel attentions, which modulate the attention to include more useful textures and improve the representation of the convolution. A global classification loss is introduced to improve the predictor's accuracy.

CAMixer consists of three components: a predictor module, an attention branch, and a convolution branch. The predictor produces offsets, masks, and attentions that guide the two branches: the attention branch processes complex areas with self-attention, while the convolution branch handles simple areas. The predictor is trained to generate accurate masks for content-aware routing. By stacking CAMixers, CAMixerSR is built on SwinIR-light for efficient SR.

CAMixerSR achieves state-of-the-art quality-computation trade-offs on three challenging tasks: lightweight SR, large-input SR, and omnidirectional-image SR. Experiments show that it outperforms existing methods in PSNR and SSIM, with significant gains in lightweight SR and large-image SR; it also achieves better restoration quality than competing methods on omnidirectional images, all at reduced computational cost.
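The content-aware routing above can be sketched in a few lines; the following is a minimal pure-Python illustration, not the authors' implementation. A simple variance score stands in for the paper's learned predictor, and the window size, threshold, and both branch stubs are illustrative assumptions: detailed (high-variance) windows are dispatched to a costly "attention" branch, smooth windows to a cheap "convolution" branch.

```python
def variance(window):
    """Mean squared deviation of pixel values in a flattened window.

    Stands in for the learned predictor's complexity score: high
    variance suggests texture/detail, low variance a smooth region.
    """
    mean = sum(window) / len(window)
    return sum((v - mean) ** 2 for v in window) / len(window)


def route_windows(windows, threshold=0.01):
    """Split window indices into (complex, simple) by predictor score.

    The threshold is an illustrative assumption; in CAMixer the split
    comes from a trained predictor producing a binary mask.
    """
    complex_idx, simple_idx = [], []
    for i, w in enumerate(windows):
        (complex_idx if variance(w) > threshold else simple_idx).append(i)
    return complex_idx, simple_idx


def camixer_step(windows, attention_branch, conv_branch, threshold=0.01):
    """Apply the expensive branch only where the predictor says so."""
    complex_idx, simple_idx = route_windows(windows, threshold)
    out = [None] * len(windows)
    for i in complex_idx:
        out[i] = attention_branch(windows[i])  # detailed regions
    for i in simple_idx:
        out[i] = conv_branch(windows[i])       # smooth regions
    return out
```

Because self-attention cost grows quadratically with the number of tokens while convolution is linear, skipping attention on smooth windows is where the quality-computation savings come from on 2K-8K inputs.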
The method is content-aware, dynamically adjusting the complexity of its neural operators to the content, and it composes readily with other acceleration strategies. These results make CAMixerSR a strong choice for large-image tasks.