28 May 2017 | Fisher Yu, Vladlen Koltun, Thomas Funkhouser
The paper introduces Dilated Residual Networks (DRNs), a novel architecture that retains high spatial resolution in convolutional networks for image classification. Traditional convolutional networks progressively reduce resolution, leading to a loss of spatial information that can hinder accuracy and transferability to downstream tasks requiring detailed scene understanding. DRNs address this issue by using dilation to increase the resolution of output feature maps without reducing the receptive field of individual neurons. The authors demonstrate that DRNs outperform non-dilated models in ImageNet classification without increasing model depth or complexity. They also introduce a method to remove gridding artifacts introduced by dilation, further improving performance. Additionally, DRNs show superior accuracy in downstream applications such as object localization and semantic segmentation, achieving state-of-the-art results with minimal fine-tuning. The key idea is to preserve spatial resolution throughout the network, which is crucial for understanding complex natural scenes and for tasks that require detailed spatial analysis.The paper introduces Dilated Residual Networks (DRNs), a novel architecture that retains high spatial resolution in convolutional networks for image classification. Traditional convolutional networks progressively reduce resolution, leading to a loss of spatial information that can hinder accuracy and transferability to downstream tasks requiring detailed scene understanding. DRNs address this issue by using dilation to increase the resolution of output feature maps without reducing the receptive field of individual neurons. The authors demonstrate that DRNs outperform non-dilated models in ImageNet classification without increasing model depth or complexity. They also introduce a method to remove gridding artifacts introduced by dilation, further improving performance. Additionally, DRNs show superior accuracy in downstream applications such as object localization and semantic segmentation, achieving state-of-the-art results with minimal fine-tuning. The key idea is to preserve spatial resolution throughout the network, which is crucial for understanding complex natural scenes and for tasks that require detailed spatial analysis.