25 Jan 2017 | Wenjie Luo*, Yujia Li*, Raquel Urtasun, Richard Zemel
This paper investigates the characteristics of receptive fields in deep convolutional neural networks (CNNs), focusing on their effective receptive fields (ERFs). The authors introduce the concept of ERF, which is the region within the input image that significantly influences the output of a unit in the network. They find that the distribution of impact within an ERF follows a Gaussian distribution and that the ERF only occupies a fraction of the theoretical receptive field. The paper analyzes how various architectural designs, nonlinear activations, dropout, sub-sampling, and skip connections affect the ERF. Empirical results support the theoretical findings, showing that random initializations often lead to a small ERF that grows during training.
The authors propose new initialization methods and architectural changes to increase the effective receptive field, such as adjusting the weights at the center of the convolution kernel and using sparse connections. The study highlights the importance of understanding ERFs for tasks requiring large receptive fields, such as semantic segmentation and object detection.
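
The claim that the impact inside an ERF is roughly Gaussian and covers only part of the theoretical receptive field can be illustrated with a small numerical sketch. The code below is not the authors' implementation. It assumes a simplified 1D setting: a stack of linear convolution layers with stride 1, no nonlinearities, and uniform kernel weights. Under those assumptions, the impact of each input position on the central output unit is the repeated self-convolution of the kernel. The function names `erf_profile` and `erf_stats`, and the use of a ±2 standard deviation window as the "effective" width, are illustrative choices.

```python
import numpy as np


def erf_profile(num_layers, kernel_size):
    """Impact of each input position on the central output unit.

    Assumes linear 1D conv layers with uniform weights and stride 1,
    so the profile is the kernel convolved with itself num_layers times.
    """
    kernel = np.ones(kernel_size) / kernel_size
    profile = np.array([1.0])
    for _ in range(num_layers):
        # 'full' mode grows the support by (kernel_size - 1) per layer.
        profile = np.convolve(profile, kernel)
    return profile


def erf_stats(num_layers, kernel_size):
    """Return (theoretical RF size, profile std, effective/theoretical ratio)."""
    profile = erf_profile(num_layers, kernel_size)
    theoretical_rf = num_layers * (kernel_size - 1) + 1

    # Positions measured from the center of the receptive field.
    positions = np.arange(profile.size) - (profile.size - 1) / 2
    std = np.sqrt(np.sum(profile * positions ** 2))

    # Illustrative convention: treat +/- 2 std as the effective width.
    effective_width = 4 * std
    return theoretical_rf, std, effective_width / theoretical_rf


if __name__ == "__main__":
    for n in (5, 20, 80):
        rf, std, ratio = erf_stats(n, kernel_size=3)
        print(f"layers={n:3d}  theoretical RF={rf:4d}  "
              f"std={std:6.2f}  effective/theoretical={ratio:.2f}")
```

In this simplified setting the theoretical receptive field grows linearly with depth, while the standard deviation of the impact profile grows only with the square root of depth, so the effective fraction shrinks as layers are added. Real networks with nonlinearities, learned weights, sub-sampling, or skip connections will behave differently in detail.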