Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

19 Apr 2014 | Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
This paper presents two visualization techniques for deep convolutional networks (ConvNets) trained for image classification, and establishes a connection between gradient-based visualization methods and deconvolutional networks.

The first technique, class model visualization, numerically optimizes an input image via back-propagation to maximize the score of a given class. Applied to a ConvNet trained on the ImageNet dataset, it produces images that depict each class concept as captured by the ConvNet's classification model.

The second technique, image-specific class saliency visualization, computes a saliency map as the gradient of the class score with respect to the input image, highlighting the regions of the image that most influence that score. These saliency maps support weakly supervised object localization and segmentation: the resulting localization method achieves a top-5 error of 46.4% on the ILSVRC-2013 localization test set.

Finally, the paper shows that gradient-based visualization can reconstruct the input of each layer of a ConvNet, similar to deconvolutional networks. The authors conclude that gradient-based techniques generalize the deconvolutional network reconstruction procedure and can be applied to any layer of a ConvNet, not just convolutional layers.
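The two techniques both reduce to computing the gradient of a class score with respect to the input image. The sketch below illustrates this with a hypothetical linear stand-in for the network (so the gradient is available in closed form); in the paper, the score and its gradient come from a real ConvNet via back-propagation. The regularized objective S_c(x) − λ‖x‖² for class model visualization, and the per-pixel channel-max of |∂S_c/∂x| for saliency, follow the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in "network": one linear layer over a flattened
# 8x8x3 image, so the class-score gradient is available in closed form.
# A real ConvNet would provide S_c and dS_c/dx via back-propagation.
H, W, C, K = 8, 8, 3, 10          # image height/width/channels, num classes
Wmat = rng.standard_normal((K, H * W * C)) * 0.1

def class_score(x, c):
    """Unnormalized class score S_c(x) for the linear stand-in."""
    return Wmat[c] @ x.ravel()

def score_grad(x, c):
    """Gradient dS_c/dx; for a linear model this is just the weight row."""
    return Wmat[c].reshape(H, W, C)

# --- Technique 1: class model visualization ---
# Gradient ascent on the image itself, maximizing S_c(x) - lam * ||x||^2,
# starting from a zero image.
c, lam, lr = 3, 0.1, 0.5
x = np.zeros((H, W, C))
for _ in range(100):
    x += lr * (score_grad(x, c) - 2 * lam * x)

# --- Technique 2: image-specific class saliency map ---
# For a given image, saliency at each pixel is the maximum over color
# channels of the absolute score gradient.
img = rng.standard_normal((H, W, C))
saliency = np.abs(score_grad(img, c)).max(axis=2)   # shape (H, W)
```

For the linear stand-in, the ascent converges to x* = w_c / (2λ), so the visualization simply recovers the class weights; with a deep ConvNet the same loop produces the class-appearance images shown in the paper.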