19 Apr 2017 | David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, and Antonio Torralba
The paper "Network Dissection: Quantifying Interpretability of Deep Visual Representations" by David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, and Antonio Torralba introduces a framework called Network Dissection to evaluate the interpretability of latent representations in Convolutional Neural Networks (CNNs). The framework assesses the alignment between hidden units and a set of semantic concepts, using a broadly and densely labeled dataset called Broden. The authors test the hypothesis that interpretability is equivalent to random linear combinations of units and compare the latent representations of various networks trained on different tasks. They also analyze the impact of training iterations, network initialization, depth, width, dropout, and batch normalization on interpretability. The results show that interpretability is not an axis-independent property and can be affected by training conditions, with deeper architectures generally showing higher interpretability. The paper concludes that interpretability is a distinct quality from discriminative power and provides insights into the characteristics of CNN models and training methods.The paper "Network Dissection: Quantifying Interpretability of Deep Visual Representations" by David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, and Antonio Torralba introduces a framework called Network Dissection to evaluate the interpretability of latent representations in Convolutional Neural Networks (CNNs). The framework assesses the alignment between hidden units and a set of semantic concepts, using a broadly and densely labeled dataset called Broden. The authors test the hypothesis that interpretability is equivalent to random linear combinations of units and compare the latent representations of various networks trained on different tasks. They also analyze the impact of training iterations, network initialization, depth, width, dropout, and batch normalization on interpretability. The results show that interpretability is not an axis-independent property and can be affected by training conditions, with deeper architectures generally showing higher interpretability. The paper concludes that interpretability is a distinct quality from discriminative power and provides insights into the characteristics of CNN models and training methods.