7 Jun 2016 | Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille
This paper presents a novel approach to semantic image segmentation using deep convolutional neural networks (DCNNs) and fully connected Conditional Random Fields (CRFs). The authors address the challenge of accurate pixel-level classification by combining the final layer outputs of a DCNN with a fully connected CRF. This method, named "DeepLab," overcomes the poor localization property of deep networks, which stems from their invariance to spatial transformations. The system achieves state-of-the-art performance on the PASCAL VOC-2012 dataset, reaching 71.6% Intersection over Union (IoU) accuracy on the test set. The key contributions include efficient dense computation of DCNN responses using the 'atrous' algorithm and the use of a fully connected pairwise CRF for accurate localization. The system is composed of two well-established modules, DCNNs and CRFs, and offers advantages in speed, accuracy, and simplicity.
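The 'atrous' (with holes) algorithm mentioned above corresponds to what is now commonly called dilated convolution: filter taps are spread apart so the receptive field grows without downsampling, allowing dense feature maps to be computed efficiently. The sketch below is illustrative only, not the authors' implementation; it uses PyTorch's built-in dilation support, and the layer names, channel counts, and rate of 2 are assumptions chosen for the example.

```python
# Minimal sketch of atrous/dilated convolution (illustrative, not DeepLab's code).
import torch
import torch.nn as nn

# A standard 3x3 convolution (rate 1) vs. an atrous 3x3 convolution (rate 2).
# With matching padding, both preserve the spatial resolution of the input,
# but the atrous filter covers a 5x5 receptive field using the same 9 weights.
standard_conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)              # rate 1
atrous_conv = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)    # rate 2

x = torch.randn(1, 64, 65, 65)   # dummy feature map
print(standard_conv(x).shape)    # torch.Size([1, 64, 65, 65])
print(atrous_conv(x).shape)      # torch.Size([1, 64, 65, 65]) -- same resolution, wider context
```

In the paper's pipeline this idea lets the network produce dense score maps at a reduced but controllable stride, which are then bilinearly upsampled and refined by the fully connected CRF.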