22 Apr 2017 | A. Garcia-Garcia, S. Orts-Escolano, S.O. Oprea, V. Villena-Martinez, and J. Garcia-Rodriguez
This paper provides a comprehensive review of deep learning techniques applied to semantic segmentation, a critical task in computer vision. The review covers the terminology, background concepts, datasets, challenges, and existing methods. It highlights the evolution from coarse to fine-grained inference, emphasizing the role of deep learning in achieving high accuracy and efficiency. The paper discusses key deep network architectures such as AlexNet, VGG-16, GoogLeNet, and ResNet, and their contributions to semantic segmentation. It also covers transfer learning, data preprocessing, and augmentation techniques. The review details popular datasets for 2D, 2.5D, and 3D semantic segmentation, including PASCAL VOC, NYUDv2, SUN3D, and ShapeNet Part. The paper then reviews various methods for semantic segmentation, focusing on decoder variants, integrating context knowledge (using CRFs, dilated convolutions, multi-scale prediction, feature fusion, and recurrent neural networks), and their contributions to accuracy, efficiency, and training simplicity. Finally, it concludes with a discussion on future research directions and the state of the art in deep learning for semantic segmentation.This paper provides a comprehensive review of deep learning techniques applied to semantic segmentation, a critical task in computer vision. The review covers the terminology, background concepts, datasets, challenges, and existing methods. It highlights the evolution from coarse to fine-grained inference, emphasizing the role of deep learning in achieving high accuracy and efficiency. The paper discusses key deep network architectures such as AlexNet, VGG-16, GoogLeNet, and ResNet, and their contributions to semantic segmentation. It also covers transfer learning, data preprocessing, and augmentation techniques. The review details popular datasets for 2D, 2.5D, and 3D semantic segmentation, including PASCAL VOC, NYUDv2, SUN3D, and ShapeNet Part. The paper then reviews various methods for semantic segmentation, focusing on decoder variants, integrating context knowledge (using CRFs, dilated convolutions, multi-scale prediction, feature fusion, and recurrent neural networks), and their contributions to accuracy, efficiency, and training simplicity. Finally, it concludes with a discussion on future research directions and the state of the art in deep learning for semantic segmentation.