SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

10 Oct 2016 | Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla, Senior Member, IEEE
The paper introduces SegNet, a deep fully convolutional neural network architecture for semantic pixel-wise image segmentation. The architecture consists of an encoder network, a corresponding decoder network, and a final pixel-wise classification layer. The encoder is topologically identical to the 13 convolutional layers of the VGG16 network. Each decoder upsamples its lower-resolution input feature maps using the max-pooling indices recorded during the corresponding encoder max-pooling step, so the upsampling itself does not have to be learned. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. This approach improves boundary delineation and reduces the number of parameters while keeping the network trainable end-to-end. SegNet is designed to be efficient in both memory and computation time during inference, and it has fewer trainable parameters than the competing architectures.
The paper compares SegNet with other popular architectures, namely FCN, DeepLab-LargeFOV, and DeconvNet, highlighting the trade-off between memory and accuracy. SegNet is evaluated on road scene and indoor scene segmentation tasks, where it achieves good segmentation performance with competitive inference time and the most memory-efficient inference among the compared architectures. The authors also provide a Caffe implementation and a web demo for users to try the system.
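The index-based upsampling described in the summary can be illustrated with a minimal sketch. It assumes non-overlapping 2x2 pooling windows on a single-channel feature map stored as nested Python lists. The function names `max_pool_2x2_with_indices` and `max_unpool` are hypothetical and are not taken from the authors' Caffe implementation. The trainable decoder convolutions that densify the sparse result are omitted.

```python
def max_pool_2x2_with_indices(fmap):
    """2x2 max pooling with stride 2 over a 2D list.

    Returns (pooled, indices), where indices[i][j] is the (row, col)
    position in `fmap` of the maximum that produced pooled[i][j].
    A trailing odd row or column is dropped in this sketch.
    """
    h, w = len(fmap), len(fmap[0])
    pooled, indices = [], []
    for r in range(0, h - 1, 2):
        pooled_row, index_row = [], []
        for c in range(0, w - 1, 2):
            # Collect the four values of the window with their positions.
            window = [
                (fmap[r + dr][c + dc], (r + dr, c + dc))
                for dr in (0, 1)
                for dc in (0, 1)
            ]
            value, pos = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(pos)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, out_shape):
    """Place each pooled value back at its recorded position.

    All other positions stay zero, so the output is sparse. No
    parameters are learned in this step.
    """
    h, w = out_shape
    out = [[0] * w for _ in range(h)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


# Example: a 4x4 feature map pooled to 2x2 and unpooled back to 4x4.
fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 1],
]
pooled, indices = max_pool_2x2_with_indices(fmap)
unpooled = max_unpool(pooled, indices, (4, 4))
```

Only the indices need to be kept from the encoder, not the full encoder feature maps, which is where the memory saving at inference comes from. The sparse `unpooled` map keeps each maximum at its original location, which is the property the summary credits for sharper boundary delineation.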