April 5, 2024 | T. S. Arulananth, P. G. Kuppusamy, Ramesh Kumar Ayyasamy, Saadat M. Alhashmi, M. Mahalakshmi, K. Vasanth, P. Chinnasamy
This paper applies the U-Net deep learning model to semantic segmentation of urban environments, specifically cityscape images. The U-Net architecture has an encoder-decoder structure that extracts hierarchical features from input images and reconstructs high-resolution feature maps. The encoder uses convolutional layers and downsampling to reduce spatial dimensions while increasing feature depth, which helps the network capture context. Batch normalization and dropout layers stabilize training and reduce overfitting. The decoder uses upsampling layers to restore spatial resolution, and skip connections combine low-level encoder features with high-level decoder features.
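To make the encoder-decoder description concrete, the following minimal sketch traces feature-map shapes through a U-Net-style network. It is not the authors' implementation: the function name `unet_shapes`, the base width of 64 channels, the depth of 4, the 19 output classes, and the use of size-preserving convolutions with 2x down- and upsampling are all illustrative assumptions.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def unet_shapes(
    in_shape: Shape,
    base_channels: int = 64,   # assumed width of the first encoder level
    depth: int = 4,            # assumed number of down/upsampling steps
    num_classes: int = 19,     # assumed number of output classes
) -> List[Tuple[str, Shape]]:
    """Trace feature-map shapes through a U-Net-style encoder-decoder.

    Assumes convolutions keep the spatial size, each encoder level halves
    height and width, and each decoder level doubles them again.
    """
    _, h, w = in_shape
    if h % (2 ** depth) or w % (2 ** depth):
        raise ValueError("height and width must be divisible by 2**depth")

    trace: List[Tuple[str, Shape]] = [("input", in_shape)]
    skips: List[Shape] = []
    ch = base_channels

    # Encoder: convolve, remember the result for the skip connection,
    # then halve the spatial size and double the channel count.
    for level in range(depth):
        trace.append((f"enc{level}_conv", (ch, h, w)))
        skips.append((ch, h, w))
        h, w = h // 2, w // 2
        trace.append((f"enc{level}_down", (ch, h, w)))
        ch *= 2

    trace.append(("bottleneck", (ch, h, w)))

    # Decoder: upsample (halving channels), concatenate the matching
    # encoder features along the channel axis, then convolve.
    for level in reversed(range(depth)):
        h, w = h * 2, w * 2
        ch //= 2
        skip_c, skip_h, skip_w = skips.pop()
        assert (skip_h, skip_w) == (h, w), "skip and decoder sizes must match"
        trace.append((f"dec{level}_concat", (ch + skip_c, h, w)))
        trace.append((f"dec{level}_conv", (ch, h, w)))

    # Final per-pixel classification layer: one channel per class.
    trace.append(("logits", (num_classes, h, w)))
    return trace


if __name__ == "__main__":
    for name, shape in unet_shapes((3, 256, 512)):
        print(f"{name:12s} {shape}")
```

The trace shows why skip connections matter for resolution: each `concat` step pairs upsampled, context-rich features with encoder features of the same spatial size, so fine detail lost during downsampling is available again when the decoder predicts per-pixel labels.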
The study evaluates the U-Net model on the Cityscapes dataset, demonstrating its effectiveness in achieving state-of-the-art results in image segmentation. The proposed model outperforms existing methods in terms of accuracy, mean Intersection over Union (IoU), and mean DICE scores. The paper also discusses the limitations and future scope of the method, including the need for improved reliability, generalization, and real-time computation. The results highlight the potential of the U-Net model in advancing urban planning, transportation management, and autonomous driving applications.
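The three reported metrics can be illustrated with a small sketch that uses their standard definitions: pixel accuracy is the fraction of correctly labeled pixels, per-class IoU is TP / (TP + FP + FN), and per-class Dice is 2TP / (2TP + FP + FN). The function name `segmentation_metrics`, the choice to skip classes absent from both maps, and the averaging over a single image are assumptions; the paper's exact evaluation protocol may differ.

```python
from typing import Dict, List, Sequence


def segmentation_metrics(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> Dict[str, float]:
    """Pixel accuracy, mean IoU, and mean Dice for two label maps.

    `pred` and `target` are 2D grids of integer class ids with the same
    shape. Classes that appear in neither map are left out of the means
    (an assumption; other conventions exist).
    """
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    correct = 0
    total = 0

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += 1
            if p == t:
                correct += 1
                tp[p] += 1
            else:
                fp[p] += 1  # predicted class p where it is not present
                fn[t] += 1  # missed a pixel of the true class t

    if total == 0:
        raise ValueError("label maps must contain at least one pixel")

    ious: List[float] = []
    dices: List[float] = []
    for k in range(num_classes):
        denom = tp[k] + fp[k] + fn[k]
        if denom == 0:
            continue  # class absent from both prediction and target
        ious.append(tp[k] / denom)
        dices.append(2 * tp[k] / (2 * tp[k] + fp[k] + fn[k]))

    return {
        "accuracy": correct / total,
        "mean_iou": sum(ious) / len(ious),
        "mean_dice": sum(dices) / len(dices),
    }


if __name__ == "__main__":
    pred = [[0, 0, 1], [1, 1, 2]]
    target = [[0, 1, 1], [1, 1, 2]]
    print(segmentation_metrics(pred, target, num_classes=3))
```

For any single class, Dice is never lower than IoU, since Dice = 2·IoU / (1 + IoU), so the two scores rank predictions for that class identically and differ mainly in scale.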