Understanding ICNet for Real-Time Semantic Segmentation on High-Resolution Images

This paper presents ICNet (image cascade network), a real-time semantic segmentation system for high-resolution images. The central difficulty it addresses is cutting the heavy computation of pixel-wise label inference without giving up segmentation quality. ICNet incorporates multi-resolution branches under proper label guidance: low-resolution processing yields a coarse semantic map cheaply, and a cascade feature fusion unit, together with a cascade label guidance strategy, integrates medium- and high-resolution features to refine that map gradually. In this way the architecture combines the efficiency of low-resolution processing with the inference quality of high-resolution input. The system achieves a 5× speedup in inference time and a 5× reduction in memory consumption, enabling real-time performance of 30 fps on 1024×2048 images. ICNet is evaluated on the challenging Cityscapes, CamVid, and COCO-Stuff datasets, and the comparison with existing methods shows that it balances speed and accuracy effectively. The paper concludes that ICNet provides a practical solution for real-time semantic segmentation on high-resolution images.
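
The coarse-to-fine refinement described in the summary above can be illustrated with a small sketch. This is not the authors' implementation: it assumes three branches whose maps differ by a factor of two in resolution, uses nearest-neighbour upsampling as a stand-in for whatever interpolation a real network would use, and leaves out the learned convolution and normalization layers entirely. The function names (upsample2x, fuse, cascade) are hypothetical and chosen only for this example.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D list of floats (assumed stand-in)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def fuse(coarse, fine):
    """Upsample the coarse map 2x, add the finer map element-wise, apply ReLU."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly 2x the size of the coarse map")
    return [[max(0.0, a + b) for a, b in zip(row_up, row_fine)]
            for row_up, row_fine in zip(up, fine)]


def cascade(low, medium, high):
    """Refine a coarse map in two steps: low -> medium -> high resolution."""
    return fuse(fuse(low, medium), high)


# Toy inputs (made-up values): 1x1, 2x2 and 4x4 single-channel maps.
low = [[1.0]]
medium = [[0.5, -2.0], [0.0, 1.0]]
high = [[0.0] * 4 for _ in range(4)]
high[0][0] = -5.0

result = cascade(low, medium, high)
for row in result:
    print(row)
```

Each fusion step only adds detail to a map that already exists at lower resolution, so most of the work can happen on the small inputs. In a real network the fused quantities would be multi-channel feature tensors rather than single-channel lists.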

ICNet for Real-Time Semantic Segmentation on High-Resolution Images

20 Aug 2018 | Hengshuang Zhao, Xiaojuan Qi, Xiaoyong Shen, Jianping Shi, Jiaya Jia