This paper introduces Multi-task Network Cascades (MNCs) for instance-aware semantic segmentation, addressing the challenge of identifying individual object instances in images. An MNC consists of three cascaded networks: one for differentiating instances, one for estimating masks, and one for categorizing objects. The networks share convolutional features, and the paper develops an end-to-end training algorithm that handles the causal dependencies between the stages. The method achieves state-of-the-art accuracy on the PASCAL VOC dataset at a test-time speed of 360 ms per image, surpassing previous systems in both accuracy and speed. It also performs well on object detection, outperforming Fast/Faster R-CNN systems. The authors won 1st place in the MS COCO 2015 segmentation competition using MNCs.
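The three-stage structure described above can be illustrated with a toy sketch of the data flow only. Everything in it is an assumption made for illustration: the function names are hypothetical, and the hand-written rules (connected components for boxes, thresholding for masks, mean brightness for labels) stand in for the learned networks. It does not reproduce the authors' architecture or their end-to-end training; it only shows that each stage reads the same shared features and consumes the output of the stage before it.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (row0, col0, row1, col1), end-exclusive
Grid = List[List[float]]


def shared_features(image: Grid) -> Grid:
    """Stand-in for the shared convolutional features (here: the image itself)."""
    return image


def propose_boxes(features: Grid) -> List[Box]:
    """Stage 1 (differentiate instances): toy rule, one bounding box per
    4-connected blob of positive cells, found by flood fill."""
    rows, cols = len(features), len(features[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if features[r][c] <= 0 or seen[r][c]:
                continue
            stack = [(r, c)]
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while stack:
                y, x = stack.pop()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and features[ny][nx] > 0:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((r0, c0, r1 + 1, c1 + 1))
    return boxes


def estimate_mask(features: Grid, box: Box) -> List[List[int]]:
    """Stage 2 (estimate masks): toy rule, threshold the features inside the box.
    Depends on the box produced by stage 1."""
    r0, c0, r1, c1 = box
    return [[1 if features[y][x] > 0 else 0 for x in range(c0, c1)]
            for y in range(r0, r1)]


def categorize(features: Grid, box: Box, mask: List[List[int]]) -> str:
    """Stage 3 (categorize objects): toy rule, label by mean masked value.
    Depends on both the box (stage 1) and the mask (stage 2)."""
    r0, c0, _, _ = box
    values = [features[r0 + i][c0 + j]
              for i, row in enumerate(mask)
              for j, m in enumerate(row) if m]
    if not values:
        return "dim"
    return "bright" if sum(values) / len(values) >= 0.5 else "dim"


def cascade(image: Grid) -> List[Tuple[Box, List[List[int]], str]]:
    """Run the three stages in order over one set of shared features."""
    feats = shared_features(image)
    results = []
    for box in propose_boxes(feats):
        mask = estimate_mask(feats, box)
        label = categorize(feats, box, mask)
        results.append((box, mask, label))
    return results
```

Calling `cascade` on a small grid of floats returns one `(box, mask, label)` triple per blob. The point of the sketch is the dependency chain: the mask stage cannot run without a box, and the categorization stage cannot run without both, which is the kind of causal dependency between stages that the paper's training algorithm is said to handle.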