24 Jan 2018 | Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick
Mask R-CNN is a simple, flexible, and efficient framework for instance segmentation. It extends Faster R-CNN by adding a branch that predicts an object mask for each region of interest, in parallel with the existing branch for bounding-box recognition, so each detected object also receives its own segmentation mask. The method is easy to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Mask R-CNN uses RoIAlign to improve the spatial alignment of region features, which significantly enhances mask accuracy. On the COCO dataset, Mask R-CNN outperforms existing single-model methods, including the 2016 challenge winners, in all three tracks: instance segmentation, bounding-box object detection, and person keypoint detection. The keypoint results show that the same framework also handles human pose estimation, demonstrating its versatility. The framework is compatible with various network architectures and can be extended to other tasks. The code is available at https://github.com/facebookresearch/Detectron.
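To make the alignment idea concrete, the following is a minimal pure-Python sketch of RoIAlign-style pooling, not the Detectron implementation. It assumes a single-channel feature map stored as a list of rows and a region given in real-valued feature-map coordinates. The function names (`bilinear_sample`, `roi_align`), the 2x2 output size, and the 2x2 sample points per bin are illustrative assumptions. The key point is that no coordinate is rounded to the feature grid: each value is read by bilinear interpolation.

```python
import math


def bilinear_sample(feature, y, x):
    """Read a 2D feature map (list of rows) at a real-valued (y, x) location."""
    h, w = len(feature), len(feature[0])
    # Clamp so that samples on or past the border remain defined.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, roi, out_size=2, samples=2):
    """Pool a region into an out_size x out_size grid without rounding.

    roi is (y1, x1, y2, x2) in feature-map coordinates (hypothetical layout).
    Each output bin averages samples x samples bilinear reads.
    """
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sample points inside bin (i, j).
                    yy = y1 + (i + (sy + 0.5) / samples) * bin_h
                    xx = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(feature, yy, xx)
            row.append(total / (samples * samples))
        out.append(row)
    return out


if __name__ == "__main__":
    # A horizontal ramp: the value at column x equals x.
    ramp = [[float(x) for x in range(6)] for _ in range(6)]
    print(roi_align(ramp, (0.5, 1.25, 3.5, 4.25)))
```

On the ramp input, each output bin comes out close to the x-coordinate of its own centre (about 2.0 and 3.5), even though the region boundaries fall between grid cells. A pooling scheme that first rounds the region to integer cells would shift those values, which is the kind of misalignment RoIAlign is meant to avoid.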