23 Jul 2016 | Amy Bearman, Olga Russakovsky, Vittorio Ferrari, and Li Fei-Fei
This paper introduces an approach to semantic segmentation based on point-level supervision, in which annotators simply point to objects in images. The method incorporates these point annotations, together with an objectness prior, into the training loss of a CNN, yielding significant gains in segmentation accuracy. On the PASCAL VOC 2012 benchmark, point-level supervision improves mean intersection over union (mIOU) by 12.9% relative to image-level supervision, and the paper shows a synergy between points and the objectness prior: their combination yields a 13% mIOU improvement. Point annotations are also cheap to collect, requiring only 1.1-1.2 times the annotation time of image-level labels and more than 10 times less than full (per-pixel) supervision. Comparing image-level, point-level, squiggle-level, and full supervision under a fixed annotation budget, the study finds that models trained with point-level supervision outperform all the alternatives, giving the best trade-off between annotation time and segmentation accuracy. The paper concludes that point-level supervision is a promising direction for semantic segmentation, balancing annotation cost against model accuracy.
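The core idea — a segmentation loss that applies cross-entropy only at the annotated points while an objectness prior softly supervises all remaining pixels — can be illustrated with a minimal numpy sketch. This is not the paper's exact formulation; the class layout (class 0 as background), the averaging, and the function name are assumptions made for illustration.

```python
import numpy as np

def point_supervised_loss(scores, points, objectness, eps=1e-8):
    """Illustrative point-supervised segmentation loss with an objectness prior.

    scores:     (C, H, W) per-class logits; class 0 is background (assumption)
    points:     list of (row, col, class_id) annotated points
    objectness: (H, W) prior probability that a pixel lies on some object
    """
    # Per-pixel softmax over the class dimension (numerically stabilized).
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    probs = e / e.sum(axis=0, keepdims=True)

    # Point term: cross-entropy evaluated only at the annotated pixels.
    point_loss = 0.0
    for r, c, cls in points:
        point_loss -= np.log(probs[cls, r, c] + eps)
    point_loss /= max(len(points), 1)

    # Objectness term over all pixels: high-objectness pixels should take
    # some foreground class, low-objectness pixels the background class.
    p_fg = probs[1:].sum(axis=0)  # total foreground probability per pixel
    obj_loss = -(objectness * np.log(p_fg + eps)
                 + (1.0 - objectness) * np.log(probs[0] + eps)).mean()

    return point_loss + obj_loss
```

With uniform logits the loss is high; boosting the score of the correct class at an annotated point lowers it, which is the gradient signal the point annotations provide during training.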