2014 | Saurabh Gupta, Ross Girshick, Pablo Arbeláez, Jitendra Malik
This paper addresses the problem of object detection and segmentation in RGB-D images, proposing a new geocentric embedding for depth images that encodes height above ground, angle with gravity, and horizontal disparity. The authors demonstrate that this embedding outperforms raw depth images in learning feature representations with convolutional neural networks. Their final object detection system achieves an average precision of 37.3%, a 56% relative improvement over existing methods. For instance segmentation, they propose a decision forest approach that classifies pixels in detection windows as foreground or background using shape and geocentric pose features. Additionally, they enhance semantic scene segmentation by using object detections to compute additional features for superpixels, achieving a 24% relative improvement over state-of-the-art methods.
The paper also discusses related work and provides experimental results on the NYUD2 dataset, showing significant improvements in contour detection, region proposal quality, and object detection performance.