Deep Learning for Detecting Robotic Grasps


21 Aug 2014 | Ian Lenz, Honglak Lee, Ashutosh Saxena
This paper presents a deep learning approach to robotic grasp detection in RGB-D scenes. The method addresses two main challenges: efficiently evaluating a large set of candidate grasps, and handling multimodal input. To address the first, the authors propose a two-step cascaded system of two deep networks: a first, smaller network quickly filters out unlikely grasps, and a second, larger network re-evaluates the surviving top candidates for accurate detection. To address the second, a structured regularization method encourages each learned feature to use only a subset of the input modalities, making features more robust to noise in any single modality.

Potential grasps are represented as oriented rectangles in image space, with features extracted from RGB-D data and surface normals. The networks are trained with a combination of unsupervised feature learning and supervised fine-tuning, with the structured regularization applied during training. Evaluated on an RGB-D grasping dataset, the method improves both recognition and detection performance over state-of-the-art rectangle-based grasp detection while significantly reducing computation time. Implemented on two robotic platforms (Baxter and PR2), the system achieves grasp success rates of 84% and 89%, respectively, and generalizes to object classes not seen during training.
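The following is a minimal sketch of the two-stage cascade idea, assuming precomputed feature vectors for each candidate grasp rectangle. The network sizes, feature dimensionality, and top-k cutoff are illustrative placeholders, not the authors' exact settings, and the simple feed-forward scorer stands in for the paper's pretrained deep networks.

```python
"""Two-stage cascaded grasp detection sketch (hypothetical sizes)."""
import numpy as np

def mlp_score(x, weights):
    """Score a batch of candidates with a feed-forward net
    (sigmoid hidden units, linear output)."""
    h = x
    for W, b in weights[:-1]:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))  # sigmoid hidden layer
    W, b = weights[-1]
    return (h @ W + b).ravel()  # one graspability score per candidate

def init_mlp(sizes, rng):
    """Random small-weight initialization for each layer."""
    return [(rng.standard_normal((m, n)) * 0.01, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

rng = np.random.default_rng(0)
n_candidates, feat_dim = 5000, 1728       # hypothetical search space / feature size
candidates = rng.standard_normal((n_candidates, feat_dim))

small_net = init_mlp([feat_dim, 50, 1], rng)         # stage 1: fast, few hidden units
large_net = init_mlp([feat_dim, 200, 200, 1], rng)   # stage 2: slower, more accurate

# Stage 1: the cheap network scores every candidate rectangle.
coarse = mlp_score(candidates, small_net)
top_k = np.argsort(coarse)[-100:]                    # keep the 100 best candidates

# Stage 2: the expensive network re-ranks only the survivors.
fine = mlp_score(candidates[top_k], large_net)
best = top_k[np.argmax(fine)]
print(f"best candidate index: {best}")
```

The design point the cascade exploits is that most candidate rectangles are obviously bad: a small network is enough to reject them, so the large network's cost is paid only on a short list.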
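The structured regularization can be sketched as a group penalty on the first-layer weight matrix, where each hidden unit's incoming weights are split into blocks, one per input modality (e.g. depth, RGB, surface normals), and whole blocks are pushed toward zero so each feature uses only a subset of modalities. The block boundaries and the L2 group norm used here are illustrative assumptions; the paper's penalty is a related group-wise regularizer, not necessarily this exact form.

```python
"""Modality-wise structured regularization sketch (assumed layout)."""
import numpy as np

def multimodal_penalty(W, modality_slices):
    """Sum over hidden units and modalities of the L2 norm of each
    (unit, modality) weight block -- the term is zero for a unit/modality
    pair only when that unit ignores the modality entirely."""
    total = 0.0
    for sl in modality_slices:
        block = W[sl, :]                             # weights from one modality
        total += np.sqrt((block ** 2).sum(axis=0)).sum()
    return total

# Hypothetical layout: 576 inputs per modality, 3 modalities, 200 hidden units.
slices = [slice(0, 576), slice(576, 1152), slice(1152, 1728)]
W = np.random.default_rng(1).standard_normal((1728, 200))

lam = 1e-3                                           # regularization strength
reg = lam * multimodal_penalty(W, slices)            # add this to the training loss
print(f"structured regularization term: {reg:.3f}")
```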