RealDex: Towards Human-like Grasping for Robotic Dexterous Hand

21 Feb 2024 | Yumeng Liu, Yaxun Yang, Youzhuo Wang, Xiaofei Wu, Jiamin Wang, Yichen Yao, Sören Schwertfeger, Sibei Yang, Wenping Wang, Jingyi Yu, Xuming He, Yuexin Ma
RealDex is a new dataset capturing real-world dexterous hand grasping motions with human-like behavior, enriched by multi-view and multimodal visual data. The dataset was collected using a teleoperation system that synchronizes human and robotic hand poses in real time. It includes 52 objects with diverse shapes, sizes, and materials, 2.6k grasping motion sequences, and approximately 955k frames of visual data. By providing precise ground-truth data for real dexterous hand motions alongside multi-view, multimodal visual data, RealDex supports both training dexterous hands to mimic human movements more naturally and developing vision-based manipulation algorithms.

The paper also introduces a novel dexterous grasping motion generation framework that uses Multimodal Large Language Models (MLLMs) to select the most natural, physically plausible, and human-like grasp poses. The framework has two stages: grasp pose generation and motion synthesis. In the first stage, a conditional Variational Autoencoder (cVAE) generates candidate grasp poses, which are then aligned with human preferences through MLLM-based selection. In the second stage, an auto-regressive motion trajectory predictor synthesizes the complete hand motion sequence for the selected pose. Extensive experiments on RealDex and other open datasets demonstrate the superior performance of the proposed method in generating human-like dexterous grasping motions. The dataset and code will be made available upon publication.

RealDex is significant for robotic dexterous hand-object grasping because it replicates human hand-object interactions, provides precise ground truth for real dexterous hand motions, and offers multi-view, multimodal visual data for vision-based manipulation research.
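The two-stage pipeline described above can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the dimensions, the linear "decoder" standing in for the trained cVAE, the plausibility score standing in for MLLM preference ranking, and the interpolation-style rollout standing in for the learned auto-regressive predictor are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (a real dexterous hand has many more DoF than this).
D_LATENT, D_COND, D_POSE = 8, 16, 24

# Stand-in for a trained cVAE decoder: (latent, object feature) -> hand pose.
W = rng.standard_normal((D_LATENT + D_COND, D_POSE)) * 0.1

def decode(z, cond):
    """Decode one latent sample into a candidate grasp-pose vector."""
    return np.concatenate([z, cond]) @ W

def generate_candidates(cond, k=5):
    """Stage 1a: sample k latents from the prior, decode k candidate poses."""
    return [decode(rng.standard_normal(D_LATENT), cond) for _ in range(k)]

def preference_score(pose):
    """Stage 1b placeholder for MLLM-based human-preference ranking.
    Here we simply penalize extreme joint values as a crude plausibility proxy."""
    return -np.abs(pose).max()

def synthesize_motion(start, goal, steps=10):
    """Stage 2: auto-regressive rollout; each step predicts the next pose from
    the current one (here, a fixed fraction of the remaining gap)."""
    poses, cur = [start], start
    for _ in range(steps):
        cur = cur + 0.3 * (goal - cur)  # next-step prediction
        poses.append(cur)
    return np.stack(poses)

object_feature = rng.standard_normal(D_COND)   # e.g. an encoded point cloud
candidates = generate_candidates(object_feature)
best = max(candidates, key=preference_score)   # "MLLM"-selected grasp pose
trajectory = synthesize_motion(np.zeros(D_POSE), best)
print(trajectory.shape)  # (11, 24): start pose plus 10 predicted steps
```

The key structural point the sketch preserves is that pose selection and motion synthesis are decoupled: candidates are generated and ranked first, and only the winning pose conditions the trajectory rollout.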