DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion


15 Jan 2019 | Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martín-Martín, Cewu Lu, Li Fei-Fei, Silvio Savarese
DenseFusion is a novel framework for estimating 6D object pose from RGB-D images. It combines color and depth information through a dense fusion network, enabling accurate and real-time pose estimation. The framework processes RGB and depth data separately, then fuses them at the pixel level to extract dense feature embeddings. An end-to-end iterative refinement procedure further improves pose estimation while maintaining real-time performance. DenseFusion outperforms state-of-the-art methods on the YCB-Video and LineMOD datasets, achieving higher accuracy and faster inference. It is also deployed on a real robot for object grasping and manipulation tasks.

The method is robust to heavy occlusion and segmentation errors, and it demonstrates strong performance in real-world applications. The key contributions include a principled way to combine color and depth information and an iterative refinement procedure that enhances model performance without relying on post-processing steps. The model is evaluated on two benchmark datasets, showing significant improvements in pose estimation accuracy and efficiency. The method is also shown to be effective in robotic grasping tasks, demonstrating its practical utility.
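The pixel-level fusion described above can be sketched with plain arrays. This is a minimal illustration, not the paper's implementation: the random arrays stand in for per-pixel color features (from a CNN) and per-point geometry features (from a PointNet-style encoder), and the feature sizes and sample count are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pts = 500  # sampled pixels/points inside the segmented object mask

# Stand-ins for the two separately computed embeddings (dimensions illustrative)
color_emb = rng.standard_normal((n_pts, 32))  # per-pixel color features
geo_emb = rng.standard_normal((n_pts, 32))    # per-point geometry features

# Dense fusion: concatenate the two embeddings pixel by pixel,
# then append a pooled global feature to every per-pixel feature
pixelwise = np.concatenate([color_emb, geo_emb], axis=1)   # shape (n_pts, 64)
global_feat = pixelwise.mean(axis=0)                       # shape (64,)
dense = np.concatenate(
    [pixelwise, np.tile(global_feat, (n_pts, 1))], axis=1  # shape (n_pts, 128)
)

# In DenseFusion, each fused per-pixel feature would feed a shared network
# predicting a candidate pose plus a confidence score; the most confident
# candidate is selected and then iteratively refined.
print(dense.shape)
```

The point of fusing densely rather than globally is that each pixel retains its own combined color-geometry descriptor, which is what makes the method robust when occlusion hides most of the object.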