| Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, Andrew Blake
This paper presents a method for real-time human pose recognition in parts from single depth images. The approach uses a novel intermediate body parts representation to map the challenging pose estimation problem into a simpler per-pixel classification task. Training on a large and diverse synthetic dataset lets the classifier estimate body parts in a way that is invariant to pose, body shape, clothing, and similar variation. The system runs at 200 frames per second on consumer hardware and achieves high accuracy on both synthetic and real test sets. The algorithm generates confidence-scored 3D proposals for several body joints by reprojecting the classification result into world space and finding local modes of the resulting spatial distribution.
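The final step, turning per-pixel classification into joint proposals, reprojects each classified pixel into 3D using the calibrated depth camera and then runs a weighted mean shift to find dense clusters for each body part; a pixel's weight is its class probability times the square of its depth, a proxy for the world surface area it covers. The following is a minimal single-mode sketch in Python under stated assumptions: the bandwidth, convergence tolerance, and function name are illustrative guesses, not the paper's learned per-part values.

import numpy as np

def joint_proposal(points_3d, probs, depths, bandwidth=0.065, iters=30):
    # points_3d: (N, 3) camera-space positions of pixels assigned to one body
    # part (obtained by reprojecting pixel coordinates with the known camera
    # intrinsics); probs: their per-pixel class probabilities; depths: their
    # depth values in metres.  Weighting by probability * depth^2 approximates
    # the world surface area each pixel covers, so distant pixels are not
    # under-counted.  Bandwidth and iteration count are assumed values.
    w = probs * depths ** 2
    mode = points_3d[np.argmax(w)]                  # seed at the strongest pixel
    for _ in range(iters):
        k = w * np.exp(-np.sum((points_3d - mode) ** 2, axis=1) / bandwidth ** 2)
        new_mode = (k[:, None] * points_3d).sum(axis=0) / k.sum()
        if np.linalg.norm(new_mode - mode) < 1e-5:  # converged to a local mode
            break
        mode = new_mode
    confidence = k.sum()                            # density reaching this mode
    return mode, confidence

In the paper, several modes are retained per part to give multiple confidence-scored hypotheses, each bandwidth is learned per part, and each surface mode is pushed back along the camera's z axis by a learned offset so the proposal lies nearer the joint's interior rather than on the body surface; the sketch above omits these refinements.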
The method is based on a randomized decision forest classifier trained on a large synthetic dataset generated from motion capture data. The classifier uses simple depth comparison features that are invariant to the subject's distance from the camera and cheap to compute. The classifier is implemented on the GPU, so every pixel can be evaluated in parallel in real time. The system is evaluated on both real and synthetic depth images, showing high accuracy and robustness to varying body shapes and sizes, and the results demonstrate that the method generalizes well to unseen poses while outperforming existing approaches in both accuracy and speed.
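The core feature is a two-probe depth comparison, f(x) = d(x + u/d(x)) - d(x + v/d(x)), where the offsets u and v are scaled by the inverse depth at pixel x so the response does not change as the person moves toward or away from the camera. The sketch below illustrates this feature and a single-tree lookup in Python; the nested-dict tree layout, the background-depth constant, and the function names are illustrative assumptions, not the paper's actual (GPU) implementation.

import numpy as np

BACKGROUND_DEPTH = 1e6  # large constant standing in for background / off-image probes (assumed value)

def depth_feature(depth, x, y, u, v):
    # Depth-comparison feature: compare the depths at two probe pixels whose
    # offsets u and v are divided by the depth at (x, y), which makes the
    # response invariant to the subject's distance from the camera.
    d = float(depth[y, x])
    def probe(offset):
        ox, oy = offset
        px = int(round(x + ox / d))
        py = int(round(y + oy / d))
        if 0 <= py < depth.shape[0] and 0 <= px < depth.shape[1]:
            return float(depth[py, px])
        return BACKGROUND_DEPTH  # probes falling off the image behave like background
    return probe(u) - probe(v)

def classify_pixel(tree, depth, x, y):
    # Walk one randomized decision tree: at each split node evaluate the
    # feature with that node's learned offsets, compare to its threshold,
    # branch left or right, and return the class histogram stored at the leaf.
    node = tree
    while 'leaf' not in node:
        f = depth_feature(depth, x, y, (node['ux'], node['uy']), (node['vx'], node['vy']))
        node = node['left'] if f < node['tau'] else node['right']
    return node['leaf']  # per-body-part probability distribution

In a forest, each pixel is pushed through every tree and the leaf distributions are averaged to give the final per-pixel body part posterior.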
The method is compared with related work, including nearest-neighbor whole-pose matching and the prior state of the art. The results show that the proposed method achieves state-of-the-art accuracy and is significantly faster than existing approaches. It is also evaluated in a variety of challenging scenarios, including full body rotations and multiple people, where it continues to localize body joints accurately. A comparison with a recent approach based on a time-of-flight camera shows that the proposed method is both significantly faster and more accurate. These results suggest the method is effective across a wide range of scenarios and applicable to gaming, human-computer interaction, security, and healthcare.