Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

11 Apr 2019 | Georgios Pavlakos*, Vasileios Choutas*, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, and Michael J. Black
The paper presents a method to capture 3D human body pose, hand pose, and facial expression from a single monocular image. The authors develop a new unified 3D model, SMPL-X, which extends the SMPL body model with fully articulated hands and an expressive face. To fit SMPL-X to images, they propose SMPLify-X, an optimization-based approach that first detects 2D features and then fits the model parameters to them. Key contributions include:

1. **2D Feature Detection**: They detect 2D features corresponding to the face, hands, and feet.
2. **Neural Network Pose Prior**: They train a new neural-network pose prior on a large MoCap dataset.
3. **Interpenetration Penalty**: They define a new interpenetration penalty that is both fast and accurate.
4. **Gender Detection**: They automatically detect gender and use the appropriate body model (male, female, or neutral).
5. **Efficient Implementation**: Their PyTorch implementation achieves a speedup of more than 8× over Chumpy.

The method is evaluated on a new curated dataset with pseudo ground truth, showing significant improvements over related models. The authors regard this work as a significant step toward expressive capture of bodies, hands, and faces from a single RGB image. The models, code, and data are available for research purposes.
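The core idea of the fitting step can be sketched in miniature: minimize a 2D reprojection error plus a pose-prior term over the model parameters. The toy sketch below is not the authors' implementation; it assumes joints are a *linear* function of a pose vector (a stand-in for SMPL-X's articulated skeleton), a weak-perspective camera, and a simple L2 penalty in place of the learned neural-network prior. All names (`fit_pose`, `project`) are hypothetical.

```python
import numpy as np

def project(joints_3d, scale=1.0):
    # Weak-perspective camera: drop depth and apply a global image scale.
    return scale * joints_3d[:, :2]

def fit_pose(kp_2d, J0, B, scale=1.0, prior_w=1e-3, lr=0.05, iters=300):
    """Gradient-descent fit of pose parameters theta.

    Toy stand-in for optimization-based fitting: joints are a linear
    function of theta, J(theta) = J0 + B @ theta, and the objective is
    the squared 2D reprojection error plus an L2 pose prior (in place
    of the learned prior used in practice).
    """
    n_params = B.shape[-1]
    theta = np.zeros(n_params)
    for _ in range(iters):
        joints = J0 + np.einsum('jcp,p->jc', B, theta)     # (J, 3)
        resid = project(joints, scale) - kp_2d             # (J, 2)
        # Analytic gradient: d(proj)/d(theta) = scale * B[:, :2, :]
        grad = 2 * np.einsum('jc,jcp->p', resid, scale * B[:, :2, :])
        grad += 2 * prior_w * theta                        # prior term
        theta -= lr * grad
    return theta

# Usage on synthetic data: recover a known pose from its 2D projections.
rng = np.random.default_rng(0)
J0 = rng.normal(size=(10, 3))
B = 0.5 * rng.normal(size=(10, 3, 4))
theta_true = rng.normal(size=4)
kp_2d = project(J0 + np.einsum('jcp,p->jc', B, theta_true))
theta_hat = fit_pose(kp_2d, J0, B)
```

The real system replaces each simplification with something stronger: a full articulated body model, a learned pose prior, an interpenetration penalty, and PyTorch autograd instead of hand-derived gradients.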