27 Sep 2019 | Nikos Kolotouros*1, Georgios Pavlakos*1, Michael J. Black2, Kostas Daniilidis1
This paper introduces SPIN, a novel approach for 3D human pose and shape estimation that combines regression and optimization methods in a self-improving loop. The key idea is to use a deep network to regress model parameters, which then initializes an iterative optimization process to fit the model to 2D joints. The optimized parameters are then used to supervise the network, creating a feedback loop that improves both the network and the optimization process. This collaboration allows the network to benefit from accurate model fits, while the optimization process benefits from the network's initial estimates, leading to faster and more accurate results.
During training, each iteration runs both components in sequence. The regression network predicts the pose and shape parameters of the SMPL parametric body model from the image, and this prediction initializes the optimization routine rather than a generic default pose. The routine then refines the parameters so that the model's projected joints match the 2D joint annotations, and the refined parameters serve as the supervision target for the network. As the network improves, the optimization starts from better initial estimates and returns better fits, which in turn provide better supervision.
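The loop described above can be illustrated with a deliberately small toy problem. The sketch below is not the authors' implementation: it swaps SMPL for a hypothetical linear "body model" `A`, the deep network for a linear regressor `W`, and the fitting routine for a few gradient steps on the 2D reprojection error. All names, dimensions, and step sizes are assumptions chosen for illustration. What it preserves is the structure: regress, initialize the fit from the regression, then supervise the regressor with the fitted parameters, without ever showing it the true parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions): samples, feature size, parameter size, joints.
N, F, P, J = 256, 8, 4, 6

# Hypothetical linear stand-in for a body model: parameters -> flattened 2D joints.
A = rng.normal(size=(2 * J, P))

theta_true = rng.normal(size=(N, P))       # hidden parameters, never used for training
X = theta_true @ rng.normal(size=(P, F))   # toy "image features"
joints_2d = theta_true @ A.T               # 2D joint annotations, the only supervision source

FIT_STEP = 1.0 / np.linalg.norm(A, 2) ** 2         # stable step for the fitting objective
NET_LR = 1.0 / (np.linalg.norm(X, 2) ** 2 / N)     # stable step for the regressor update


def reprojection_error(theta):
    """Mean squared distance between projected and annotated 2D joints."""
    return float(np.mean(np.sum((theta @ A.T - joints_2d) ** 2, axis=1)))


def fit(theta_init, n_iters=5):
    """Iterative fitting: a few gradient steps on the 2D reprojection error."""
    theta = theta_init.copy()
    for _ in range(n_iters):
        residual = theta @ A.T - joints_2d
        theta -= FIT_STEP * residual @ A
    return theta


def train(n_epochs=500):
    """Regress, fit starting from the regression, then supervise with the fit."""
    W = np.zeros((F, P))
    history = []
    for _ in range(n_epochs):
        theta_reg = X @ W                  # network prediction
        theta_opt = fit(theta_reg)         # optimization initialized by the network
        # Parameter-space supervision: pull the prediction toward the fitted parameters.
        W -= NET_LR * X.T @ (theta_reg - theta_opt) / N
        # Error against the hidden truth, recorded for monitoring only.
        history.append(float(np.mean(np.sum((theta_reg - theta_true) ** 2, axis=1))))
    return W, history


if __name__ == "__main__":
    W, history = train()
    print("parameter error, first vs. last epoch:", history[0], history[-1])
    print("fit error from zero init:   ", reprojection_error(fit(np.zeros((N, P)))))
    print("fit error from network init:", reprojection_error(fit(X @ W)))
```

In this toy setting the fit is deliberately truncated to five steps, so a better starting point directly yields a better fit. That is the mechanism by which the two components help each other here. In the real problem the fitting objective is non-convex and 2D joints alone leave the pose ambiguous, so this linear example should not be read as evidence about the method's actual behavior.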
The approach is particularly effective when 3D ground truth is scarce. It outperforms state-of-the-art model-based pose estimation methods by significant margins across various benchmarks and datasets. It also applies when no image with corresponding 3D ground truth is available at all, because the optimization module provides the necessary model-based supervision from 2D joints alone.
The paper also discusses related work, covering both optimization-based and regression-based methods for 3D human pose estimation. It highlights the strengths and weaknesses of each family and argues for a collaborative strategy that leverages the strengths of both. SPIN demonstrates the effectiveness of this collaboration, achieving state-of-the-art results in 3D human pose and shape estimation. The paper describes the implementation in detail: a deep neural network for regression, an iterative optimization routine for fitting the model, and the self-improving loop that couples them. The results show that SPIN outperforms previous approaches in both accuracy and efficiency, making it a promising method for 3D human pose and shape estimation.