Articulated Human Detection with Flexible Mixtures of Parts

2013-12-01 | Yang, Yi; Ramanan, Deva
This paper presents a method for articulated human detection and pose estimation based on flexible mixtures of parts. It introduces a representation of deformable part models that captures both spatial relations between part locations and co-occurrence relations between part mixtures. The model uses mixtures of small, non-oriented parts to capture the dependency of global geometry on local appearance, and it represents a large set of global mixtures efficiently by composing local mixtures.

When the relations between parts are tree structured, the model is optimized with dynamic programming, which makes it efficient enough to serve as a detector that searches over scales and image locations. It is trained with a structured SVM solver. Processing a typical benchmark image takes about one second, which leaves open the possibility of real-time performance with further speedups.

The model is evaluated on two standard benchmarks, the Parse and Buffy datasets. It outperforms previously published results on both, including a state-of-the-art deformable part model, while running orders of magnitude faster. The paper also examines how the number of parts (K) and the number of mixtures (T) affect pose estimation accuracy: increasing either improves performance, likely because the model can then represent more orientations and more foreshortening.

Finally, the paper introduces new evaluation criteria for pose estimation and human detection, including PCK (probability of correct keypoint) and APK (average precision of keypoints). These criteria are self-consistent and address limitations of earlier criteria. The authors conclude that flexible mixtures of parts achieve state-of-the-art accuracy for articulated human detection and pose estimation at a much lower computational cost than prior methods.
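To make the dynamic programming step concrete, here is a minimal sketch of max-sum inference on a tree of parts. It is not the authors' implementation: the state encoding (a location index paired with a mixture type), the function name `best_configuration`, the toy scores, and the quadratic deformation term are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, int]  # (location index, mixture type); a hypothetical encoding


def best_configuration(
    parent: List[int],
    unary: List[Dict[State, float]],
    pairwise: Callable[[int, State, State], float],
) -> Tuple[float, List[State]]:
    """Max-sum dynamic programming over a tree of parts.

    parent[i] is the parent of part i (-1 for the root, which must be part 0);
    parts are assumed to be ordered so that parent[i] < i.
    unary[i][s] is the local score of part i in state s.
    pairwise(i, s_child, s_parent) scores part i relative to its parent.
    """
    n = len(parent)
    # msg[i][s]: best score of the subtree rooted at part i when i is in state s.
    msg = [dict(u) for u in unary]
    # argbest[i][s_parent]: best state of part i given its parent's state.
    argbest: List[Dict[State, State]] = [{} for _ in range(n)]

    # Leaves-to-root pass; children always have larger indices than parents.
    for i in range(n - 1, 0, -1):
        p = parent[i]
        for sp in unary[p]:
            best_s = max(msg[i], key=lambda s: msg[i][s] + pairwise(i, s, sp))
            argbest[i][sp] = best_s
            msg[p][sp] += msg[i][best_s] + pairwise(i, best_s, sp)

    # Choose the best root state, then backtrack from root to leaves.
    root_state = max(msg[0], key=msg[0].get)
    result: List[State] = [root_state] * n
    for i in range(1, n):
        result[i] = argbest[i][result[parent[i]]]
    return msg[0][root_state], result


# Toy chain of three parts (all numbers below are made up for illustration).
parent = [-1, 0, 1]
all_states = [(loc, t) for loc in range(4) for t in range(2)]
targets = [0, 1, 2]    # location each part's local score prefers
preferred = [0, 1, 1]  # mixture type each part's local score prefers
unary = [
    {s: -abs(s[0] - targets[i]) + (0.5 if s[1] == preferred[i] else 0.0)
     for s in all_states}
    for i in range(3)
]


def pairwise_score(i: int, s: State, sp: State) -> float:
    # Deformation: the child prefers to sit one step past its parent.
    # Co-occurrence: a small bonus when child and parent share a mixture type.
    return -float((s[0] - sp[0] - 1) ** 2) + (0.25 if s[1] == sp[1] else 0.0)


score, states = best_configuration(parent, unary, pairwise_score)
print(score, states)
```

The cost of this sketch grows with the square of the number of states per part, since every child state is tried against every parent state. That is acceptable for a toy example but would be too slow for dense image locations without further speedups.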
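The PCK criterion can be illustrated with a short sketch. It assumes the commonly cited definition, in which a predicted keypoint counts as correct when it lies within alpha * max(h, w) pixels of the ground-truth keypoint, with h and w the height and width of the person's bounding box. The summary above does not spell out this definition, so treat the threshold rule, the function name `pck`, and the default `alpha` as assumptions.

```python
from math import hypot
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pck(
    predicted: Sequence[Sequence[Point]],
    ground_truth: Sequence[Sequence[Point]],
    boxes: Sequence[Tuple[float, float]],
    alpha: float = 0.1,
) -> float:
    """Fraction of keypoints within alpha * max(w, h) of the ground truth.

    predicted[n] and ground_truth[n] hold the keypoints of person n in the
    same order; boxes[n] is that person's bounding box as (width, height).
    The threshold rule is an assumed definition, not taken from the summary.
    """
    correct = 0
    total = 0
    for pred_pts, true_pts, (w, h) in zip(predicted, ground_truth, boxes):
        threshold = alpha * max(w, h)
        for (px, py), (gx, gy) in zip(pred_pts, true_pts):
            total += 1
            if hypot(px - gx, py - gy) <= threshold:
                correct += 1
    return correct / total if total else 0.0


# One person with two keypoints in a 100 x 200 box (made-up numbers).
truth: List[List[Point]] = [[(0.0, 0.0), (50.0, 50.0)]]
preds: List[List[Point]] = [[(3.0, 4.0), (50.0, 80.0)]]
sizes = [(100.0, 200.0)]
print(pck(preds, truth, sizes, alpha=0.1))
```

Scaling the threshold by the box size means the same relative error is tolerated for people who appear large or small in the image.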