25 Feb 2019 | Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang
This paper addresses human pose estimation, focusing on learning reliable high-resolution representations. Unlike existing methods that recover high-resolution representations from low-resolution outputs, the proposed High-Resolution Net (HRNet) maintains high-resolution representations throughout the whole process. HRNet starts with a high-resolution subnetwork as the first stage and gradually adds high-to-low resolution subnetworks one by one, connecting the multi-resolution subnetworks in parallel. Repeated multi-scale fusions let each resolution receive information from the other parallel representations, which enriches the high-resolution representations. As a result, the predicted keypoint heatmaps are more accurate and spatially more precise. Empirical results on two benchmark datasets (COCO and MPII) demonstrate the effectiveness of HRNet for keypoint detection, and results on PoseTrack show its advantage in pose tracking. The code and models are publicly available.
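The multi-scale fusion idea can be illustrated with a toy sketch. This is not the authors' implementation: it assumes two single-channel feature maps stored as nested lists with even height and width, and it uses 2x2 average pooling and nearest-neighbor upsampling as stand-ins for the learned convolutions a real network would use. The helper names (`downsample2x`, `upsample2x`, `add_maps`, `fuse`) are hypothetical.

```python
# Illustrative sketch only: single-channel feature maps as nested lists.
# Average pooling and nearest-neighbor upsampling are assumptions standing in
# for the learned resampling layers of a real network.

def downsample2x(fmap):
    """Halve the resolution with 2x2 average pooling (even sizes assumed)."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def upsample2x(fmap):
    """Double the resolution with nearest-neighbor upsampling."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def add_maps(a, b):
    """Element-wise sum of two maps with identical shapes."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse(high, low):
    """One exchange step: each branch adds the other's resampled features."""
    new_high = add_maps(high, upsample2x(low))
    new_low = add_maps(low, downsample2x(high))
    return new_high, new_low


# A 4x4 high-resolution branch and a 2x2 low-resolution branch.
high = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0]]
low = [[0.5, 1.0],
       [1.5, 2.0]]

# Repeated fusion: both branches keep their own resolution throughout.
for _ in range(2):
    high, low = fuse(high, low)
```

The point of the sketch is that the high-resolution branch is never discarded and rebuilt: it keeps its full spatial size across every fusion step while still absorbing context from the coarser branch.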