NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

11 Apr 2016 | Amir Shahroudy†‡, Jun Liu†, Tian-Tsong Ng‡, Gang Wang†,*
The NTU RGB+D dataset is a large-scale benchmark for 3D human activity analysis, containing more than 56,000 video samples and 4 million frames collected from 40 distinct subjects. It covers 60 action classes spanning daily, mutual, and health-related actions, and provides RGB, depth, 3D skeleton, and infrared data captured from 80 distinct camera viewpoints. The dataset addresses the limitations of existing benchmarks by offering far more samples, action classes, and camera views, which makes it possible to apply data-hungry learning techniques to 3D human activity analysis.

The paper also introduces a new recurrent neural network structure, the part-aware LSTM (P-LSTM), which models long-term temporal correlations of body-part features for action classification. The P-LSTM splits the LSTM memory cell into part-based sub-cells, allowing the network to learn long-term patterns specific to each body part. The paper further details the dataset structure, defines cross-subject and cross-view evaluation criteria, and benchmarks a range of methods. Experimental results show that deep learning methods outperform hand-crafted features under both criteria, and the proposed P-LSTM outperforms the other evaluated methods, reaching 62.93% accuracy in the cross-subject and 70.27% in the cross-view evaluation. The authors conclude that NTU RGB+D enables data-driven learning methods for 3D human activity analysis and validates the effectiveness of the proposed part-aware LSTM model.
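
To make the part-aware idea concrete, the following is a minimal sketch of a part-aware LSTM cell, written in PyTorch as an illustration rather than the authors' implementation. It assumes the skeleton input is pre-split into per-part feature vectors (the class name `PartAwareLSTMCell`, the `part_sizes` argument, and the gate layout are illustrative assumptions): each body part keeps its own memory sub-cell with its own input and forget gates, while a single shared output gate produces the hidden state from all sub-cells together.

```python
# Hypothetical sketch of a part-aware LSTM cell (not the paper's original code).
# Each body part has its own memory sub-cell and gates; the output gate is shared.
import torch
import torch.nn as nn


class PartAwareLSTMCell(nn.Module):
    def __init__(self, part_sizes, hidden_size):
        super().__init__()
        self.part_sizes = part_sizes          # per-part input feature dimensions
        self.hidden_size = hidden_size        # size of each part's memory sub-cell
        num_parts = len(part_sizes)
        # Per-part input gate, forget gate, and cell candidate, computed from that
        # part's input features concatenated with the shared hidden state.
        self.part_gates = nn.ModuleList(
            [nn.Linear(p + hidden_size * num_parts, 3 * hidden_size) for p in part_sizes]
        )
        # Shared output gate computed from the full input and the shared hidden state.
        self.output_gate = nn.Linear(sum(part_sizes) + hidden_size * num_parts,
                                     hidden_size * num_parts)

    def forward(self, x_parts, state):
        # x_parts: list of tensors, one per body part, each of shape (batch, part_size)
        # state: (h, c_parts) with h of shape (batch, num_parts * hidden_size)
        #        and c_parts a list of per-part cell states (batch, hidden_size)
        h, c_parts = state
        new_c = []
        for xp, cp, gate in zip(x_parts, c_parts, self.part_gates):
            z = gate(torch.cat([xp, h], dim=1))
            i, f, g = z.chunk(3, dim=1)       # input gate, forget gate, candidate
            new_c.append(torch.sigmoid(f) * cp + torch.sigmoid(i) * torch.tanh(g))
        c_all = torch.cat(new_c, dim=1)       # concatenate the part sub-cells
        o = torch.sigmoid(self.output_gate(torch.cat(x_parts + [h], dim=1)))
        h_new = o * torch.tanh(c_all)
        return h_new, new_c
```

Keeping the sub-cells separate means each part's long-term memory is updated only from that part's motion, while the shared hidden state and output gate still let the classifier reason about how the parts move together.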