13 Jun 2024 | Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon, Shreyas Hampali, Fan Zhang, Jade Fountain, Edward Miller, Selen Basol, Richard Newcombe, Robert Wang, Jakob Julian Engel, Tomas Hodan
**HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking**
**Introduction:**
HOT3D is a publicly available dataset for egocentric 3D hand and object tracking. It includes over 833 minutes of multi-view RGB/monochrome image streams showing 19 subjects interacting with 33 diverse rigid objects, multi-modal signals such as eye gaze and scene point clouds, and comprehensive ground-truth annotations, including 3D poses of objects, hands, and cameras. The recordings were captured with Meta's Project Aria and Quest 3 devices, and the dataset ships with high-quality 3D models of the hands and objects that can be used to render realistic training images.
**Dataset Details:**
- **Recordings:** Over 1.5M multi-view frames (3.7M images) from Project Aria and Quest 3.
- **Subjects:** 19 participants, diverse in hand shape and nationality.
- **Objects:** 33 objects with high-resolution 3D models and PBR materials.
- **Scenarios:** Typical actions in kitchen, office, and living room environments.
- **Annotations:** Per-frame ground-truth 3D poses of hands and objects, provided for the training split (a hedged loading sketch follows this list).
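The sketch below illustrates what consuming per-frame ground-truth object poses could look like. The file layout and field names (`frame_id`, `objects`, `rotation`, `translation`, `object_id`) are assumptions made for illustration only; they do not reflect the official HOT3D annotation format or its data-loading tools, which are documented on the project website.

```python
# Hypothetical sketch: reading per-frame ground-truth object poses.
# The JSON layout and field names below are assumptions for illustration;
# they do NOT correspond to the official HOT3D annotation format or loader.
import json
import numpy as np

def load_object_poses(annotation_path: str) -> dict[int, dict[str, np.ndarray]]:
    """Return {frame_id: {object_id: 4x4 world-from-object transform}}."""
    with open(annotation_path, "r") as f:
        frames = json.load(f)  # assumed: a list of per-frame records

    poses = {}
    for frame in frames:
        frame_id = frame["frame_id"]                          # assumed field
        per_object = {}
        for obj in frame["objects"]:                          # assumed field
            R = np.asarray(obj["rotation"], dtype=float)      # assumed 3x3 row-major
            t = np.asarray(obj["translation"], dtype=float)   # assumed meters
            T = np.eye(4)
            T[:3, :3] = R
            T[:3, 3] = t
            per_object[obj["object_id"]] = T
        poses[frame_id] = per_object
    return poses
```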
**Additional Features:**
- **Curated Clips:** 4117 clips for benchmarking tracking and pose estimation methods.
- **Object-Onboarding Sequences:** Two types of sequences for model-free object tracking and 3D object reconstruction.
- **Public Challenges:** The dataset underpins two challenges co-organized with ECCV 2024: the BOP Challenge 2024 and the Hand Tracking Challenge 2024 (a simplified error-metric sketch follows this list).
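To make the benchmarking use case concrete, here is a simplified sketch of a surface-distance pose error in the spirit of the MSSD metric used in the BOP Challenge. For brevity it ignores object symmetries, which the official BOP evaluation additionally minimizes over; it is not the exact evaluation code used by the challenges.

```python
# Simplified sketch of a surface-distance pose error, in the spirit of the
# BOP Challenge's MSSD metric. Object symmetries are ignored for brevity;
# the official evaluation also takes the minimum over symmetry transforms.
import numpy as np

def max_surface_distance(T_est: np.ndarray, T_gt: np.ndarray,
                         model_points: np.ndarray) -> float:
    """Maximum distance between model points transformed by the estimated
    and ground-truth 4x4 poses (points given as an Nx3 array, in meters)."""
    pts_h = np.hstack([model_points, np.ones((model_points.shape[0], 1))])
    est = (T_est @ pts_h.T).T[:, :3]   # points under the estimated pose
    gt = (T_gt @ pts_h.T).T[:, :3]     # points under the ground-truth pose
    return float(np.max(np.linalg.norm(est - gt, axis=1)))
```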
**References:**
The dataset and its collection process are detailed in the paper, with references to related work and technical documentation available on the project website.