Objaverse: A Universe of Annotated 3D Objects


15 Dec 2022 | Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, Ali Farhadi
**Institution:** PRIOR @ Allen Institute for AI; University of Washington, Seattle

**Abstract:** Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and LAION have driven significant progress in AI. Large-scale datasets of high-fidelity 3D models, however, remain limited in scale and diversity. To address this gap, we present Objaverse 1.0, a dataset of over 800K 3D models with descriptive captions, tags, and animations. Objaverse improves upon existing 3D repositories in scale, in the number of categories covered, and in the visual diversity within categories. We demonstrate its potential through four applications: training generative 3D models, improving instance segmentation on the LVIS benchmark, training open-vocabulary object-navigation models for Embodied AI, and creating a benchmark for robustness analysis of vision models. Together, these applications illustrate the broad impact Objaverse can have on AI research and new applications.

**Introduction:** The lack of large-scale 3D datasets has hindered progress in 3D vision. Unlike their 2D counterparts, current 3D datasets fall short in scale, diversity, and realism. Objaverse 1.0 addresses this gap with a large-scale, high-quality, richly annotated collection of 3D objects: over 800K assets sourced from Sketchfab, covering a wide range of categories and visual styles. We demonstrate the utility of Objaverse through four applications: 3D generative modeling, instance segmentation with CP3D (a 3D copy-and-paste augmentation technique), open-vocabulary ObjectNav, and robustness analysis of computer vision models.

**Related Work:** Large-scale datasets have driven much of the recent improvement across computer vision tasks, yet existing 3D datasets lag behind their 2D counterparts in scale, diversity, and realism. Datasets such as KIT, YCB, and Pix3D provide diverse object categories but lack scale and realism, while datasets such as GSO, PhotoShape, and 3D-FUTURE offer realistic objects but remain limited in scale and category diversity.

**Objaverse:** Objaverse is a comprehensive 3D asset library of over 818K high-quality, diverse 3D models. Each object carries rich metadata, including a name, categories, tags, and a natural language description. From the full library we construct a subset of roughly 47K objects mapped to LVIS categories and use it with CP3D, a copy-and-paste augmentation technique, to improve instance segmentation performance on the LVIS benchmark. For open-vocabulary ObjectNav, Objaverse objects are imported into simulated environments so that navigation agents can be trained and evaluated on a far larger vocabulary of target object categories than prior benchmarks allow.
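To make the dataset description above concrete, the sketch below shows how objects and their metadata can be fetched programmatically. It assumes the `objaverse` pip package released alongside the dataset; the function names (`load_uids`, `load_annotations`, `load_lvis_annotations`, `load_objects`) reflect its 1.0 API and should be checked against the package documentation for the version you install.

```python
# Sketch: downloading a few Objaverse objects and their metadata with the
# `objaverse` pip package (pip install objaverse). Function names assume the
# 1.0 API; verify against the package docs for your version.
import random

import objaverse

# All object UIDs in Objaverse 1.0 (~800K identifiers).
uids = objaverse.load_uids()
print(f"total objects: {len(uids)}")

# Metadata (name, tags, categories, description, ...) for a small sample.
sample = random.sample(uids, 5)
annotations = objaverse.load_annotations(sample)
for uid, meta in annotations.items():
    print(uid, meta.get("name"), [t.get("name") for t in meta.get("tags", [])])

# The ~47K-object LVIS subset: a mapping from LVIS category to object UIDs.
lvis = objaverse.load_lvis_annotations()
print(f"LVIS categories covered: {len(lvis)}")

# Download the sampled objects as .glb files; returns {uid: local_path}.
paths = objaverse.load_objects(uids=sample, download_processes=4)
print(paths)
```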
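The copy-and-paste augmentation mentioned above can be illustrated with a short, generic sketch: render an Objaverse object on a transparent background, paste it into a 2D training image, and use the render's alpha channel as the new instance mask. This is only a minimal illustration of the idea, not the paper's exact CP3D pipeline; the file names and size ranges are hypothetical.

```python
# Minimal sketch of copy-paste augmentation in the spirit of CP3D: paste a
# pre-rendered object (RGBA image with transparent background) onto a target
# scene and derive the corresponding instance mask. Generic illustration only.
import random

import numpy as np
from PIL import Image


def paste_rendered_object(scene: Image.Image, render: Image.Image):
    """Paste an RGBA render onto the scene at a random location and scale.

    Returns the augmented scene and a boolean instance mask for the pasted object.
    """
    scene = scene.convert("RGB")
    # Random scale between 20% and 50% of the scene's shorter side
    # (square resize keeps the sketch simple; renders are typically square).
    target = int(min(scene.size) * random.uniform(0.2, 0.5))
    render = render.convert("RGBA").resize((target, target))

    # Random top-left placement that keeps the object fully inside the scene.
    x = random.randint(0, scene.width - render.width)
    y = random.randint(0, scene.height - render.height)

    # The render's alpha channel acts as the paste mask.
    alpha = render.split()[-1]
    scene.paste(render, (x, y), mask=alpha)

    # Instance mask for the new object, aligned with the scene.
    mask = np.zeros((scene.height, scene.width), dtype=bool)
    mask[y:y + render.height, x:x + render.width] = np.array(alpha) > 127
    return scene, mask


# Hypothetical file names, for illustration only.
augmented, instance_mask = paste_rendered_object(
    Image.open("coco_image.jpg"), Image.open("objaverse_render.png")
)
```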