The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

21 Feb 2020 | Alina Kuznetsova · Hassan Rom · Neil Alldrin · Jasper Uijlings · Ivan Krasin · Jordi Pont-Tuset · Shahab Kamali · Stefan Popov · Matteo Malloci · Alexander Kolesnikov · Tom Duerig · Vittorio Ferrari
The Open Images Dataset V4 is a large-scale resource for image classification, object detection, and visual relationship detection, comprising 9.2 million images with unified annotations. The dataset is released under a Creative Commons Attribution license, which permits sharing and adaptation.

Key features include:

- **Scale**: 30.1 million image-level labels for 19,800 concepts, 15.4 million bounding boxes for 600 object classes, and 375,000 visual relationship annotations involving 57 classes.
- **Complexity**: Images often contain multiple objects (8 annotated objects per image on average), which makes the dataset well suited to training and evaluating detection models on complex scenes.
- **Unified Annotations**: Annotations for image classification, object detection, and visual relationship detection coexist in the same images, enabling cross-task training and analysis.
- **Quality**: The paper provides in-depth statistics and validates annotation quality, including the geometric accuracy of bounding boxes and the recall of image-level annotations.
- **Applications**: The paper demonstrates two applications: fine-grained object detection without fine-grained box labels, and zero-shot visual relationship detection.

The images were collected from Flickr without a predefined list of class names or tags, which leads to natural class statistics and reduces bias. The acquisition process involved identifying CC-BY licensed images, removing those that also appear elsewhere on the internet, and ensuring a high proportion of complex images with multiple objects. The dataset is available for research in computer vision, particularly in areas that require structured reasoning and multiple annotation types.
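To make the idea of unified annotations concrete, the following is a minimal sketch of how the three annotation types could be grouped per image so that one sample serves several tasks. The record layouts, the `ImageAnnotations` class, the `unify` helper, and the toy image IDs are all hypothetical illustrations; they are not the dataset's actual file format or any official loader.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ImageAnnotations:
    """All annotation types for one image (hypothetical layout)."""
    labels: list = field(default_factory=list)         # image-level concepts
    boxes: list = field(default_factory=list)          # (class, xmin, ymin, xmax, ymax)
    relationships: list = field(default_factory=list)  # (subject, predicate, object)


def unify(labels, boxes, relationships):
    """Group the three annotation streams by image ID."""
    by_image = defaultdict(ImageAnnotations)
    for image_id, concept in labels:
        by_image[image_id].labels.append(concept)
    for image_id, cls, xmin, ymin, xmax, ymax in boxes:
        by_image[image_id].boxes.append((cls, xmin, ymin, xmax, ymax))
    for image_id, subj, pred, obj in relationships:
        by_image[image_id].relationships.append((subj, pred, obj))
    return dict(by_image)


# Toy records, invented for illustration only (normalized box coordinates assumed).
labels = [("img1", "Guitar"), ("img1", "Person"), ("img2", "Dog")]
boxes = [
    ("img1", "Person", 0.1, 0.2, 0.6, 0.9),
    ("img1", "Guitar", 0.3, 0.4, 0.7, 0.8),
]
relationships = [("img1", "Person", "plays", "Guitar")]

unified = unify(labels, boxes, relationships)
for image_id, ann in unified.items():
    print(image_id, len(ann.labels), len(ann.boxes), len(ann.relationships))
```

Keying everything by image ID means a single lookup yields classification targets, detection targets, and relationship triplets for the same image, which is the property that cross-task training relies on. Images with only some annotation types (such as `img2` here) simply carry empty lists for the others.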