The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

21 Feb 2020 | Alina Kuznetsova · Hassan Rom · Neil Alldrin · Jasper Uijlings · Ivan Krasin · Jordi Pont-Tuset · Shahab Kamali · Stefan Popov · Matteo Malloci · Alexander Kolesnikov · Tom Duerig · Vittorio Ferrari
The Open Images Dataset V4 contains 9.2 million images with unified annotations for image classification, object detection, and visual relationship detection. The images were collected from Flickr without a predefined list of class names or tags, which yields natural class statistics and avoids an initial design bias. The dataset is released under a Creative Commons Attribution license, which permits commercial use.
The annotations are large in scale: 30.1 million image-level labels for 19,800 concepts, 15.4 million bounding boxes for 600 object classes, and 375,000 visual relationship annotations involving 57 classes. For object detection, this is 15 times more bounding boxes than the next largest datasets. The boxes cover 1.9 million images, many of which show complex scenes, with 8 annotated objects per image on average.
Because all annotations coexist on the same images, the dataset supports cross-task training and analysis. The authors report comprehensive dataset statistics, validate the quality of the annotations, and study the performance of modern models. They also demonstrate two applications made possible by the unified annotations: fine-grained object detection without fine-grained box labels, and zero-shot visual relationship detection. Its scale, diversity, and annotation quality make the dataset well suited to pushing the limits of data-hungry methods in computer vision, with applications in scene understanding.
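To make "unified annotations" concrete, the following is a minimal sketch of how the three annotation types could sit together on a single image. It is not the dataset's actual file format: the class names, field names, and the example image with its labels and coordinates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Box:
    label: str
    # Normalized corner coordinates in [0, 1] (an assumption of this sketch).
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass(frozen=True)
class Relationship:
    subject_idx: int  # index into ImageAnnotations.boxes
    predicate: str
    object_idx: int   # index into ImageAnnotations.boxes


@dataclass
class ImageAnnotations:
    """Hypothetical container holding all three annotation types for one image."""
    image_id: str
    image_labels: set[str] = field(default_factory=set)
    boxes: list[Box] = field(default_factory=list)
    relationships: list[Relationship] = field(default_factory=list)

    def triplets(self) -> list[tuple[str, str, str]]:
        """Resolve relationships to (subject label, predicate, object label)."""
        return [
            (self.boxes[r.subject_idx].label, r.predicate, self.boxes[r.object_idx].label)
            for r in self.relationships
        ]

    def labels_without_boxes(self) -> set[str]:
        """Image-level labels that no box on this image carries."""
        return self.image_labels - {b.label for b in self.boxes}


# Illustrative, made-up annotations for a single image.
example = ImageAnnotations(
    image_id="hypothetical-0001",
    image_labels={"Person", "Guitar", "Concert"},
    boxes=[
        Box("Person", 0.10, 0.20, 0.55, 0.95),
        Box("Guitar", 0.30, 0.45, 0.70, 0.80),
    ],
    relationships=[Relationship(subject_idx=0, predicate="plays", object_idx=1)],
)

print(example.triplets())
print(example.labels_without_boxes())
```

Keeping relationships as indices into the box list means a relationship annotation can never refer to an object that lacks a box, and a query such as `labels_without_boxes` shows the kind of cross-task analysis that becomes possible when image-level labels and boxes describe the same image.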