GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild

20 Nov 2019 | Lianghua Huang, Xin Zhao, Member, IEEE, and Kaiqi Huang, Senior Member, IEEE
The paper introduces GOT-10k, a large-scale benchmark dataset for generic object tracking in the wild. The dataset is built on the WordNet structure and covers over 560 classes of moving objects and 87 motion patterns, spanning a wide and diverse range of real-world scenarios. Key contributions include:

1. **Dataset Construction**: GOT-10k contains over 10,000 video segments with more than 1.5 million manually labeled bounding boxes, enabling unified training and evaluation of deep trackers.
2. **Class Population**: It is the first dataset to use WordNet to guide class population, with the aim of comprehensive and relatively unbiased coverage.
3. **One-Shot Protocol**: It introduces a one-shot protocol for tracker evaluation in which training and test classes do not overlap, so that results are not biased toward object classes seen during training.
4. **Additional Labels**: It provides additional labels, such as motion classes and object visible ratios, to support the development of motion-aware and occlusion-aware trackers.
5. **Extensive Experiments**: It reports tracking experiments with 39 typical tracking algorithms and analyzes their results.
6. **Comprehensive Platform**: It provides a platform with full-featured evaluation toolkits, an online evaluation server, and a responsive leaderboard.

The paper also describes how the dataset was constructed, including video collection, trajectory annotation, and dataset splitting. It evaluates a range of baseline models and analyzes how different challenges and different training data affect tracking performance. The results highlight the conditions under which tracking remains difficult and the strengths and weaknesses of the evaluated trackers.
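
The one-shot protocol rests on a split in which no object class appears in both the training and the test set. The following is a minimal sketch of such a class-disjoint split, not the authors' actual splitting procedure. The function name `split_by_class`, the `test_fraction` parameter, and the toy video-to-class mapping are all hypothetical and used only for illustration.

```python
import random
from collections import defaultdict


def split_by_class(video_classes, test_fraction=0.2, seed=0):
    """Split videos so that no object class is shared by train and test.

    video_classes: dict mapping a video id to its object class label
    (a hypothetical input format, assumed for this sketch).
    """
    # Group video ids by their object class.
    by_class = defaultdict(list)
    for video_id, cls in video_classes.items():
        by_class[cls].append(video_id)

    # Shuffle the classes (not the videos) so the split is made per class.
    classes = sorted(by_class)
    rng = random.Random(seed)
    rng.shuffle(classes)

    # Reserve at least one whole class for testing.
    n_test = max(1, round(len(classes) * test_fraction))
    test_classes = set(classes[:n_test])
    train_classes = set(classes[n_test:])

    train = [v for c in sorted(train_classes) for v in by_class[c]]
    test = [v for c in sorted(test_classes) for v in by_class[c]]
    return train, test, train_classes, test_classes


# Toy example: six videos covering four object classes.
videos = {
    "v1": "dog", "v2": "dog", "v3": "car",
    "v4": "boat", "v5": "bird", "v6": "car",
}
train, test, train_classes, test_classes = split_by_class(
    videos, test_fraction=0.25
)

# The defining property of the split: zero class overlap.
assert train_classes.isdisjoint(test_classes)
```

Splitting at the class level rather than the video level is what keeps the evaluation from rewarding a tracker for memorizing the appearance of classes it was trained on.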