17 Apr 2014 | Jiang Wang, Yang Song, Thomas Leung, Chuck Rosenberg, Jingbin Wang, James Philbin, Bo Chen, Ying Wu
This paper introduces a deep ranking model for learning fine-grained image similarity, which is crucial for search-by-example applications. Traditional methods often rely on category-level image similarity, which cannot distinguish differences between images within the same category. The authors propose a deep ranking model that learns similarity directly from images, using triplets (query, positive, negative) to characterize fine-grained similarity relationships. The model employs a hinge loss to enforce the correct ranking order within each triplet and a multiscale neural network architecture to capture both global visual properties and image semantics. An efficient online triplet sampling algorithm is developed to handle large datasets, and the model is evaluated on a human-labeled dataset of high-quality triplet samples. The results show that the deep ranking model outperforms both hand-crafted feature-based models and deep classification models on the similarity precision and score-at-top-30 metrics. The paper also examines how different network structures and sampling methods affect performance.
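To make the triplet hinge loss concrete, the following is a minimal sketch rather than the authors' implementation. It assumes embeddings are plain lists of floats, that distance is squared Euclidean, and that the margin defaults to 1.0; the summary does not specify any of these. The function names are hypothetical. The `similarity_precision` helper reads that metric as the fraction of triplets in which the positive image is ranked closer to the query than the negative image, which is also an assumption here.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]
Triplet = Tuple[Vector, Vector, Vector]  # (query, positive, negative)


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_hinge_loss(query: Vector, positive: Vector, negative: Vector,
                       margin: float = 1.0) -> float:
    """Hinge loss for one triplet.

    The loss is zero once the negative is farther from the query than the
    positive by at least `margin` (an assumed value, not from the paper).
    """
    d_pos = squared_distance(query, positive)
    d_neg = squared_distance(query, negative)
    return max(0.0, margin + d_pos - d_neg)


def similarity_precision(triplets: Sequence[Triplet]) -> float:
    """Fraction of triplets where the positive is strictly closer to the query."""
    if not triplets:
        raise ValueError("need at least one triplet")
    correct = sum(
        1 for q, p, n in triplets
        if squared_distance(q, p) < squared_distance(q, n)
    )
    return correct / len(triplets)


if __name__ == "__main__":
    q, p, n = [0.0, 0.0], [1.0, 0.0], [3.0, 0.0]
    print(triplet_hinge_loss(q, p, n))   # correctly ordered triplet
    print(triplet_hinge_loss(q, n, p))   # violated ordering
    print(similarity_precision([(q, p, n), (q, n, p)]))
```

In a training setup, the embeddings would come from the network and the loss would be averaged over sampled triplets; here the vectors are hand-written purely for illustration.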