Visual Relationship Detection with Language Priors


31 Jul 2016 | Cewu Lu*, Ranjay Krishna*, Michael Bernstein, Li Fei-Fei
This paper introduces a model for visual relationship detection that leverages language priors to improve performance. Visual relationships capture interactions between pairs of objects in an image, such as "man riding bicycle" or "man pushing bicycle." Because the space of possible relationships is combinatorially large, it is difficult to obtain sufficient training examples for all of them, and previous work has limited itself to predicting only a handful of relationships. The proposed model instead scales to predicting thousands of relationships from just a few examples each.

The model learns visual appearance models for objects and predicates individually, and pairs them with a language module that uses pre-trained word vectors to cast relationships into a vector space where semantically similar relationships are optimized to lie close together. The language priors from these word embeddings fine-tune the likelihood of each predicted relationship, and the objects in a predicted relationship are localized as bounding boxes in the image. The model is trained by optimizing a bi-convex objective.
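The summary describes the language module only at a high level; the following is a minimal sketch of what such a module could look like, assuming 300-dimensional pretrained word vectors (e.g. from word2vec) and one learned projection per predicate. All names and shapes here are illustrative assumptions, not taken from the authors' code.

```python
import numpy as np

EMBED_DIM = 300  # assumed dimensionality of the pretrained word vectors

def language_prior(subj_vec, obj_vec, W_k, b_k):
    """Language prior f(R) for a triple <subject, predicate k, object>.

    The relationship is cast into the word-embedding space by
    concatenating the subject and object vectors; each predicate k has
    a learned projection (W_k of shape (2 * EMBED_DIM,), scalar b_k).
    Nearby word vectors project to nearby scores, so semantically
    similar relationships receive similar priors.
    """
    phrase = np.concatenate([subj_vec, obj_vec])  # shape (2 * EMBED_DIM,)
    return float(W_k @ phrase + b_k)

# Example intuition: word2vec("man") lies close to word2vec("person"),
# so "man riding bicycle" and "person riding bicycle" get similar
# priors. This is what lets rare triples borrow strength from frequent,
# semantically similar ones.
```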
Because the language module depends only on word embeddings, the model can detect unseen relationships through zero-shot learning, leveraging similar relationships it has seen before, and it remains accurate even for relationships with very few training examples. The model is evaluated on a new dataset of 5,000 images containing 37,993 relationships, which the paper releases for further benchmarking. In these evaluations, including comparisons with existing methods and zero-shot experiments, the model outperforms previous state-of-the-art methods on visual relationship detection, and the paper further shows that understanding relationships between objects improves content-based image retrieval.
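To make the zero-shot behavior concrete, here is a hedged sketch of how the visual and language scores might be combined at inference time, assuming the two scores are multiplied to rank candidate triples; the data structures and names are illustrative assumptions, not the authors' API.

```python
def rank_relationships(visual_scores, prior_fn):
    """Rank candidate triples by visual evidence times language prior.

    `visual_scores` maps each candidate <subject, predicate, object>
    triple to its visual score V(R); `prior_fn` returns f(R) computed
    purely from word vectors (see the sketch above). Because f(R) needs
    no training examples of the specific triple, a relationship never
    seen in training still receives a meaningful prior from
    semantically similar relationships that were seen.
    """
    return sorted(
        visual_scores.items(),
        key=lambda item: item[1] * prior_fn(item[0]),
        reverse=True,  # best-scoring relationship first
    )
```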