Neural Motifs: Scene Graph Parsing with Global Context

29 Mar 2018 | Rowan Zellers, Mark Yatskar, Sam Thomson, Yejin Choi
The paper "Neural Motifs: Scene Graph Parsing with Global Context" by Rowan Zellers, Mark Yatskar, Sam Thomson, and Yejin Choi investigates the problem of generating structured graph representations of visual scenes. The authors analyze the role of motifs (regularly appearing substructures in scene graphs) and present new quantitative insights using the Visual Genome dataset. They find that object labels are highly predictive of relation labels, and that more than 50% of graphs contain motifs involving at least two relations. Based on these findings, they introduce a baseline model that, given object detections, predicts the most frequent relation between each pair of object labels; this baseline improves on the previous state of the art by an average of 3.6% (relative). They then propose Stacked Motif Networks, a new architecture designed to capture higher-order motifs, which improves over that baseline by a further 7.1% (relative) on average. The code for their approach is available on GitHub.
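The frequency baseline described above can be illustrated with a small sketch. This is not the authors' code: the function names, the `no_relation` fallback, and the toy triples are all hypothetical, and the sketch assumes object labels are already known rather than produced by a detector. It only shows the core idea of predicting the most frequent training relation for a given pair of object labels.

```python
from collections import Counter, defaultdict


def fit_frequency_baseline(triples):
    """Count how often each predicate occurs for every (subject, object) label pair."""
    counts = defaultdict(Counter)
    for subj, pred, obj in triples:
        counts[(subj, obj)][pred] += 1
    return counts


def predict_relation(counts, subj, obj, default="no_relation"):
    """Return the most frequent training predicate for the pair, or a default if unseen."""
    pair_counts = counts.get((subj, obj))
    if not pair_counts:
        return default
    return pair_counts.most_common(1)[0][0]


# Toy (subject, predicate, object) triples, invented for illustration only.
train = [
    ("man", "wearing", "shirt"),
    ("man", "wearing", "shirt"),
    ("man", "holding", "shirt"),
    ("man", "riding", "horse"),
    ("dog", "on", "grass"),
]

counts = fit_frequency_baseline(train)
print(predict_relation(counts, "man", "shirt"))  # wearing
print(predict_relation(counts, "cat", "sofa"))   # no_relation
```

Because the lookup uses only the two object labels, a baseline of this kind performs well exactly when labels are highly predictive of relations, which is the regularity the summary reports.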