4D Panoptic Scene Graph Generation

16 May 2024 | Jingkang Yang, Jun Cen, Wenxuan Peng, Shuai Liu, Fangzhou Hong, Xiangtai Li, Kaiyang Zhou, Qifeng Chen, Ziwei Liu
The paper introduces the 4D Panoptic Scene Graph (PSG-4D), a representation that captures both spatial and temporal information in dynamic environments. PSG-4D aims to bridge raw visual data from RGB-D videos and high-level visual understanding. The authors build a richly annotated PSG-4D dataset of 3K RGB-D videos totaling 1M frames, labeled with 4D panoptic segmentation masks and dynamic scene graphs. To address the task, they propose PSG4DFormer, a Transformer-based model that predicts panoptic segmentation masks, tracks objects over time, and generates scene graphs. Experiments on the new dataset demonstrate the effectiveness of the proposed method. The paper also explores real-world applications, such as integrating PSG-4D with large language models to enable dynamic scene understanding and robot interaction. The work highlights the potential of PSG-4D for advancing 4D scene graph generation and its practical implications for robotics and autonomous systems.
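To make the idea of a dynamic scene graph concrete, here is a minimal Python sketch of one way such a structure could be held in memory. It is an illustration only, not the authors' data format: the class names (`Node`, `Relation`, `SceneGraph4D`), the field names, the string mask handles, and the example objects and predicates are all assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """An object tracked through the video (hypothetical layout)."""
    node_id: int
    category: str
    # Frame index -> opaque handle for that frame's panoptic mask (assumed).
    masks: Dict[int, str] = field(default_factory=dict)


@dataclass
class Relation:
    """A subject-predicate-object triple valid over an inclusive frame span."""
    subject_id: int
    predicate: str
    object_id: int
    start_frame: int
    end_frame: int


@dataclass
class SceneGraph4D:
    """Nodes plus time-stamped relations (illustrative, not the paper's schema)."""
    nodes: Dict[int, Node] = field(default_factory=dict)
    relations: List[Relation] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_relation(self, rel: Relation) -> None:
        # Reject relations that point at nodes the graph does not contain.
        if rel.subject_id not in self.nodes or rel.object_id not in self.nodes:
            raise KeyError("relation refers to an unknown node")
        if rel.start_frame > rel.end_frame:
            raise ValueError("start_frame must not exceed end_frame")
        self.relations.append(rel)

    def triples_at(self, frame: int) -> List[Tuple[str, str, str]]:
        """Return (subject, predicate, object) category triples active at `frame`."""
        return [
            (
                self.nodes[r.subject_id].category,
                r.predicate,
                self.nodes[r.object_id].category,
            )
            for r in self.relations
            if r.start_frame <= frame <= r.end_frame
        ]


def build_example() -> SceneGraph4D:
    """Build a tiny made-up graph: a cup on a table, later picked up by a person."""
    graph = SceneGraph4D()
    graph.add_node(Node(0, "person"))
    graph.add_node(Node(1, "cup"))
    graph.add_node(Node(2, "table"))
    graph.add_relation(Relation(1, "on", 2, start_frame=0, end_frame=40))
    graph.add_relation(Relation(0, "picking up", 1, start_frame=41, end_frame=60))
    return graph
```

Querying `build_example().triples_at(50)` returns the relations active at frame 50. Flat triples of this kind are also easy to serialize as text, which is one plausible way a scene graph could be handed to a language model, though the summary does not say how the authors do this.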