5 Feb 2024 | RIO AGUINA-KANG, MAXIM GUMIN, DO HEON HAN, STEWART MORRIS, SEUNG JEAN YOO, ADITYA GANESHAN, R. KENNY JONES, QIHONG ANNA WEI, KAILIANG FU, DANIEL RITCHIE
This paper presents an open-universe indoor scene generation system that produces 3D indoor scenes from text prompts. The system uses pre-trained large language models (LLMs) and vision-language models (VLMs) to synthesize programs in a domain-specific layout language; these programs describe the objects in a scene and the spatial relations between them. Executing a program poses a constraint satisfaction problem whose solution determines object positions and orientations. The system then retrieves 3D meshes from a large, unannotated database, using VLMs to match each object's specification. In both closed-universe and open-universe scene generation tasks, the system outperforms generative models trained on 3D scene data as well as a recent LLM-based layout generation method.
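As a concrete illustration, the sketch below encodes such a program declaratively: objects are declared by text description and constrained only through spatial relations, never coordinates. The `Scene`/`Obj` classes and the relation vocabulary (`against_wall`, `facing`, `next_to`) are illustrative assumptions, not the paper's actual DSL.

```python
# A minimal, hypothetical encoding of a scene program: objects are declared
# by text description, and the layout is constrained only through spatial
# relations. These class and relation names are assumptions for
# illustration; the paper's DSL may differ.
from dataclasses import dataclass, field

@dataclass
class Obj:
    description: str  # text spec, later used for VLM-based mesh retrieval

@dataclass
class Scene:
    objects: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (name, *objects) tuples

    def obj(self, description):
        o = Obj(description)
        self.objects.append(o)
        return o

    def relate(self, name, *args):
        self.relations.append((name, *args))

# A "program" for a simple bedroom, stated entirely through relations:
scene = Scene()
bed     = scene.obj("queen-size bed with a wooden frame")
dresser = scene.obj("six-drawer oak dresser")
lamp    = scene.obj("small bedside reading lamp")

scene.relate("against_wall", bed)
scene.relate("against_wall", dresser)
scene.relate("facing", dresser, bed)  # dresser front faces the bed
scene.relate("next_to", lamp, bed)    # lamp stands beside the bed
```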
The system is designed to generate a wide variety of indoor scenes: common spaces like bedrooms, specialized spaces like a musician's practice room, and even fantastical scenes like a wizard's lair. It uses a declarative domain-specific language (DSL) that describes scenes through spatial relations rather than explicit coordinates. The system comprises a program synthesizer that generates scene programs from natural language descriptions, a layout optimizer that solves the resulting constraint satisfaction problems, and a module that retrieves and orients 3D meshes from a large, unannotated database.
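One plausible way to realize the layout optimizer, continuing the `Scene` encoding sketched above, is to relax the constraint satisfaction problem into soft penalties and minimize total violation with a general-purpose optimizer. The room size, penalty forms, and target distances below are assumptions, orientations are omitted for brevity, and the paper's actual solver may work differently.

```python
# Sketch of the layout solve, continuing the Scene encoding above: each
# object gets a 2D position, each relation becomes a soft penalty, and a
# general-purpose optimizer drives the total violation toward zero. Room
# size, penalty forms, and target distances are assumptions; orientations
# are omitted for brevity.
import numpy as np
from scipy.optimize import minimize

ROOM = 4.0  # assumed ROOM x ROOM square room with walls at 0 and ROOM

def penalties(flat, relations, index):
    pos = flat.reshape(-1, 2)
    total = 0.0
    for name, *args in relations:
        ids = [index[id(o)] for o in args]
        if name == "against_wall":
            x, y = pos[ids[0]]
            total += min(x, ROOM - x, y, ROOM - y) ** 2  # distance to nearest wall
        elif name in ("next_to", "facing"):
            d = np.linalg.norm(pos[ids[0]] - pos[ids[1]])
            target = 0.8 if name == "next_to" else 2.0  # meters, assumed
            total += (d - target) ** 2
    return total

def solve_layout(scene, seed=0):
    rng = np.random.default_rng(seed)
    index = {id(o): i for i, o in enumerate(scene.objects)}
    x0 = rng.uniform(0.5, ROOM - 0.5, size=2 * len(scene.objects))
    res = minimize(penalties, x0, args=(scene.relations, index),
                   method="Nelder-Mead")  # robust to the non-smooth wall term
    return res.x.reshape(-1, 2)

for o, (x, y) in zip(scene.objects, solve_layout(scene)):
    print(f"{o.description:40s} -> ({x:.2f}, {y:.2f})")
```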
The system's contributions include a DSL for specifying indoor scene layouts, a robust prompting workflow that leverages LLMs to synthesize programs, a pipeline using pretrained VLMs for retrieving and orienting 3D meshes, and protocols for evaluating open-universe indoor scene synthesis systems. The system is evaluated through perceptual studies and ablation studies, showing its effectiveness in generating realistic indoor scenes. The code will be made available as open source upon publication.
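The retrieval step can be sketched with an off-the-shelf vision-language model: embed a rendered thumbnail of each candidate mesh and the object's text description into a shared space, then pick the nearest neighbor. CLIP here is a stand-in for whichever VLM the system actually uses, and the model choice and thumbnail paths are assumptions for illustration.

```python
# Sketch of VLM-based mesh retrieval, assuming each candidate mesh has been
# rendered to a thumbnail offline. CLIP is used here as a stand-in VLM;
# the model choice and file paths are assumptions for illustration.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_thumbnails(paths):
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def retrieve(description, paths, thumb_feats, k=1):
    inputs = processor(text=[description], return_tensors="pt", padding=True)
    with torch.no_grad():
        q = model.get_text_features(**inputs)
    q = q / q.norm(dim=-1, keepdim=True)
    scores = thumb_feats @ q.squeeze(0)  # cosine similarity per thumbnail
    best = scores.topk(min(k, len(paths))).indices.tolist()
    return [paths[i] for i in best]

# Hypothetical usage against an unannotated mesh database:
# thumbs = ["db/mesh_0001.png", "db/mesh_0002.png", ...]
# feats = embed_thumbnails(thumbs)
# print(retrieve("six-drawer oak dresser", thumbs, feats))
```

Because matching happens in a joint text-image embedding space, the database needs no category labels, which is what makes retrieval from a large, unannotated mesh collection feasible.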