DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting


25 Jul 2024 | Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, and Achuta Kadambi
DreamScene360 is a text-to-3D scene generation framework that creates immersive 360-degree scenes from text prompts. The method leverages a 2D diffusion model to generate panoramic images, which are then refined to enhance visual quality and text-image alignment. These panoramas are lifted into 3D Gaussians using a geometric field and semantic alignment, enabling real-time exploration.

To construct a spatially coherent structure, the pipeline aligns 2D monocular depth estimates into a globally optimized point cloud, which serves as the initialization for the 3D Gaussians. Semantic and geometric constraints address the ill-posed nature of single-view input, enabling plausible reconstruction of unseen regions and yielding a globally consistent 3D scene. By using a panoramic image as the intermediate representation, DreamScene360 turns simple user commands into immersive 3D environments, reducing the reliance on extensive manual effort in 3D content creation.

Evaluated against baseline methods on diverse indoor and outdoor scenarios, DreamScene360 demonstrates superior global consistency and visual quality: it provides complete 360-degree coverage without blind spots, with globally consistent semantics, style, and geometry. The framework is also versatile and can adapt to other text-to-panorama diffusion models.
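To make one step of this pipeline concrete, the sketch below unprojects an equirectangular panoramic depth map into the kind of camera-space point cloud that could initialize 3D Gaussians. This is an illustrative NumPy sketch, not the authors' code: the function name `panorama_depth_to_points` and the axis conventions (y up, z forward) are assumptions.

```python
import numpy as np

def panorama_depth_to_points(depth):
    """Unproject an equirectangular depth map into a 3D point cloud.

    depth: (H, W) array of per-pixel distances along each viewing ray.
    Returns an (H*W, 3) array of points in camera coordinates.
    Conventions assumed here: longitude spans [-pi, pi) across columns,
    latitude spans [pi/2, -pi/2] down the rows (top of image = up).
    """
    h, w = depth.shape
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Spherical-to-Cartesian unprojection, scaled by per-ray depth.
    x = depth * np.cos(lat) * np.sin(lon)
    y = depth * np.sin(lat)
    z = depth * np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

With a unit-depth panorama, every unprojected point lies on the unit sphere around the camera; in the real pipeline, the depth would come from a monocular estimator and be globally optimized before the Gaussians are fit to it.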