**DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling**
**Authors:** Haoran Li, Haolin Shi, Wenli Zhang, Wenjun Wu, Yong Liao, Lin Wang, Lik-Hang Lee, and Peng Yuan Zhou
**Institutions:** University of Science and Technology of China, CCCD Key Lab of Ministry of Culture and Tourism, AI Thrust, HKUST(GZ), Dept. of Computer Science Eng., HKUST, The Hong Kong Polytechnic University, Aarhus University
**Abstract:**
Text-to-3D scene generation holds significant potential for gaming, film, and architecture. However, existing methods struggle to maintain high quality, consistency, and editing flexibility. DreamScene is a novel 3D Gaussian-based text-to-3D scene generation framework that addresses these challenges through two main strategies. First, it employs Formation Pattern Sampling (FPS), a multi-timestep sampling strategy guided by 3D object formation patterns, to quickly generate semantically rich, high-quality representations. FPS uses 3D Gaussian filtering for optimization stability and reconstruction techniques to generate plausible textures. Second, DreamScene employs a progressive three-stage camera sampling strategy, designed for both indoor and outdoor settings, to ensure object-environment integration and scene-wide 3D consistency. Additionally, DreamScene enhances scene editing flexibility by integrating objects and environments, enabling targeted adjustments. Extensive experiments validate DreamScene's superiority over current state-of-the-art techniques, highlighting its potential for diverse applications.
**Keywords:**
Text-to-3D, Text-to-3D Scene, 3D Gaussian, Scene Generation, Scene Editing
**Introduction:**
Text-to-3D scene generation has evolved significantly, broadening its scope from simple objects to detailed, complex scenes. However, existing methods face challenges such as low-quality outputs, long generation times, inconsistent 3D visual cues, and an inability to separate objects from environments. DreamScene introduces a novel framework that leverages Formation Pattern Sampling (FPS) and strategic camera sampling to address these issues. FPS uses multi-timestep sampling, 3D Gaussian filtering, and reconstructive generation to produce high-quality, semantically rich 3D representations. The three-stage camera sampling strategy ensures 3D consistency in three steps: generating coarse representations, adapting ground formation, and consolidating the scene through reconstructive generation. DreamScene also enhances scene editing flexibility by integrating objects and environments.
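To make the "progressive three-stage" idea concrete, the sketch below shows one way a training loop could decide which camera sampling stage is active at a given iteration. It is only an illustration under stated assumptions: the stage labels paraphrase the three steps named above, while the function name `stage_for_iteration` and the per-stage iteration budgets are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from itertools import accumulate

# Stage labels paraphrase the three steps described in the text.
STAGES = ("coarse", "ground", "consolidate")


def stage_for_iteration(iteration, budgets=(1000, 500, 500)):
    """Return the camera-sampling stage active at `iteration`.

    `budgets` holds the number of iterations spent in each stage.
    The default values are hypothetical placeholders, not paper settings.
    """
    if len(budgets) != len(STAGES):
        raise ValueError("need exactly one budget per stage")
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    # Cumulative end points of each stage, e.g. [1000, 1500, 2000].
    bounds = list(accumulate(budgets))
    index = bisect_right(bounds, iteration)
    if index >= len(STAGES):
        raise ValueError("iteration exceeds the total budget")
    return STAGES[index]
```

With the default budgets, iterations 0 to 999 fall in the coarse stage, 1000 to 1499 in the ground stage, and 1500 to 1999 in the consolidation stage. A real pipeline would attach a different camera pose distribution to each stage; that part is omitted here because the text does not specify it.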
**Contributions:**
- DreamScene: A novel framework for text-driven 3D scene generation, leveraging FPS, strategic camera sampling, and seamless object-environment integration.
- Formation Pattern Sampling: A central method that combines multi-timestep sampling, 3D Gaussian filtering, and reconstructive generation to produce high-quality, semantically rich 3D representations.
- Experiments: DreamScene outperforms existing methods in text-driven 3D object and scene generation, demonstrating substantial potential for various applications.
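The multi-timestep sampling component of FPS can be illustrated with a small sketch. The code below draws several diffusion timesteps per optimization iteration from a window whose upper bound shrinks as optimization progresses, so early iterations see noisier (coarser) timesteps and later ones see cleaner (finer) timesteps. This is a minimal sketch, not the paper's algorithm: the linear decay schedule, the stratified draw, the function name `sample_timesteps`, and all default values are assumptions made for illustration.

```python
import random


def sample_timesteps(step, total_steps, n_samples=3, t_min=20, t_max=980, seed=None):
    """Draw `n_samples` diffusion timesteps for one optimization iteration.

    Hypothetical schedule: the upper bound of the sampling window decays
    linearly from `t_max` toward `t_min` as `step` approaches `total_steps`.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    rng = random.Random(seed)
    progress = step / total_steps
    upper = max(round(t_max - progress * (t_max - t_min)), t_min)
    # Split [t_min, upper] into equal strata and draw one timestep from each,
    # so a single iteration covers both low- and high-noise levels.
    width = (upper - t_min) / n_samples
    timesteps = []
    for i in range(n_samples):
        low = t_min + i * width
        high = t_min + (i + 1) * width
        timesteps.append(int(rng.uniform(low, high)))
    # Largest (noisiest) timestep first.
    return sorted(timesteps, reverse=True)
```

Calling `sample_timesteps(0, 1000)` returns three timesteps spread across the full range, while `sample_timesteps(999, 1000)` returns timesteps clustered near `t_min`. In an actual optimization loop, each sampled timestep would feed a separate denoising query to the diffusion prior, and the results would be combined into one update.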
**Related