ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation


2 Jul 2024 | Zhiyuan Ma, Yuxiang Wei, Yabin Zhang, Xiangyu Zhu, Zhen Lei, Lei Zhang
ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation proposes a novel method, Asynchronous Score Distillation (ASD), for synthesizing 3D content from text prompts. Unlike previous score distillation methods that require extensive optimization for every individual prompt, ASD minimizes the noise prediction error of a pre-trained 2D diffusion model by shifting the diffusion timestep to earlier stages, rather than by fine-tuning the diffusion model itself. Because the pre-trained model is left unmodified, its strong text comprehension capability is preserved, which enables stable training and high-quality 3D content generation. ASD is tested across various 2D diffusion models (e.g., Stable Diffusion, MVDream) and 3D generators (e.g., Hyper-iNGP, 3DConv-Net, Triplane-Transformer), demonstrating superior prompt consistency and scalability, even with large prompt corpora of up to 100k prompts. The method achieves stable training, high-quality 3D synthesis, and effective alignment with text prompts, outperforming existing score distillation approaches in efficiency and stability. ASD's key innovation lies in its ability to shift the diffusion timestep without altering the pre-trained model, enabling efficient and scalable text-to-3D generation.
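To make the timestep-shifting idea concrete, the sketch below shows an ASD-style score-distillation update in PyTorch-like Python. It is a minimal illustration of the description above, not the authors' implementation: `diffusion.predict_noise`, `diffusion.add_noise`, and `renderer` are hypothetical stand-ins for a frozen pre-trained 2D diffusion model and a 3D generator, and the exact weighting, sign, and timestep-shift schedule used in the paper may differ.

```python
import torch

def asd_loss_sketch(diffusion, renderer, theta, prompt_emb, delta_t=-30):
    """Illustrative ASD-style update (assumed interfaces, not the official code).

    diffusion.predict_noise(x_t, t, prompt_emb) -> predicted noise (frozen, pre-trained)
    diffusion.add_noise(x, noise, t)            -> forward-diffused image at timestep t
    renderer(theta)                             -> rendered view of the current 3D representation
    """
    # Render a view of the 3D representation; gradients flow back into theta.
    x = renderer(theta)

    # Sample a diffusion timestep and forward-diffuse the rendering.
    t = torch.randint(low=200, high=980, size=(1,))
    noise = torch.randn_like(x)
    x_t = diffusion.add_noise(x, noise, t)

    with torch.no_grad():
        # Standard term: noise prediction of the frozen model at timestep t.
        eps_t = diffusion.predict_noise(x_t, t, prompt_emb)

        # ASD-style term: the SAME frozen model queried at an earlier (shifted)
        # timestep t + delta_t (delta_t < 0), instead of a fine-tuned copy of
        # the network. The pre-trained weights are never modified.
        t_shift = (t + delta_t).clamp(min=1)
        x_shift = diffusion.add_noise(x, noise, t_shift)
        eps_shift = diffusion.predict_noise(x_shift, t_shift, prompt_emb)

    # Gradient direction: difference of the two noise predictions, treated as a
    # constant and backpropagated through the rendering into theta via the
    # usual score-distillation surrogate loss.
    grad = eps_t - eps_shift
    loss = (grad * x).sum()
    return loss
```

In this reading, ASD plays the same structural role as the fine-tuned network in variational score distillation, but obtains the second noise prediction by querying the unchanged pre-trained model at an earlier timestep, which is what keeps training stable and scalable across large prompt corpora.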