MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

14 Jun 2024 | Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang
MeshAnything is a method for generating Artist-Created Meshes (AMs) from various 3D representations. It addresses a key shortcoming of current mesh extraction methods: the meshes they produce have far poorer topology than AMs, which are crafted manually by human artists. MeshAnything instead treats mesh extraction as a generation problem, producing AMs aligned with a specified shape. Because it can convert 3D assets in other representations into AMs, it integrates seamlessly with a wide range of 3D asset production pipelines.

The architecture has two stages: a VQ-VAE that learns a vocabulary of mesh tokens, and a shape-conditioned decoder-only transformer trained on that vocabulary to generate token sequences. The model is trained using point clouds sampled from AMs as the shape condition, which is injected into the decoder to improve mesh generation quality. A noise-resistant decoder further improves robustness, so that low-quality token sequences produced by the transformer can still be decoded into usable meshes.

Trained on a combined dataset of Objaverse and ShapeNet, MeshAnything generates AMs with significantly fewer faces and more refined topology than existing extraction methods, while achieving comparable precision. Extensive experiments show that it outperforms existing methods in mesh quality and efficiency, making it well suited to the 3D industry, with applications in gaming, film, and the metaverse. The authors note, however, that reducing the cost of obtaining 3D artist-created meshes may also raise concerns about potential misuse.
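The summary above describes the two-stage design but includes no code. The following PyTorch sketch illustrates one plausible reading of it: a point-cloud encoder produces condition tokens that are prepended to the VQ-VAE mesh-token sequence, and a decoder-only transformer predicts the next token autoregressively. All class names, the condition-prefix scheme, and the hyperparameters here are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PointCloudEncoder(nn.Module):
    """Maps a point cloud sampled from the target shape to a fixed
    number of condition tokens (a stand-in for the paper's shape encoder)."""
    def __init__(self, d_model=512, n_cond=8):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, d_model), nn.ReLU(),
                                 nn.Linear(d_model, d_model))
        self.queries = nn.Parameter(torch.randn(n_cond, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)

    def forward(self, points):                        # points: (B, N, 3)
        feats = self.mlp(points)                      # (B, N, d)
        q = self.queries.expand(points.size(0), -1, -1)
        cond, _ = self.attn(q, feats, feats)          # (B, n_cond, d)
        return cond

class ShapeConditionedDecoder(nn.Module):
    """Decoder-only transformer over the VQ-VAE mesh-token vocabulary,
    with the shape condition injected as a prefix of the sequence."""
    def __init__(self, vocab_size=8192, d_model=512, n_layers=12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, cond, tokens):                  # tokens: (B, T) int ids
        x = torch.cat([cond, self.embed(tokens)], dim=1)
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1)).to(x.device)
        h = self.blocks(x, mask=mask)
        c = cond.size(1)
        # Position i predicts token i+1; the last condition token
        # predicts the first mesh token.
        return self.head(h[:, c - 1:-1])              # (B, T, vocab)

# Teacher-forced next-token training step (illustrative shapes).
enc, dec = PointCloudEncoder(), ShapeConditionedDecoder()
points = torch.randn(2, 4096, 3)                      # sampled from an AM's surface
tokens = torch.randint(0, 8192, (2, 256))             # VQ-VAE mesh-token sequences
logits = dec(enc(points), tokens)
loss = nn.functional.cross_entropy(logits.reshape(-1, 8192), tokens.reshape(-1))
```

At inference, the same decoder would be sampled token by token from the condition prefix alone, and the resulting sequence handed to the VQ-VAE decoder to reconstruct faces.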
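The noise-resistant decoder mentioned above is trained to reconstruct good meshes from deliberately corrupted token sequences, so that the imperfect outputs the transformer produces at inference time still decode cleanly. Below is a minimal sketch of one plausible corruption scheme; the uniform-replacement noise model and the 10% rate are assumptions, not taken from the paper.

```python
import torch

def corrupt_tokens(tokens, vocab_size, p=0.1):
    """Replace a random fraction p of mesh tokens with uniform noise,
    mimicking the low-quality sequences the transformer may emit."""
    noise = torch.randint_like(tokens, vocab_size)
    keep = torch.rand(tokens.shape, device=tokens.device) >= p
    return torch.where(keep, tokens, noise)

# During decoder fine-tuning, decode corrupted sequences and supervise
# against the clean ground-truth mesh (the decoder itself is omitted here).
tokens = torch.randint(0, 8192, (2, 256))
noisy = corrupt_tokens(tokens, vocab_size=8192, p=0.1)
```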