Real3D: Scaling Up Large Reconstruction Models with Real-World Images

2024-06-12 | Hanwen Jiang, Qixing Huang, Georgios Pavlakos
Real3D is the first large reconstruction model (LRM) that can be trained on single-view real-world images. Unlike prior methods that rely solely on synthetic data or multi-view captures, Real3D introduces a self-training framework that combines synthetic data with single-view real images. Two unsupervised losses supervise the model when ground-truth 3D is unavailable: a pixel-level cycle-consistency loss and a semantic-level loss. To further improve training quality, Real3D employs an automatic data-curation method that selects high-quality examples from in-the-wild images, allowing the model to learn from a broader range of data and to generalize better to real-world objects.

Real3D outperforms prior work in four diverse evaluation settings spanning real and synthetic data as well as in-domain and out-of-domain shapes, demonstrating superior reconstruction quality, effective use of real data, and scalability. Overall, Real3D represents a significant advance in 3D reconstruction: by leveraging curated real-world data alongside synthetic data, it improves the model's ability to generalize and to reconstruct complex shapes.
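To make the two unsupervised losses concrete, here is a minimal sketch in plain Python. This is an illustration only, not the paper's implementation: the reconstruct/render cycle is assumed to be provided by the LRM (not shown), images are toy nested lists, and the feature vectors stand in for embeddings from a pretrained image encoder.

```python
# Illustrative sketch of the two unsupervised losses (not the paper's code).
# The cycle is: reconstruct 3D from the input view, render a novel view,
# reconstruct again, render back to the input viewpoint, then compare.

def pixel_cycle_consistency_loss(original, cycled):
    """Mean squared error between the input view and the image rendered
    after a full reconstruct -> render -> reconstruct -> render cycle."""
    flat_a = [p for row in original for p in row]
    flat_b = [p for row in cycled for p in row]
    return sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)

def semantic_loss(feat_a, feat_b):
    """One minus cosine similarity between deep features (e.g., from a
    pretrained encoder) of the input view and the cycled rendering."""
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    norm_a = sum(a * a for a in feat_a) ** 0.5
    norm_b = sum(b * b for b in feat_b) ** 0.5
    return 1.0 - dot / (norm_a * norm_b)

# Identical views incur zero loss under both terms.
img = [[0.1, 0.5], [0.9, 0.3]]
print(pixel_cycle_consistency_loss(img, img))  # 0.0
```

In this sketch, both losses vanish when the cycled rendering matches the input exactly, which is what lets them supervise the model without any ground-truth 3D.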